import numpy, scipy, matplotlib.pyplot as plt, sklearn, librosa, urllib, IPython.display
import essentia, essentia.standard as ess
plt.rcParams['figure.figsize'] = (14,4)
Download an audio file:
url = 'http://audio.musicinformationretrieval.com/simple_loop.wav'
urllib.urlretrieve(url, filename='simple_loop.wav')
('simple_loop.wav', <httplib.HTTPMessage instance at 0x10ec5cd88>)
Plot the audio signal:
x, fs = librosa.load('simple_loop.wav')
librosa.display.waveplot(x, sr=fs)
<matplotlib.collections.PolyCollection at 0x10ecb5fd0>
Play the audio:
IPython.display.Audio(x, rate=fs)
librosa.feature.mfcc
¶librosa.feature.mfcc
computes MFCCs across an audio signal:
mfccs = librosa.feature.mfcc(x, sr=fs)
print mfccs.shape
(20, 130)
In this case, mfcc
computed 20 MFCCs over 130 frames.
Display the MFCCs:
librosa.display.specshow(mfccs, sr=fs, x_axis='time')
<matplotlib.image.AxesImage at 0x106318c50>
mfccs = sklearn.preprocessing.scale(mfccs, axis=1)
print mfccs.mean(axis=1)
print mfccs.var(axis=1)
[ -4.64585635e-16 -1.63971401e-16 -1.09314267e-16 -1.09314267e-16 0.00000000e+00 1.09314267e-16 0.00000000e+00 -1.09314267e-16 -1.09314267e-16 -2.73285668e-17 1.09314267e-16 -8.19857003e-17 5.46571335e-17 0.00000000e+00 2.73285668e-17 -4.09928501e-17 1.09314267e-16 8.19857003e-17 9.56499837e-17 6.83214169e-17] [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Display the scaled MFCCs:
librosa.display.specshow(mfccs, sr=fs, x_axis='time')
<matplotlib.image.AxesImage at 0x1108b8890>
essentia.standard.MFCC
¶We can also use essentia.standard.MFCC
to compute MFCCs across a signal, and we will display them as a "MFCC-gram":
hamming_window = ess.Windowing(type='hamming')
spectrum = ess.Spectrum() # we just want the magnitude spectrum
mfcc = ess.MFCC(numberCoefficients=13)
frame_sz = 1024
hop_sz = 500
mfccs = numpy.array([mfcc(spectrum(hamming_window(frame)))[1]
for frame in ess.FrameGenerator(x, frameSize=frame_sz, hopSize=hop_sz)])
print mfccs.shape
(134, 13)
Scale the MFCCs:
mfccs = sklearn.preprocessing.scale(mfccs)
/usr/local/lib/python2.7/site-packages/sklearn/preprocessing/data.py:153: UserWarning: Numerical issues were encountered when centering the data and might not be solved. Dataset may contain too large values. You may need to prescale your features. warnings.warn("Numerical issues were encountered " /usr/local/lib/python2.7/site-packages/sklearn/preprocessing/data.py:169: UserWarning: Numerical issues were encountered when scaling the data and might not be solved. The standard deviation of the data is probably very close to 0. warnings.warn("Numerical issues were encountered "
Plot the MFCCs:
plt.imshow(mfccs.T, origin='lower', aspect='auto', interpolation='nearest')
plt.ylabel('MFCC Coefficient Index')
plt.xlabel('Frame Index')
<matplotlib.text.Text at 0x1108cde10>