Mel Frequency Cepstral Coefficients (MFCCs)

The mel frequency cepstral coefficients (MFCCs) of a signal are a small set of features (usually about 10-20) that concisely describe the overall shape of the spectral envelope. In MIR, they are often used to describe timbre.
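
The "mel" in the name refers to a perceptual frequency scale on which the analysis bands are spaced. As a rough sketch of that spacing, the snippet below uses one common conversion formula (the 2595 * log10(1 + f/700) convention). This is an assumption for illustration: libraries differ in which mel formula they use, and the helper names and band count here are hypothetical, not taken from essentia or librosa.

```python
import math

def hz_to_mel(f_hz):
    # One common mel-scale convention; other variants exist.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_centers(n_bands, f_min, f_max):
    # Place n_bands center frequencies at equal steps on the mel scale,
    # strictly between f_min and f_max, and convert them back to Hz.
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]

# Illustrative values: 10 bands up to the Nyquist frequency of a 44.1 kHz signal.
centers = mel_band_centers(n_bands=10, f_min=0.0, f_max=22050.0)
widths = [b - a for a, b in zip(centers, centers[1:])]
print([round(c) for c in centers])
```

Equal steps in mel give bands that sit close together at low frequencies and spread out at high frequencies, which is why a handful of coefficients can still summarize the whole envelope.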

Load Audio

In [16]:
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2
url = 'https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav'
urlretrieve(url, filename='simpleLoop.wav')
Out[16]:
('simpleLoop.wav', <httplib.HTTPMessage instance at 0x54db878>)
In [17]:
from essentia.standard import MonoLoader
x = MonoLoader(filename='simpleLoop.wav')()

fs = 44100.0
t = arange(len(x))/fs
plot(t, x)
xlabel('Time (seconds)')
Out[17]:
<matplotlib.text.Text at 0x54fcfd0>
In [18]:
from IPython.display import Audio
Audio(x, rate=fs)
Out[18]:

essentia.standard.MFCC

We will use essentia.standard.MFCC to compute MFCCs frame by frame across the signal and display them as an "MFCC-gram":

In [19]:
from essentia.standard import MFCC, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum()  # we just want the magnitude spectrum
mfcc = MFCC(numberCoefficients=13)
frame_sz = 1024
hop_sz = 500

mfccs = array([mfcc(spectrum(hamming_window(frame)))[1]
               for frame in FrameGenerator(x, frameSize=frame_sz, hopSize=hop_sz)])
print(mfccs.shape)
(266, 13)
In [20]:
imshow(mfccs[:,1:].T, origin='lower', aspect='auto', interpolation='nearest') # Ignore the 0th MFCC
yticks(range(12), range(1,13)) # Ignore the 0th MFCC
ylabel('MFCC Coefficient Index')
xlabel('Frame Index')
Out[20]:
<matplotlib.text.Text at 0x5bb1e90>

The very first MFCC, the 0th coefficient, does not convey information about the overall shape of the spectral envelope. It conveys only a vertical offset, i.e. a constant added to the entire log spectrum, which reflects the overall level of the frame rather than its timbre. Therefore, we discard the 0th MFCC when performing classification.
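
To see why the 0th coefficient only tracks level, here is a minimal sketch in plain Python. It assumes the cepstral step is an unnormalized DCT-II of the natural log of the band energies. Real implementations may use dB and a different normalization, and the band energies below are made-up illustrative values, not measurements from simpleLoop.wav.

```python
import math

def dct_ii(values):
    # Unnormalized DCT-II: c[k] = sum_i v[i] * cos(pi * k * (2i + 1) / (2n))
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2.0 * n))
                for i, v in enumerate(values))
            for k in range(n)]

# Hypothetical mel-band energies for one frame (illustrative values only).
band_energies = [0.8, 1.5, 3.2, 2.1, 0.9, 0.4, 0.2, 0.1]
# Same spectral shape, ten times the energy in every band.
louder = [10.0 * e for e in band_energies]

c_quiet = dct_ii([math.log(e) for e in band_energies])
c_loud = dct_ii([math.log(e) for e in louder])

# Only the 0th entry differs; the rest agree up to rounding error.
diff = [b - a for a, b in zip(c_quiet, c_loud)]
print([round(d, 6) for d in diff])
```

Scaling every band by the same factor adds a constant to the log energies, and every DCT-II basis function except the 0th sums to zero, so only the 0th coefficient moves.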

librosa.feature.mfcc

In [21]:
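import librosa
import librosa.display  # specshow lives in the display submodule
# librosa frames the signal internally, so pass the frame and hop sizes
# instead of slicing x by hand; transpose to (frames, coefficients)
mfccs = librosa.feature.mfcc(y=x, sr=fs, n_mfcc=13,
                             n_fft=frame_sz, hop_length=hop_sz).T
print(mfccs.shape)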
(265, 13)
In [23]:
librosa.display.specshow(mfccs[:,1:].T, sr=fs, hop_length=hop_sz, x_axis='time')
Out[23]:
<matplotlib.image.AxesImage at 0x6d6a550>