Mel Frequency Cepstral Coefficients (MFCCs)

The mel frequency cepstral coefficients (MFCCs) of a signal are a small set of features (usually about 10-20) which concisely describe the overall shape of a spectral envelope. In music information retrieval (MIR), they are often used to describe timbre.
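
To make the "mel frequency" part concrete, here is a minimal sketch of a hertz-to-mel conversion. It assumes one common formula, m = 2595 log10(1 + f/700); other conventions exist, and this is not taken from Essentia or librosa. The helper names (hz_to_mel, mel_to_hz, mel_band_centers) are hypothetical. It illustrates that band centers spaced evenly in mels end up progressively farther apart in hertz.

```python
import math

def hz_to_mel(f_hz):
    """Convert hertz to mels (assumed formula; one of several conventions)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_centers(f_min, f_max, n_bands):
    """Return n_bands center frequencies (Hz) spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (k + 1)) for k in range(n_bands)]

# Ten band centers between 0 Hz and the Nyquist frequency for fs = 44100 Hz.
centers = mel_band_centers(0.0, 22050.0, 10)
# Gaps between neighbouring centers, in Hz; these grow with frequency.
spacings = [b - a for a, b in zip(centers, centers[1:])]
```

Because the bands are narrow at low frequencies and wide at high frequencies, the band energies summarize the spectrum at a resolution that roughly follows pitch perception.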

Load Audio

import urllib.request  # Python 3 location of urlretrieve
url = 'https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav'
urllib.request.urlretrieve(url, filename='simpleLoop.wav')

('simpleLoop.wav', <httplib.HTTPMessage instance at 0x54db878>)

from essentia.standard import MonoLoader
x = MonoLoader(filename='simpleLoop.wav')()

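from numpy import arange
from matplotlib.pyplot import plot, xlabel

fs = 44100.0  # MonoLoader resamples to 44100 Hz by default
t = arange(len(x))/fs  # time axis in seconds
plot(t, x)
xlabel('Time (seconds)')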

<matplotlib.text.Text at 0x54fcfd0>

from IPython.display import Audio
Audio(x, rate=fs)

`essentia.standard.MFCC`

We will use essentia.standard.MFCC to compute MFCCs across a signal, and we will display them as an "MFCC-gram":

from numpy import array
from essentia.standard import MFCC, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum()  # we just want the magnitude spectrum
mfcc = MFCC(numberCoefficients=13)
frame_sz = 1024
hop_sz = 500

# MFCC returns (mel band energies, MFCCs); keep only the MFCCs, one row per frame
mfccs = array([mfcc(spectrum(hamming_window(frame)))[1]
               for frame in FrameGenerator(x, frameSize=frame_sz, hopSize=hop_sz)])
print(mfccs.shape)

(266, 13)

from matplotlib.pyplot import imshow, yticks, ylabel, xlabel
imshow(mfccs[:,1:].T, origin='lower', aspect='auto', interpolation='nearest') # Ignore the 0th MFCC
yticks(range(12), range(1,13)) # Label the rows with coefficient indices 1-12
ylabel('MFCC Coefficient Index')
xlabel('Frame Index')

<matplotlib.text.Text at 0x5bb1e90>

The very first MFCC, the 0th coefficient, does not convey information relevant to the overall shape of the spectrum. (It only conveys a vertical offset, i.e. adding a constant value to the entire log spectrum, which corresponds to an overall change in level.) Therefore, we discard the first MFCC when performing classification.
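
The offset claim can be checked with a small pure-Python sketch. It assumes the cepstral coefficients come from an unnormalized DCT-II of the log band energies; libraries scale the DCT differently, and this is not the Essentia or librosa implementation. The band values and the helper name dct2 are hypothetical and for illustration only.

```python
import math

def dct2(values):
    """Unnormalized DCT-II: c[k] = sum_n values[n] * cos(pi * k * (n + 0.5) / N)."""
    n_vals = len(values)
    return [sum(v * math.cos(math.pi * k * (n + 0.5) / n_vals)
                for n, v in enumerate(values))
            for k in range(n_vals)]

# Hypothetical log band energies for one frame (illustrative values only).
log_bands = [1.0, 2.5, 2.0, 0.5, -1.0, -0.5, 0.0, 1.5]
offset = 3.0  # the same constant added to every band, e.g. a louder recording
shifted = [v + offset for v in log_bands]

c_orig = dct2(log_bands)
c_shift = dct2(shifted)

# Per-coefficient change caused by the offset.
diffs = [b - a for a, b in zip(c_orig, c_shift)]
```

Here only diffs[0] is nonzero (it equals the offset times the number of bands); every other coefficient is unchanged up to floating-point rounding, which is why dropping the 0th coefficient removes the dependence on overall level.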

`librosa.feature.mfcc`

We can also compute MFCCs with librosa.feature.mfcc:

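import librosa
# Compute MFCCs over the whole signal; librosa does the framing itself.
# The result has shape (n_mfcc, n_frames), so transpose to (n_frames, n_mfcc).
mfccs = librosa.feature.mfcc(y=x, sr=fs, n_mfcc=13,
                             n_fft=frame_sz, hop_length=hop_sz).T
print(mfccs.shape)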

(265, 13)

import librosa.display  # the display submodule must be imported explicitly
librosa.display.specshow(mfccs[:,1:].T, sr=fs, hop_length=hop_sz, x_axis='time')

<matplotlib.image.AxesImage at 0x6d6a550>
