In [1]:
import numpy, scipy, matplotlib.pyplot as plt, sklearn, librosa, librosa.display, stanford_mir, IPython.display
%matplotlib inline

Harmonic-Percussive Source Separation

Load a file:

In [2]:
x, fs = librosa.load('1_bar_funk_groove.mp3')

Compute the STFT:

In [3]:
X = librosa.stft(x)

Take the log-amplitude for display purposes:

In [4]:
Xmag = librosa.amplitude_to_db(numpy.abs(X))
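
For reference, log-amplitude means expressing the STFT magnitude in decibels. The following is a minimal NumPy sketch of that conversion, not librosa's own code. The function name `to_db_sketch` and the parameters `amin` (floor that avoids taking the log of zero) and `top_db` (dynamic range kept below the peak) are assumptions chosen for illustration.

```python
import numpy as np

def to_db_sketch(X, amin=1e-5, top_db=80.0):
    """Convert a (possibly complex) spectrogram to decibels (illustrative only)."""
    mag = np.abs(X)                               # magnitude of each STFT bin
    db = 20.0 * np.log10(np.maximum(mag, amin))   # floor at amin to avoid log(0)
    return np.maximum(db, db.max() - top_db)      # keep top_db of range below the peak

# Tiny usage example on a hand-made 2x2 "spectrogram".
demo = to_db_sketch(np.array([[1.0, 0.1], [10.0, 0.0]]))
```

Limiting the dynamic range keeps near-silent bins from dominating the color scale when the result is plotted.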

Display the log-magnitude spectrogram:

In [5]:
librosa.display.specshow(Xmag, sr=fs, x_axis='time', y_axis='log')
Out[5]:
<matplotlib.image.AxesImage at 0x11301ca50>

Perform harmonic-percussive source separation:

In [6]:
H, P = librosa.decompose.hpss(X)
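
To give an intuition for what this step does, here is a minimal sketch of median-filtering HPSS, the approach that librosa's documentation names for `hpss`. It is not librosa's actual implementation. The function name `hpss_sketch` and the `kernel_size`, `power`, and `eps` values are assumptions for illustration. The idea: harmonic sounds form horizontal ridges in the spectrogram (stable over time), percussive sounds form vertical ridges (broadband, short), so a median filter along one axis suppresses the other kind.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_sketch(X, kernel_size=31, power=2.0, eps=1e-10):
    """Split a complex STFT matrix X (frequency bins x frames) into
    harmonic and percussive parts using median filtering (illustrative only)."""
    S = np.abs(X)
    # Median along time (axis 1) keeps horizontal ridges: harmonic estimate.
    harm = median_filter(S, size=(1, kernel_size))
    # Median along frequency (axis 0) keeps vertical ridges: percussive estimate.
    perc = median_filter(S, size=(kernel_size, 1))
    # Soft masks built from the two estimates; they sum to one by construction.
    harm_p = harm ** power
    perc_p = perc ** power
    mask_h = harm_p / (harm_p + perc_p + eps)
    mask_p = 1.0 - mask_h
    # Applying the masks to the complex STFT preserves the original phase.
    return X * mask_h, X * mask_p

# Synthetic test: one sustained tone (row 20) and one click (column 40).
X_demo = np.zeros((64, 64), dtype=complex)
X_demo[20, :] = 1.0
X_demo[:, 40] = 1.0
H_demo, P_demo = hpss_sketch(X_demo, kernel_size=9)
```

Because the two masks sum to one, `H_demo + P_demo` reproduces `X_demo`, which is why the two separated signals can be added back together after the inverse STFT.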

Compute the log-amplitudes of the outputs:

In [7]:
Hmag = librosa.amplitude_to_db(numpy.abs(H))
Pmag = librosa.amplitude_to_db(numpy.abs(P))

Display each output:

In [8]:
librosa.display.specshow(Hmag, sr=fs, x_axis='time', y_axis='log')
Out[8]:
<matplotlib.image.AxesImage at 0x113732390>
In [9]:
librosa.display.specshow(Pmag, sr=fs, x_axis='time', y_axis='log')
Out[9]:
<matplotlib.image.AxesImage at 0x113867e50>

Transform the harmonic output back to the time domain:

In [10]:
h = librosa.istft(H)

Listen to the harmonic output:

In [11]:
IPython.display.Audio(h, rate=fs)
Out[11]:

Transform the percussive output back to the time domain:

In [12]:
p = librosa.istft(P)

Listen to the percussive output:

In [13]:
IPython.display.Audio(p, rate=fs)
Out[13]: