%matplotlib inline
import seaborn
import numpy, scipy, sklearn, sklearn.cluster, matplotlib.pyplot as plt, IPython.display as ipd, librosa, urllib.request
After you have performed signal decomposition (e.g. using librosa.decompose.decompose), try to classify and combine the separated signals.
Download an audio file containing bass drum, snare drum, and hi-hat:
filename = 'Classic Rock Beat 06.wav'
urllib.request.urlretrieve(
    'http://audio.musicinformationretrieval.com/Jam Pack 1/' + filename,
    filename=filename
)
Load the audio file into an array:
x, fs = librosa.load(filename)
Listen to the audio:
ipd.Audio(x, rate=fs)
Compute the spectrogram:
X = librosa.stft(x)
Save the magnitude and phase of the spectrogram for later use:
Xabs = numpy.absolute(X)
Xphase = numpy.angle(X)
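As an optional sanity check, the magnitude and phase should recombine exactly (up to floating-point error) into the original complex spectrogram:

# Magnitude times the unit-phase complex exponential recovers the complex STFT.
assert numpy.allclose(Xabs*numpy.exp(1j*Xphase), X)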
Perform nonnegative matrix factorization on the signal:
n_components = 30
W, H = librosa.decompose.decompose(Xabs, n_components=n_components, sort=True)
print(W.shape)
print(H.shape)
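To get a feel for what NMF has learned, it can help to visualize the spectral profiles in W and the temporal activations in H. A minimal sketch, assuming a recent version of librosa (where `librosa.amplitude_to_db` and `librosa.display` are available):

import librosa.display

plt.figure(figsize=(13, 4))

plt.subplot(1, 2, 1)
librosa.display.specshow(librosa.amplitude_to_db(W), y_axis='log')  # one spectral profile per column
plt.title('W: spectral profiles')

plt.subplot(1, 2, 2)
librosa.display.specshow(H, x_axis='time')  # one activation curve per row
plt.title('H: temporal activations')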
Let's create a function that reconstructs a time-domain signal from the NMF outputs:
def reconstruct_signal(w, h, Xphase, length=None):
    # The outer product of one spectral profile and its activation is a rank-1
    # magnitude spectrogram; reattach the original phase and invert the STFT.
    Y = numpy.outer(w, h)*numpy.exp(1j*Xphase)
    y = librosa.istft(Y)
    if length:
        y = librosa.util.fix_length(y, size=length)
    return y
Reconstruct each signal component:
y_signals = numpy.array([reconstruct_signal(W[:,n], H[n], Xphase, length=len(x)) for n in range(n_components)])
Listen to one of the signal components:
ipd.Audio(y_signals[29], rate=fs)
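Summing all of the reconstructed components should give something close to the original mixture; listening to the sum is a quick sanity check on the factorization (the match is only approximate, since NMF approximates the magnitude spectrogram):

# The sum of all components should sound close to the original signal x.
ipd.Audio(y_signals.sum(axis=0), rate=fs)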
Compute features for each signal component.
def normalize(h):
    # Scale a feature vector to unit Euclidean norm.
    return numpy.array(h)/numpy.linalg.norm(h)
window_sz = 20
hop_sz = 10
# Summarize each activation row by its mean over sliding windows,
# i.e. a coarsely downsampled version of H[n].
features = numpy.array([
    [H[n, i:i+window_sz].mean() for i in range(0, H.shape[1]-window_sz, hop_sz)]
    for n in range(n_components)
])
features.shape
plt.plot(H[0])
plt.plot(features[0])
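Optionally, you can scale each feature vector to unit norm with the `normalize` helper defined above, so that clustering is driven by the shape of each activation rather than its overall energy. This is a hedged variation; the clustering below uses the raw `features`:

# Optional: unit-norm each component's feature vector before clustering.
features_normalized = numpy.array([normalize(f) for f in features])

Next, cluster the feature vectors into three groups with k-means: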
model = sklearn.cluster.KMeans(n_clusters=3)
labels = model.fit_predict(features)
print(labels)
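The k-means label indices (0, 1, 2) are arbitrary and may change between runs, so listen to each cluster to decide which index corresponds to which drum. A rough heuristic, as a hedged sketch: the bass drum cluster should have the lowest average spectral centroid and the hi-hat cluster the highest.

# Compare average spectral centroids of the summed cluster signals.
for k in range(3):
    cluster_sum = y_signals[labels==k, :].sum(axis=0)
    centroid = librosa.feature.spectral_centroid(y=cluster_sum, sr=fs).mean()
    print('cluster', k, 'mean spectral centroid:', centroid)

The cluster indices used below reflect one particular run; swap them to match what you hear.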
bass_drum = y_signals[labels==0, :].sum(axis=0)
ipd.Audio(bass_drum, rate=fs)
snare_drum = y_signals[labels==2, :].sum(axis=0)
ipd.Audio(snare_drum, rate=fs)
hi_hat = y_signals[labels==1, :].sum(axis=0)
ipd.Audio(hi_hat, rate=fs)
Recall the K-NN classifier built in the earlier lab; `scaledTrainingFeatures`, `scaledTestingFeatures`, and `train_index` are the variables defined there.

import numpy as np
import sklearn.neighbors

# Labels for the 20 training samples:
labels = np.empty(20, np.int32)
labels[0:10] = 1   # First 10 are the first sample type, e.g. snare
labels[10:20] = 2  # Second 10 are the second sample type, e.g. kick

model_snare = sklearn.neighbors.KNeighborsClassifier(n_neighbors=1)
model_snare.fit(scaledTrainingFeatures, labels.take(train_index, 0))
model_output = model_snare.predict(scaledTestingFeatures)
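For reference, a minimal sketch of how the scaled feature matrices might have been produced with `sklearn.preprocessing.MinMaxScaler`; `trainingFeatures` and `testingFeatures` are hypothetical names for the raw feature matrices from the earlier lab:

import sklearn.preprocessing

# Hypothetical: trainingFeatures and testingFeatures are (n_samples, n_features)
# arrays from the earlier lab. Fit the scaler on the training data only and
# reuse it to transform the test data.
scaler = sklearn.preprocessing.MinMaxScaler(feature_range=(-1, 1))
scaledTrainingFeatures = scaler.fit_transform(trainingFeatures)
scaledTestingFeatures = scaler.transform(testingFeatures)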
Extract features from the drum signals that you separated in Lab 4 Section 1.
Classify them using the K-NN model that you built.
Does K-NN accurately classify the separated signals?
Repeat for different numbers of separated signals (i.e., the parameter K in NMF, `n_components` in the code above).
Overseparate the signal using K = 20 or more. For the separated components that are classified as snare, add them together using `sum`. Then listen to the summed signal (see the sketch below). Is it coherent, i.e., does it sound like a single separated drum?
...and more!
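A minimal sketch of the snare-summing step, assuming the K-NN predictions for the separated components are stored in `model_output` (one prediction per component, in the same order as `y_signals`) and that the snare class is labeled 1:

# Hypothetical: sum every separated component that K-NN labeled as snare (class 1).
snare_sum = y_signals[model_output == 1, :].sum(axis=0)
ipd.Audio(snare_sum, rate=fs)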
Good luck!