Commit 0cac6a19 authored by Steve Tjoa's avatar Steve Tjoa

adding nmf exercise

parent 74bf8428
{
"metadata": {
"name": "",
"signature": "sha256:536c517d48c98583d6306b6a68d4f93eed7dc8277f50f50aa7f338f8a2832ad9"
"signature": "sha256:b391827fea99ddd7cb41a8e904df467e6d8cbc2153d90c663a29c59660b604e6"
},
"nbformat": 3,
"nbformat_minor": 0,
......@@ -84,7 +84,7 @@
"level": 2,
"metadata": {},
"source": [
"Day 3: Tonal Features and Unsupervised Classification"
"Day 3: Unsupervised Classification"
]
},
{
......@@ -95,6 +95,22 @@
"1. [Exercise: Unsupervised Instrument Classification using K-Means](exercises/kmeans_instrument_classification.ipynb)"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Day 4: Matrix Factorization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. [Nonnegative Matrix Factorization](notebooks/nmf.ipynb)\n",
"1. [Exercise: Source Separation using NMF](exercises/nmf_source_separation.ipynb)"
]
},
{
"cell_type": "heading",
"level": 2,
......@@ -112,18 +128,8 @@
"1. [Tonal Descriptors: Pitch and Chroma](notebooks/tonal.ipynb)\n",
"1. [Feature Extraction](notebooks/feature_extraction.ipynb)\n",
"1. [Beat Tracking](notebooks/beat_tracking.ipynb)\n",
"1. [Tempo Estimation](notebooks/tempo_estimation.ipynb)\n",
"1. [Nonnegative Matrix Factorization](notebooks/nmf.ipynb)\n",
"1. [Exercise: Source Separation using NMF](exercises/nmf_source_separation.ipynb)\n"
"1. [Tempo Estimation](notebooks/tempo_estimation.ipynb)\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
......
{
"metadata": {
"name": "",
"signature": "sha256:a92ad530ab974142e47c4d56f0ca34ea9aa24ae9ca7b73a6baa051e8a319a116"
"signature": "sha256:4dadf70135ac19f86b46054f55c35735380ff80be8d92673ea1a71fe1779641e"
},
"nbformat": 3,
"nbformat_minor": 0,
......@@ -20,119 +20,163 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Goals:\n",
"\n",
"Lab 4\n",
"=====\n",
"\n",
"Summary:\n",
"\n",
"1. Separate sources.\n",
"2. Separate noisy sources.\n",
"3. Classify separated sources.\n",
"\n",
"Matlab Programming Tips\n",
"* Pressing the up and down arrows let you scroll through command history.\n",
"* A semicolon at the end of a line simply means ``suppress output''.\n",
"* Type `help <command>` for instant documentation. For example, `help wavread`, `help plot`, `help sound`. Use `help` liberally!\n",
"\n",
"\n",
"Section 1: Source Separation\n",
"----------------------------\n",
"\n",
"1. In Matlab: Select File > Set Path. \n",
"\n",
"Select \"Add with Subfolders\". \n",
"\n",
"Select `/usr/ccrma/courses/mir2011/lab3skt`.\n",
"\n",
"2. As in Lab 1, load the file, listen to it, and plot it.\n",
"\n",
" [x, fs] = wavread('simpleLoop.wav');\n",
" sound(x, fs)\n",
" t = (0:length(x)-1)/fs;\n",
" plot(t, x)\n",
" xlabel('Time (seconds)')\n",
"\n",
"3. Compute and plot a short-time Fourier transform, i.e., the Fourier transform over consecutive frames of the signal.\n",
"\n",
" frame_size = 0.100;\n",
" hop = 0.050;\n",
" X = parsesig(x, fs, frame_size, hop);\n",
" imagesc(abs(X(200:-1:1,:)))\n",
"\n",
" Type `help parsesig`, `help imagesc`, and `help abs` for more information.\n",
"\n",
" This step gives you some visual intuition about how sounds (might) overlap.\n",
"\n",
"4. Let's separate sources!\n",
"\n",
" K = 2;\n",
" [y, W, H] = sourcesep(x, fs, K);\n",
"\n",
" Type `help sourcesep` for more information.\n",
"\n",
"5. Plot and listen to the separated signals.\n",
"\n",
" plot(t, y)\n",
" xlabel('Time (seconds)')\n",
" legend('Signal 1', 'Signal 2')\n",
" sound(y(:,1), fs)\n",
" sound(y(:,2), fs)\n",
"\n",
" Feel free to replace `Signal 1` and `Signal 2` with `Kick` and `Snare` (depending upon which is which). \n",
"\n",
"6. Plot the outputs from NMF.\n",
"\n",
" figure\n",
" plot(W(1:200,:))\n",
" legend('Signal 1', 'Signal 2')\n",
" figure\n",
" plot(H')\n",
" legend('Signal 1', 'Signal 2')\n",
"\n",
" What do you observe from `W` and `H`? \n",
"\n",
" Does it agree with the sounds you heard?\n",
"\n",
"7. Repeat the earlier steps for different audio files.\n",
"\n",
" * `125BOUNC-mono.WAV`\n",
" * `58BPM.WAV` \n",
" * `CongaGroove-mono.wav`\n",
" * `Cstrum chord_mono.wav`\n",
"\n",
" ... and more.\n",
"\n",
"8. Experiment with different values for the number of sources, `K`. \n",
"\n",
" Where does this separation method succeed? \n",
"\n",
" Where does it fail?\n",
"\n",
"\n",
"Section 2: Noise Robustness\n",
"---------------------------\n",
"\n",
"1. Begin with `simpleLoop.wav`. Then try others.\n",
"\n",
" Add noise to the input signal, plot, and listen.\n",
"\n",
" xn = x + 0.01*randn(length(x),1);\n",
" plot(t, xn)\n",
" sound(xn, fs)\n",
"\n",
"2. Separate, plot, and listen.\n",
"\n",
" [yn, Wn, Hn] = sourcesep(xn, fs, K);\n",
" plot(t, yn)\n",
" sound(yn(:,1), fs)\n",
" sound(yn(:,2), fs)\n",
" \n",
" How robust to noise is this separation method? \n",
"\n",
" Compared to the noisy input signal, how much noise is left in the output signals? \n",
"\n",
" Which output contains more noise? Why?\n"
"1. Separate sources using NMF.\n",
"2. Analyze and classify separated sources."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load an audio file:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib\n",
"from urlparse import urljoin\n",
"filename = 'simpleLoop.wav'\n",
"#filename = 'CongaGroove-mono.wav'\n",
"#filename = '125BOUNC-mono.WAV'\n",
"url = urljoin('https://ccrma.stanford.edu/workshops/mir2014/audio/', filename)\n",
"print url\n",
"urllib.urlretrieve(url, filename=filename)\n",
"\n",
"from essentia.standard import MonoLoader\n",
"fs = 44100.0\n",
"MonoLoader?"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav\n"
]
}
],
"prompt_number": 3
},
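{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the loading step: `MonoLoader` is instantiated with a filename and a target sample rate, and calling the resulting object returns the samples as a NumPy array."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# A sketch: load the audio as a mono signal, resampled to fs.\n",
"x = MonoLoader(filename=filename, sampleRate=int(fs))()\n",
"print(x.shape)"
],
"language": "python",
"metadata": {},
"outputs": []
},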
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the spectrogram:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import librosa\n",
"librosa.stft?\n",
"librosa.display.specshow?"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
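{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possibility, as a sketch, assuming the signal `x` from the loading step above; the frame parameters `n_fft` and `hop_length` are example choices, not prescribed by the exercise. (Recent versions of librosa name the dB conversion `amplitude_to_db`; very old versions called it `logamplitude`.)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import librosa\n",
"import librosa.display\n",
"\n",
"# Example analysis parameters.\n",
"n_fft = 2048\n",
"hop_length = 512\n",
"\n",
"# Complex STFT and its nonnegative magnitude.\n",
"S = librosa.stft(x, n_fft=n_fft, hop_length=hop_length)\n",
"X = np.abs(S)\n",
"\n",
"# Display in decibels on a log-frequency axis.\n",
"librosa.display.specshow(librosa.amplitude_to_db(X, ref=np.max),\n",
"                         sr=int(fs), hop_length=hop_length,\n",
"                         x_axis='time', y_axis='log')\n",
"plt.colorbar(format='%+2.0f dB')"
],
"language": "python",
"metadata": {},
"outputs": []
},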
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use NMF to decompose the spectrogram:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn.decomposition import NMF\n",
"NMF?"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
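{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, as a sketch: scikit-learn's `NMF` factors the nonnegative magnitude spectrogram `X` (not the complex STFT) such that `X` is approximately `dot(W, H)`, where the columns of `W` are spectral atoms and the rows of `H` are their activations over time."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn.decomposition import NMF\n",
"\n",
"# Rank of the decomposition; 2 is a natural first guess for kick + snare.\n",
"n_components = 2\n",
"\n",
"# Factor the magnitude spectrogram.\n",
"nmf = NMF(n_components=n_components)\n",
"W = nmf.fit_transform(X)  # shape (n_freq_bins, n_components): spectral atoms\n",
"H = nmf.components_       # shape (n_components, n_frames): temporal activations"
],
"language": "python",
"metadata": {},
"outputs": []
},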
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the decomposed matrices:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
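{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a sketch with one subplot per component:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import matplotlib.pyplot as plt\n",
"\n",
"# Columns of W are spectral atoms; rows of H are their activations over time.\n",
"plt.figure(figsize=(10, 6))\n",
"for k in range(n_components):\n",
"    plt.subplot(2, n_components, k + 1)\n",
"    plt.plot(W[:, k])\n",
"    plt.title('Atom %d: spectrum' % k)\n",
"    plt.subplot(2, n_components, n_components + k + 1)\n",
"    plt.plot(H[k])\n",
"    plt.title('Atom %d: activation' % k)"
],
"language": "python",
"metadata": {},
"outputs": []
},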
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the inverse STFT to synthesize the separated sources:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"librosa.istft?"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 6
},
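{
"cell_type": "markdown",
"metadata": {},
"source": [
"One common recipe, sketched here: build a rank-1 magnitude spectrogram from each (atom, activation) pair, reuse the phase of the original mixture, and invert with `librosa.istft`. This assumes `S`, `W`, `H`, and `hop_length` from the cells above; `IPython.display.Audio` lets you audition each result."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"import librosa\n",
"from IPython.display import Audio\n",
"\n",
"phase = np.exp(1j * np.angle(S))  # phase of the original mixture STFT\n",
"\n",
"sources = []\n",
"for k in range(n_components):\n",
"    # Rank-1 magnitude estimate for component k, with the mixture's phase.\n",
"    X_k = np.outer(W[:, k], H[k])\n",
"    y_k = librosa.istft(X_k * phase, hop_length=hop_length)\n",
"    sources.append(y_k)\n",
"\n",
"Audio(sources[0], rate=int(fs))"
],
"language": "python",
"metadata": {},
"outputs": []
},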
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the columns of the matrix, $W$, otherwise known as *spectral atoms*, as inputs into the kick/snare classifier that you created in an earlier exercise. Observe the results; are you able to automatically classify the separated sources?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
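{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of what this could look like, assuming a trained scikit-learn classifier from the earlier exercise, here called `model`, and a feature extractor of your own, here called `extract_features`. Both names are hypothetical placeholders; neither is defined in this notebook."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Hypothetical: `model` and `extract_features` come from your earlier exercise.\n",
"for k in range(n_components):\n",
"    features = extract_features(W[:, k])  # features of the k-th spectral atom\n",
"    label = model.predict([features])[0]\n",
"    print('Atom %d classified as: %s' % (k, label))"
],
"language": "python",
"metadata": {},
"outputs": []
},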
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Bonus"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use different audio files."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alter the rank of the decomposition, `n_components`. What happens when `n_components` is too large? too small?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"NMF is a useful preprocessor for MIR tasks such as music transcription. Using the steps above, build your own simple transcription system that returns a sequence of note events, `[(onset time, class label, volume/gain)...]`."
]
},
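{
"cell_type": "markdown",
"metadata": {},
"source": [
"A rough starting point, sketched below (not the only way): threshold each activation row of `H`, treat upward threshold crossings as onsets, and emit `(onset time, component label, gain)` tuples. It reuses `H`, `hop_length`, `n_components`, and `fs` from the cells above; the threshold is an arbitrary choice to tune."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"threshold = 0.5 * H.max()  # arbitrary; tune per recording\n",
"\n",
"events = []\n",
"for k in range(n_components):\n",
"    active = H[k] > threshold\n",
"    # An onset is a frame where the activation crosses the threshold upward.\n",
"    onsets = np.flatnonzero(active[1:] & ~active[:-1]) + 1\n",
"    for n in onsets:\n",
"        t = n * hop_length / float(fs)\n",
"        events.append((t, k, H[k, n]))\n",
"\n",
"events.sort()\n",
"print(events)"
],
"language": "python",
"metadata": {},
"outputs": []
}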
],
......