Commit 0cac6a19 authored by Steve Tjoa's avatar Steve Tjoa

adding nmf exercise

parent 74bf8428
{
 "metadata": {
  "name": "",
-  "signature": "sha256:536c517d48c98583d6306b6a68d4f93eed7dc8277f50f50aa7f338f8a2832ad9"
+  "signature": "sha256:b391827fea99ddd7cb41a8e904df467e6d8cbc2153d90c663a29c59660b604e6"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@@ -84,7 +84,7 @@
   "level": 2,
   "metadata": {},
   "source": [
-   "Day 3: Tonal Features and Unsupervised Classification"
+   "Day 3: Unsupervised Classification"
   ]
  },
  {
@@ -95,6 +95,22 @@
    "1. [Exercise: Unsupervised Instrument Classification using K-Means](exercises/kmeans_instrument_classification.ipynb)"
   ]
  },
+ {
+  "cell_type": "heading",
+  "level": 2,
+  "metadata": {},
+  "source": [
+   "Day 4: Matrix Factorization"
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "1. [Nonnegative Matrix Factorization](notebooks/nmf.ipynb)\n",
+   "1. [Exercise: Source Separation using NMF](exercises/nmf_source_separation.ipynb)"
+  ]
+ },
  {
   "cell_type": "heading",
   "level": 2,
@@ -112,18 +128,8 @@
   "1. [Tonal Descriptors: Pitch and Chroma](notebooks/tonal.ipynb)\n",
   "1. [Feature Extraction](notebooks/feature_extraction.ipynb)\n",
   "1. [Beat Tracking](notebooks/beat_tracking.ipynb)\n",
-  "1. [Tempo Estimation](notebooks/tempo_estimation.ipynb)\n",
-  "1. [Nonnegative Matrix Factorization](notebooks/nmf.ipynb)\n",
-  "1. [Exercise: Source Separation using NMF](exercises/nmf_source_separation.ipynb)\n"
+  "1. [Tempo Estimation](notebooks/tempo_estimation.ipynb)\n"
   ]
- },
- {
-  "cell_type": "code",
-  "collapsed": false,
-  "input": [],
-  "language": "python",
-  "metadata": {},
-  "outputs": []
  }
 ],
 "metadata": {}
...
{
 "metadata": {
  "name": "",
-  "signature": "sha256:a92ad530ab974142e47c4d56f0ca34ea9aa24ae9ca7b73a6baa051e8a319a116"
+  "signature": "sha256:4dadf70135ac19f86b46054f55c35735380ff80be8d92673ea1a71fe1779641e"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@@ -20,119 +20,163 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Goals:\n",
    "\n",
-   "Lab 4\n",
-   "=====\n",
-   "\n",
-   "Summary:\n",
-   "\n",
-   "1. Separate sources.\n",
-   "2. Separate noisy sources.\n",
-   "3. Classify separated sources.\n",
-   "\n",
-   "Matlab Programming Tips\n",
-   "* Pressing the up and down arrows let you scroll through command history.\n",
-   "* A semicolon at the end of a line simply means ``suppress output''.\n",
-   "* Type `help <command>` for instant documentation. For example, `help wavread`, `help plot`, `help sound`. Use `help` liberally!\n",
-   "\n",
-   "\n",
-   "Section 1: Source Separation\n",
-   "----------------------------\n",
-   "\n",
-   "1. In Matlab: Select File > Set Path. \n",
-   "\n",
-   "Select \"Add with Subfolders\". \n",
-   "\n",
-   "Select `/usr/ccrma/courses/mir2011/lab3skt`.\n",
-   "\n",
-   "2. As in Lab 1, load the file, listen to it, and plot it.\n",
-   "\n",
-   "    [x, fs] = wavread('simpleLoop.wav');\n",
-   "    sound(x, fs)\n",
-   "    t = (0:length(x)-1)/fs;\n",
-   "    plot(t, x)\n",
-   "    xlabel('Time (seconds)')\n",
-   "\n",
-   "3. Compute and plot a short-time Fourier transform, i.e., the Fourier transform over consecutive frames of the signal.\n",
-   "\n",
-   "    frame_size = 0.100;\n",
-   "    hop = 0.050;\n",
-   "    X = parsesig(x, fs, frame_size, hop);\n",
-   "    imagesc(abs(X(200:-1:1,:)))\n",
-   "\n",
-   "    Type `help parsesig`, `help imagesc`, and `help abs` for more information.\n",
-   "\n",
-   "    This step gives you some visual intuition about how sounds (might) overlap.\n",
-   "\n",
-   "4. Let's separate sources!\n",
-   "\n",
-   "    K = 2;\n",
-   "    [y, W, H] = sourcesep(x, fs, K);\n",
-   "\n",
-   "    Type `help sourcesep` for more information.\n",
-   "\n",
-   "5. Plot and listen to the separated signals.\n",
-   "\n",
-   "    plot(t, y)\n",
-   "    xlabel('Time (seconds)')\n",
-   "    legend('Signal 1', 'Signal 2')\n",
-   "    sound(y(:,1), fs)\n",
-   "    sound(y(:,2), fs)\n",
-   "\n",
-   "    Feel free to replace `Signal 1` and `Signal 2` with `Kick` and `Snare` (depending upon which is which). \n",
-   "\n",
-   "6. Plot the outputs from NMF.\n",
-   "\n",
-   "    figure\n",
-   "    plot(W(1:200,:))\n",
-   "    legend('Signal 1', 'Signal 2')\n",
-   "    figure\n",
-   "    plot(H')\n",
-   "    legend('Signal 1', 'Signal 2')\n",
-   "\n",
-   "    What do you observe from `W` and `H`? \n",
-   "\n",
-   "    Does it agree with the sounds you heard?\n",
-   "\n",
-   "7. Repeat the earlier steps for different audio files.\n",
-   "\n",
-   "    * `125BOUNC-mono.WAV`\n",
-   "    * `58BPM.WAV` \n",
-   "    * `CongaGroove-mono.wav`\n",
-   "    * `Cstrum chord_mono.wav`\n",
-   "\n",
-   "    ... and more.\n",
-   "\n",
-   "8. Experiment with different values for the number of sources, `K`. \n",
-   "\n",
-   "    Where does this separation method succeed? \n",
-   "\n",
-   "    Where does it fail?\n",
-   "\n",
-   "\n",
-   "Section 2: Noise Robustness\n",
-   "---------------------------\n",
-   "\n",
-   "1. Begin with `simpleLoop.wav`. Then try others.\n",
-   "\n",
-   "    Add noise to the input signal, plot, and listen.\n",
-   "\n",
-   "    xn = x + 0.01*randn(length(x),1);\n",
-   "    plot(t, xn)\n",
-   "    sound(xn, fs)\n",
-   "\n",
-   "2. Separate, plot, and listen.\n",
-   "\n",
-   "    [yn, Wn, Hn] = sourcesep(xn, fs, K);\n",
-   "    plot(t, yn)\n",
-   "    sound(yn(:,1), fs)\n",
-   "    sound(yn(:,2), fs)\n",
-   " \n",
-   "    How robust to noise is this separation method? \n",
-   "\n",
-   "    Compared to the noisy input signal, how much noise is left in the output signals? \n",
-   "\n",
-   "    Which output contains more noise? Why?\n"
+   "1. Separate sources using NMF.\n",
+   "2. Analyze and classify separated sources."
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Load an audio file:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [
+   "import urllib\n",
+   "from urlparse import urljoin\n",
+   "filename = 'simpleLoop.wav'\n",
+   "#filename = 'CongaGroove-mono.wav'\n",
+   "#filename = '125BOUNC-mono.WAV'\n",
+   "url = urljoin('https://ccrma.stanford.edu/workshops/mir2014/audio/', filename)\n",
+   "print url\n",
+   "urllib.urlretrieve(url, filename=filename)\n",
+   "\n",
+   "from essentia.standard import MonoLoader\n",
+   "fs = 44100.0\n",
+   "MonoLoader?"
+  ],
+  "language": "python",
+  "metadata": {},
+  "outputs": [
+   {
+    "output_type": "stream",
+    "stream": "stdout",
+    "text": [
+     "https://ccrma.stanford.edu/workshops/mir2014/audio/simpleLoop.wav\n"
+    ]
+   }
+  ],
+  "prompt_number": 3
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Plot the spectrogram:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [
+   "import librosa\n",
+   "librosa.stft?\n",
+   "librosa.display.specshow?"
+  ],
+  "language": "python",
+  "metadata": {},
+  "outputs": [],
+  "prompt_number": 4
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Use NMF to decompose the spectrogram:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [
+   "from sklearn.decomposition import NMF\n",
+   "NMF?"
+  ],
+  "language": "python",
+  "metadata": {},
+  "outputs": [],
+  "prompt_number": 5
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Plot the decomposed matrices:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [],
+  "language": "python",
+  "metadata": {},
+  "outputs": []
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Use the inverse STFT to synthesize the separated sources:"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [
+   "librosa.istft?"
+  ],
+  "language": "python",
+  "metadata": {},
+  "outputs": [],
+  "prompt_number": 6
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Use the columns of the matrix, $W$, otherwise known as *spectral atoms*, as inputs into the kick/snare classifier that you created in an earlier exercise. Observe the results; are you able to automatically classify the separated sources?"
+  ]
+ },
+ {
+  "cell_type": "code",
+  "collapsed": false,
+  "input": [],
+  "language": "python",
+  "metadata": {},
+  "outputs": []
+ },
+ {
+  "cell_type": "heading",
+  "level": 2,
+  "metadata": {},
+  "source": [
+   "Bonus"
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Use different audio files."
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "Alter the rank of the decomposition, `n_components`. What happens when `n_components` is too large? too small?"
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "metadata": {},
+  "source": [
+   "NMF is a useful preprocessor for MIR tasks such as music transcription. Using the steps above, build your own simple transcription system that returns a sequence of note events, `[(onset time, class label, volume/gain)...]`."
   ]
  }
 ],
...