Commit 249c91cd authored by Steve Tjoa's avatar Steve Tjoa

initial commit: old workshop material

parent 112c18c1
{
"metadata": {
"name": "",
"signature": "sha256:417a167478ba4dfbc3a0ee4785668d8f8953045aa65b6a192f488670ef2b691d"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Lab 2\n",
"=====\n",
"\n",
"Purpose: To gain an understanding of feature extraction, windowing, MFCCs.\n",
"\n",
"SECTION 1 SEGMENTING INTO EVERY N ms FRAMES\n",
"-------------------------------------------\n",
"\n",
"Segmenting: chopping up a signal into frames every N milliseconds\n",
"\n",
"Previously, we've either chopped up a signal by the locations of its onsets (taking the 100 ms following each onset) or analyzed the entire file at once.\n",
"Analyzing the audio file by \"frames\" is another technique for your arsenal that is good for analyzing entire songs, phrases, or non-onset-based audio examples.\n",
"You can easily chop up the audio into frames every, say, 100 ms, with a for loop.\n",
"\n",
"    frameSize = round(0.100 * fs); % 100 ms, rounded to an integer number of samples\n",
"    for i = 1 : frameSize : (length(x)-frameSize+1)\n",
"        currentFrame = x(i:i+frameSize-1); % this is the current audio frame\n",
"        % Now, do your feature extraction here and store the features in some matrix / array\n",
"    end\n",
"\n",
"Very often, you will want some overlap between the audio frames - for example, taking a 100 ms frame but sliding it only 50 ms each time. To use a 100 ms frame with 50% overlap, try:\n",
"\n",
"    frameSize = round(0.100 * fs);  % 100 ms\n",
"    hop = 0.5;                      % 50% overlap\n",
"    for i = 1 : round(hop*frameSize) : (length(x)-frameSize+1)\n",
" ...\n",
" end\n",
"\n",
"Note that it's also important to multiply each frame by a window (e.g., a Hamming or Hann window) of length equal to the frame size, to smoothly transition between the frames.\n",
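"\n",
"For example, a minimal sketch of windowing each frame (this assumes the Signal Processing Toolbox `hann` function is available):\n",
"\n",
"    frameSize = round(0.100 * fs);                % 100 ms\n",
"    w = hann(frameSize);                          % Hann window, same length as a frame\n",
"    for i = 1 : frameSize : (length(x)-frameSize+1)\n",
"        windowedFrame = x(i:i+frameSize-1) .* w;  % taper the frame edges toward zero\n",
"        % ... do your feature extraction on windowedFrame ...\n",
"    end\n",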
"\n",
"SECTION 2 MFCC\n",
"--------------\n",
"\n",
"Load an audio file of your choosing from the audio folder on `/usr/ccrma/courses/mir2012/audio`.\n",
"Use this as an opportunity to explore this collection.\n",
"\n",
"BAG OF FRAMES\n",
"\n",
"Test out MFCC to make sure that you know how to call it. We'll use the CATbox implementation of MFCC.\n",
"\n",
"    currentFrameIndex = 1;\n",
"    for i = 1 : frameSize : (length(x)-frameSize+1)\n",
"        currentFrame = x(i:i+frameSize-1) + eps; % this is the current audio frame\n",
"        % Note that we add eps to prevent divide-by-zero errors.\n",
"        % Now, do your other feature extraction here.\n",
"        % The following generates MFCC coefficients for the current frame.\n",
"        mfceps = mfcc(currentFrame, fs)'; % note the transpose operator!\n",
"        delta_mfceps = mfceps - [zeros(1,size(mfceps,2)); mfceps(1:end-1,:)]; % first delta\n",
"        % Calculate the mean and std of the MFCCs and MFCC deltas.\n",
"        MFCC_mean(currentFrameIndex,:) = mean(mfceps);\n",
"        MFCC_std(currentFrameIndex,:) = std(mfceps);\n",
"        MFCC_delta_mean(currentFrameIndex,:) = mean(delta_mfceps);\n",
"        MFCC_delta_std(currentFrameIndex,:) = std(delta_mfceps);\n",
"        currentFrameIndex = currentFrameIndex + 1;\n",
"    end\n",
"\n",
"    features = [MFCC_mean MFCC_delta_mean]; % in this case, we'll only store the MFCC and delta-MFCC means\n",
"    % NOTE: You might want to toss out the FIRST MFCC coefficient and delta-coefficient,\n",
"    % since it's much larger than the others and only describes the total energy of the signal.\n",
"\n",
"You can include this code inside of your frame-hopping loop to extract the MFCC-values for each frame. \n",
"\n",
"Once MFCCs per frame have been calculated, consider how they can be used as features for expanding the k-NN classification and try implementing it!\n",
"\n",
"Extract the mean of the 12 MFCCs (coefficients 1-12; do not use the \"0th\" coefficient) for each onset using the code that you wrote. Add those to the feature vectors, along with zero crossing and centroid. We should now have 14 features being extracted - this is starting to get \"real world\"! With this simple example (and limited collection of audio slices), you probably won't notice a difference - but at least it didn't break, right? Try it with some other audio to truly appreciate the power of timbral classification.\n",
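"\n",
"As a sketch of assembling the final feature matrix (the names `zcr` and `centroid` are placeholders for whatever you called your zero-crossing and centroid features, and this assumes column 1 of `MFCC_mean` holds the 0th coefficient):\n",
"\n",
"    % one row per onset: zero crossings + centroid + 12 MFCC means = 14 features\n",
"    features = [zcr centroid MFCC_mean(:,2:13)];\n",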
"\n",
"\n",
"\n",
"SECTION 3 CROSS VALIDATION\n",
"--------------------------\n",
"\n",
"You'll need some of this code and information to calculate your accuracy rate on your classifiers.\n",
"\n",
"EXAMPLE\n",
"\n",
"Let's say we have 10-fold cross-validation...\n",
"\n",
"1. Divide the data set into 10 random subsets.\n",
"2. Hold out 1 subset as the test set, and train the classifier on the remaining 9.\n",
"3. Repeat so that every subset serves as the test set once, then average the accuracies.\n",
"\n",
"To achieve the first step (dividing our training set into k disjoint subsets), use the function `crossvalind.m` (posted in the Utilities).\n",
"\n",
" INDICES = CROSSVALIND('Kfold',N,K) returns randomly generated indices\n",
" for a K-fold cross-validation of N observations. INDICES contains equal\n",
" (or approximately equal) proportions of the integers 1 through K that\n",
" define a partition of the N observations into K disjoint subsets.\n",
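"\n",
"A quick sanity check of the call (a sketch, assuming `crossvalind` from the posted Utilities or the Statistics Toolbox is on your path):\n",
"\n",
"    indices = crossvalind('Kfold', 100, 10); % assign 100 observations to 10 folds\n",
"    sum(indices == 1)                        % roughly 100/10 = 10 observations land in fold 1\n",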
"\n",
"You can type `help crossvalind` to look at all the other options. This code is also posted as a template in\n",
"`/usr/ccrma/courses/mir2010/Toolboxes/crossValidation.m`:\n",
"\n",
"    % This code is provided as a template for your cross-validation\n",
"    % computation. Replace the variables \"features\" and \"labels\" with your own\n",
"    % data.\n",
"    % Likewise, you can replace the code in the \"BUILD\" and \"EVALUATE\" sections\n",
"    % to work with other types of classifiers.\n",
"    %\n",
"    %% CROSS VALIDATION\n",
"    numFolds = 10;                   % how many cross-validation folds you want (default = 10)\n",
"    numInstances = size(features,1); % total number of instances in our training set\n",
"    numFeatures = size(features,2);  % number of features per instance\n",
"    indices = crossvalind('Kfold',numInstances,numFolds); % divide the data into numFolds random subsets\n",
"    clear errors\n",
"    for i = 1:numFolds\n",
"        % SEGMENT DATA INTO FOLDS\n",
"        disp(['fold: ' num2str(i)])\n",
"        test = (indices == i); % which points are in the test set\n",
"        train = ~test;         % all points that are NOT in the test set\n",
"        % SCALE\n",
"        [trainingFeatures,mf,sf] = scale(features(train,:));\n",
"        % BUILD NEW MODEL - ADD YOUR MODEL BUILDING CODE HERE...\n",
"        model = knn(numFeatures,2,3,trainingFeatures,labels(train,:));\n",
"        % RESCALE TEST DATA TO TRAINING SCALE SPACE\n",
"        testingFeatures = rescale(features(test,:),mf,sf);\n",
"        % EVALUATE WITH TEST DATA - ADD YOUR MODEL EVALUATION CODE HERE\n",
"        [voting,model_output] = knnfwd(model,testingFeatures);\n",
"        % CONVERT labels(test,:) TO THE SAME FORMAT TO COMPUTE THE ERROR\n",
"        labels_test = zeros(size(model_output,1),1); % create array of 0s\n",
"        labels_test(labels(test,1)==1) = 1;          % convert column 1 to class 1\n",
"        labels_test(labels(test,2)==1) = 2;          % convert column 2 to class 2\n",
"        % COUNT ERRORS\n",
"        errors(i) = mean(model_output ~= labels_test); % fraction misclassified in this fold\n",
"    end\n",
"    disp(['cross validation error: ' num2str(mean(errors))])\n",
"    disp(['cross validation accuracy: ' num2str(1-mean(errors))])"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
{
"metadata": {
"name": "",
"signature": "sha256:8cb92bce0fc8cd2698239681515f405f57245c6def0ac2620d221a3f49fcad54"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"Lab 4\n",
"=====\n",
"\n",
"Summary:\n",
"\n",
"1. Separate sources.\n",
"2. Separate noisy sources.\n",
"3. Classify separated sources.\n",
"\n",
"Matlab Programming Tips\n",
"* Pressing the up and down arrows let you scroll through command history.\n",
"* A semicolon at the end of a line simply means \"suppress output\".\n",
"* Type `help <command>` for instant documentation. For example, `help wavread`, `help plot`, `help sound`. Use `help` liberally!\n",
"\n",
"\n",
"Section 1: Source Separation\n",
"----------------------------\n",
"\n",
"1. In Matlab: Select File > Set Path.\n",
"\n",
"    Select \"Add with Subfolders\".\n",
"\n",
"    Select `/usr/ccrma/courses/mir2011/lab3skt`.\n",
"\n",
"2. As in Lab 1, load the file, listen to it, and plot it.\n",
"\n",
" [x, fs] = wavread('simpleLoop.wav');\n",
" sound(x, fs)\n",
" t = (0:length(x)-1)/fs;\n",
" plot(t, x)\n",
" xlabel('Time (seconds)')\n",
"\n",
"3. Compute and plot a short-time Fourier transform, i.e., the Fourier transform over consecutive frames of the signal.\n",
"\n",
" frame_size = 0.100;\n",
" hop = 0.050;\n",
" X = parsesig(x, fs, frame_size, hop);\n",
" imagesc(abs(X(200:-1:1,:)))\n",
"\n",
" Type `help parsesig`, `help imagesc`, and `help abs` for more information.\n",
"\n",
" This step gives you some visual intuition about how sounds (might) overlap.\n",
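"\n",
"    If `parsesig` isn't on your path, a similar picture can be drawn with Matlab's built-in `spectrogram` (Signal Processing Toolbox) - just an alternative sketch, not part of the lab toolbox:\n",
"\n",
"        frameLen = round(frame_size*fs); % frame length in samples\n",
"        hopLen = round(hop*fs);          % hop length in samples\n",
"        spectrogram(x, hamming(frameLen), frameLen-hopLen, frameLen, fs, 'yaxis')\n",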
"\n",
"4. Let's separate sources!\n",
"\n",
" K = 2;\n",
" [y, W, H] = sourcesep(x, fs, K);\n",
"\n",
" Type `help sourcesep` for more information.\n",
"\n",
"5. Plot and listen to the separated signals.\n",
"\n",
" plot(t, y)\n",
" xlabel('Time (seconds)')\n",
" legend('Signal 1', 'Signal 2')\n",
" sound(y(:,1), fs)\n",
" sound(y(:,2), fs)\n",
"\n",
" Feel free to replace `Signal 1` and `Signal 2` with `Kick` and `Snare` (depending upon which is which). \n",
"\n",
"6. Plot the outputs from NMF.\n",
"\n",
" figure\n",
" plot(W(1:200,:))\n",
" legend('Signal 1', 'Signal 2')\n",
" figure\n",
" plot(H')\n",
" legend('Signal 1', 'Signal 2')\n",
"\n",
" What do you observe from `W` and `H`? \n",
"\n",
" Does it agree with the sounds you heard?\n",
"\n",
"7. Repeat the earlier steps for different audio files.\n",
"\n",
" * `125BOUNC-mono.WAV`\n",
" * `58BPM.WAV` \n",
" * `CongaGroove-mono.wav`\n",
" * `Cstrum chord_mono.wav`\n",
"\n",
" ... and more.\n",
"\n",
"8. Experiment with different values for the number of sources, `K`. \n",
"\n",
" Where does this separation method succeed? \n",
"\n",
" Where does it fail?\n",
"\n",
"\n",
"Section 2: Noise Robustness\n",
"---------------------------\n",
"\n",
"1. Begin with `simpleLoop.wav`. Then try others.\n",
"\n",
" Add noise to the input signal, plot, and listen.\n",
"\n",
" xn = x + 0.01*randn(length(x),1);\n",
" plot(t, xn)\n",
" sound(xn, fs)\n",
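"\n",
"    As a rough sanity check (a sketch; it uses the clean `x` and noisy `xn` from above), you can compute the input signal-to-noise ratio in dB:\n",
"\n",
"        noise = xn - x;                             % the noise we added\n",
"        snr_db = 10*log10(sum(x.^2)/sum(noise.^2)); % input SNR in dB\n",
"        disp(['input SNR: ' num2str(snr_db) ' dB'])\n",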
"\n",
"2. Separate, plot, and listen.\n",
"\n",
" [yn, Wn, Hn] = sourcesep(xn, fs, K);\n",
" plot(t, yn)\n",
" sound(yn(:,1), fs)\n",
" sound(yn(:,2), fs)\n",
" \n",
" How robust to noise is this separation method? \n",
"\n",
" Compared to the noisy input signal, how much noise is left in the output signals? \n",
"\n",
" Which output contains more noise? Why?\n",
"\n",
"\n",
"Section 3: Classification\n",
"-------------------------\n",
"\n",
"Follow the K-NN example in Lab 1, but classify the *separated* signals.\n",
"\n",
"As in Lab 1, extract features from each training sample in the kick and snare drum directories.\n",
"\n",
"1. Train a K-NN model using the kick and snare drum samples.\n",
"\n",
" labels=[[ones(10,1) zeros(10,1)];\n",
" [zeros(10,1) ones(10,1)]];\n",
" model_snare = knn(5, 2, 1, trainingFeatures, labels);\n",
" [voting, model_output] = knnfwd(model_snare, featuresScaled)\n",
"\n",
"2. Extract features from the drum signals that you separated in Lab 4 Section 1. \n",
"\n",
"3. Classify them using the K-NN model that you built.\n",
"\n",
" Does K-NN accurately classify the separated signals?\n",
"\n",
"4. Repeat for different numbers of separated signals (i.e., the parameter `K` in NMF). \n",
"\n",
"5. Overseparate the signal using `K = 20` or more. For those separated components that are classified as snare, add them together using `sum`. Then listen to the summed signal. Is it coherent, i.e., does it sound like a single separated drum?\n",
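"\n",
"    A sketch of that last step (this assumes `model_output` labels the columns of the overseparated `y` in order, with class 2 = snare, matching the label matrix in step 1 above):\n",
"\n",
"        snareIdx = find(model_output == 2); % components classified as snare\n",
"        snareSum = sum(y(:,snareIdx), 2);   % mix the snare components back together\n",
"        sound(snareSum, fs)\n",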
"\n",
"...and more!\n",
"\n",
"* If you have another idea that you would like to try out, please ask me!\n",
"* Feel free to collaborate with a partner. Together, brainstorm your own problems, if you want!\n",
"\n",
"Good luck!\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}