Commit af45602f authored by Steve Tjoa

consolidating knn

parent 153699e2
{
"metadata": {
"name": "",
"signature": "sha256:97e40b9cbaea16ddd001aa4bbf580d8f45ec3ea70b50585d1e2ccd18038071ad"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -75,8 +75,7 @@
"source": [
"1. [Spectral Features](notebooks/spectral_features.ipynb)\n",
"1. [Mel-Frequency Cepstral Coefficients](notebooks/mfcc.ipynb)\n",
"1. [K-Nearest Neighbor Instrument Classification](exercises/knn_instrument_classification.ipynb)"
]
},
{
......
{
"metadata": {
"name": "",
"signature": "sha256:54325d7e3e47937b6b8ae7289c0ca91f6c55e20581cdc44be41e66bf323c0907"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"k-Nearest Neighbour\n",
"-------------------\n",
"\n",
"My first audio classifier: introducing k-NN! \n",
"\n",
"We can now appreciate why we need additional intelligence in our systems - heuristics can't go very far in the world of complex audio signals. We'll be using scikit-learn's implementation of k-NN for our work here. It proves to be a straightforward, easy-to-use implementation, and the steps and skills of working with one classifier will scale nicely to working with other, more complex classifiers."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Building Models\n",
"\n",
"1. Examine the feature array for the various snare samples. What do you notice? \n",
"\n",
"2. Since the features are on different scales, we will want to normalize each feature vector to a common range, storing the scaling coefficients for later use. Many techniques exist for scaling your features. We'll use linear scaling, which forces the features into the range -1 to 1.\n",
"\n",
"    For this, we'll use a scikit-learn class called [MinMaxScaler](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html). `MinMaxScaler.fit_transform` returns an array of scaled values and retains the coefficients used to scale each column into -1 to 1. Use these functions in your code."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn import preprocessing\n",
"import numpy as np\n",
"\n",
"scaler = preprocessing.MinMaxScaler(feature_range = (-1, 1))\n",
"training_features = scaler.fit_transform(np.concatenate((features_snare, features_kick)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'features_snare' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-9-48852d445498>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mscaler\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpreprocessing\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mMinMaxScaler\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeature_range\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mtraining_features\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mscaler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit_transform\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconcatenate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeatures_snare\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfeatures_kick\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'features_snare' is not defined"
]
}
],
"prompt_number": 9
},
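Since `features_snare` and `features_kick` are never defined in this notebook, here is a minimal runnable sketch of the same scaling step using synthetic stand-in arrays (the shapes and value ranges are assumptions for illustration only):

```python
from sklearn import preprocessing
import numpy as np

np.random.seed(0)

# Hypothetical stand-ins for the notebook's snare/kick feature arrays:
# 10 training examples each, 5 features per example.
features_snare = np.random.rand(10, 5) * 100
features_kick = np.random.rand(10, 5) * 100

# Fit the scaler on the combined training set; note that np.concatenate
# takes a tuple (or list) of arrays as its first argument.
scaler = preprocessing.MinMaxScaler(feature_range=(-1, 1))
training_features = scaler.fit_transform(np.concatenate((features_snare, features_kick)))

print(training_features.shape)  # (20, 5); each column now spans -1 to 1
```

The fitted `scaler` keeps the per-column scaling coefficients, so the same transform can be reapplied to test data later.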
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Build a k-NN model for the snare drums using scikit.learn's [KNeighborsClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) class."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn.neighbors import KNeighborsClassifier\n",
"\n",
"model_snare = KNeighborsClassifier(n_neighbors = 1)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The labels describe which \"class\" (snare or non-snare, i.e., kick, in this case) the features belong to. This is formed as an array of ones and twos (class labels) corresponding to the 10 snares and 10 kicks in our training sample set. For example..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"drum_labels = np.empty(20, np.int32)\n",
"drum_labels[0:10] = 1 # First 10 are the first sample type, e.g. snare\n",
"drum_labels[10:20] = 2 # Second 10 are the second sample type, e.g. kick\n",
"drum_labels"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
"array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2], dtype=int32)"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This k-NN model uses 5 features, 2 output classes (the labels), k = 1, and takes in the feature data via the feature array `training_features`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"model_snare.fit(training_features, drum_labels)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'training_features' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-8-d86eaf5a2bd0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel_snare\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtraining_features\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdrum_labels\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'training_features' is not defined"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These labels indicate which sample in our feature data is a snare vs. a non-snare. The k-NN model uses this information to build a means of comparison and classification. It is really important that you get these labels correct, because they are the crux of all future classifications made later on. (Trust me, I've made many mistakes in this area - training models with incorrect label data.)\n",
"\n",
"4. Create a script which extracts features for a single file, re-scales its feature values, and evaluates them with your k-NN classifier. \n",
"\n",
"### Evaluating samples with your k-NN\n",
"\n",
"Now that the hard part is done, it's time to throw some feature data through the trained k-NN and see what it outputs. \n",
" \n",
"### Rescaling\n",
"\n",
"In evaluating a new audio file, we need to extract its features, re-scale them to the same range as the trained feature values, and then send them through the k-NN."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# This uses the previously calculated linear scaling parameters to adjust the incoming features to the same range. \n",
"features_scaled = scaler.transform(testing_features)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'features' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-5-3e535d776ccc>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# This uses the previous calculated linear scaling parameters to adjust the incoming features to the same range.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mfeaturesScaled\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mscaler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtransform\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeatures\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'features' is not defined"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluating with k-NN"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"model_output = model_snare.predict(features_scaled)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'featuresScaled' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-6-5265c345102e>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel_output\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodel_snare\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeaturesScaled\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'featuresScaled' is not defined"
]
}
],
"prompt_number": 6
},
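The steps above can be sketched end-to-end in one self-contained, runnable example. The feature arrays here are synthetic stand-ins (two well-separated clusters), not the workshop's actual drum features, so the cluster means and shapes are assumptions:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier

np.random.seed(0)

# Synthetic stand-ins: "snares" cluster around +10, "kicks" around -10.
features_snare = np.random.randn(10, 5) + 10
features_kick = np.random.randn(10, 5) - 10

# 1. Scale training features into [-1, 1], keeping the scaler for later.
scaler = MinMaxScaler(feature_range=(-1, 1))
training_features = scaler.fit_transform(np.concatenate((features_snare, features_kick)))

# 2. Label the examples: 1 = snare, 2 = kick.
drum_labels = np.empty(20, np.int32)
drum_labels[0:10] = 1
drum_labels[10:20] = 2

# 3. Train the k-NN model.
model_snare = KNeighborsClassifier(n_neighbors=1)
model_snare.fit(training_features, drum_labels)

# 4. Rescale new features with the *same* fitted scaler, then classify.
testing_features = np.random.randn(5, 5) + 10   # five snare-like examples
features_scaled = scaler.transform(testing_features)
model_output = model_snare.predict(features_scaled)
print(model_output)  # five snare-like inputs -> five 1s
```

Because k = 1 and the training points are all distinct, feeding `training_features` back into `model_snare.predict` reproduces `drum_labels` exactly - the 100% training accuracy mentioned below.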
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`model_output` is an array indicating whether each example is Class 1 or Class 2. If the output labels \"1\" or \"2\" aren't insightful for you, you can add an if statement to display them as the strings \"snare\" and \"kick\". Now you can visually compare the output to the training labels.\n",
" \n",
"Once you have completed writing the function, first test it with your training examples. Since a k-NN model has exact representations of the training data, it will have 100% training accuracy - every training example should be predicted correctly when fed back into the trained model. \n",
"\n",
"Now, test with the examples in https://ccrma.stanford.edu/workshops/mir2014/TestKicksCorpus.txt and https://ccrma.stanford.edu/workshops/mir2014/TestSnaresCorpus.txt. These are real-world testing samples\u2026"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
{
"metadata": {
"name": "",
"signature": "sha256:d8de82bf99f9c0800bc2de937bab7ce5628a6d0a706db488c63513c6d6a80fcc"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"k-Nearest Neighbour"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"My first audio classifier: introducing k-NN! \n",
"\n",
"We can now appreciate why we need additional intelligence in our systems - heuristics can't go very far in the world of complex audio signals. We'll be using scikit-learn's implementation of k-NN for our work here. It proves to be a straightforward, easy-to-use implementation, and the steps and skills of working with one classifier will scale nicely to working with other, more complex classifiers."
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Building Models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Examine the feature array for the various snare samples. What do you notice? \n",
"\n",
"2. Since the features are on different scales, we will want to normalize each feature vector to a common range, storing the scaling coefficients for later use. Many techniques exist for scaling your features. We'll use linear scaling, which forces the features into the range -1 to 1.\n",
"\n",
"    For this, we'll use a scikit-learn class called [MinMaxScaler](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html). `MinMaxScaler.fit_transform` returns an array of scaled values and retains the coefficients used to scale each column into -1 to 1. Use these functions in your code."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn import preprocessing\n",
"import numpy as np\n",
"\n",
"scaler = preprocessing.MinMaxScaler(feature_range = (-1, 1))\n",
"training_features = scaler.fit_transform(np.concatenate((features_snare, features_kick)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'features_snare' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-2-48852d445498>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mscaler\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpreprocessing\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mMinMaxScaler\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeature_range\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mtraining_features\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mscaler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit_transform\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconcatenate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeatures_snare\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfeatures_kick\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'features_snare' is not defined"
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Build a k-NN model for the snare drums using scikit.learn's [KNeighborsClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) class."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sklearn.neighbors import KNeighborsClassifier\n",
"model_snare = KNeighborsClassifier(n_neighbors = 1)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The labels describe which \"class\" (snare or non-snare, i.e., kick, in this case) the features belong to. This is formed as an array of ones and twos (class labels) corresponding to the 10 snares and 10 kicks in our training sample set. For example..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"drum_labels = np.empty(20, np.int32)\n",
"drum_labels[0:10] = 1 # First 10 are the first sample type, e.g. snare\n",
"drum_labels[10:20] = 2 # Second 10 are the second sample type, e.g. kick\n",
"drum_labels"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2], dtype=int32)"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This k-NN model uses 5 features, 2 output classes (the labels), k = 1, and takes in the feature data via the feature array `training_features`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"model_snare.fit(training_features, drum_labels)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'model_snare' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-4-d86eaf5a2bd0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel_snare\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtraining_features\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdrum_labels\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'model_snare' is not defined"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These labels indicate which sample in our feature data is a snare vs. a non-snare. The k-NN model uses this information to build a means of comparison and classification. It is really important that you get these labels correct, because they are the crux of all future classifications made later on. (Trust me, I've made many mistakes in this area - training models with incorrect label data.)\n",
"\n",
"4. Create a script which extracts features for a single file, re-scales its feature values, and evaluates them with your k-NN classifier. \n",
"\n",
"### Evaluating samples with your k-NN\n",
"\n",
"Now that the hard part is done, it's time to throw some feature data through the trained k-NN and see what it outputs. \n",
" \n",
"### Rescaling\n",
"\n",
"In evaluating a new audio file, we need to extract its features, re-scale them to the same range as the trained feature values, and then send them through the k-NN."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# This uses the previously calculated linear scaling parameters to adjust the incoming features to the same range. \n",
"features_scaled = scaler.transform(testing_features)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'testing_features' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-5-3484e3104f25>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# This uses the previous calculated linear scaling parameters to adjust the incoming features to the same range.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mfeatures_scaled\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mscaler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtransform\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtesting_features\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'testing_features' is not defined"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluating with k-NN"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"model_output = model_snare.predict(features_scaled)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'model_snare' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-6-c597eeb67e8f>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmodel_output\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmodel_snare\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfeatures_scaled\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'model_snare' is not defined"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`model_output` is an array indicating whether each example is Class 1 or Class 2. If the output labels \"1\" or \"2\" aren't insightful for you, you can add an if statement to display them as the strings \"snare\" and \"kick\". Now you can visually compare the output to the training labels.\n",
" \n",
"Once you have completed writing the function, first test it with your training examples. Since a k-NN model has exact representations of the training data, it will have 100% training accuracy - every training example should be predicted correctly when fed back into the trained model. \n",
"\n",
"Now, test with the examples in https://ccrma.stanford.edu/workshops/mir2014/TestKicksCorpus.txt and https://ccrma.stanford.edu/workshops/mir2014/TestSnaresCorpus.txt. These are real-world testing samples\u2026"
]
},
{
"cell_type": "heading",
"level": 1,
......