Commit cae5a9b0 authored by Steve Tjoa

genre recognition exercise

parent a96f4a15
...@@ -11,14 +11,21 @@ ...@@ -11,14 +11,21 @@
"%matplotlib inline\n", "%matplotlib inline\n",
"import seaborn\n", "import seaborn\n",
"import numpy, scipy, matplotlib.pyplot as plt, sklearn, pandas, librosa, urllib, IPython.display, os.path\n", "import numpy, scipy, matplotlib.pyplot as plt, sklearn, pandas, librosa, urllib, IPython.display, os.path\n",
"plt.rcParams['figure.figsize'] = (14,5)" "plt.rcParams['figure.figsize'] = (14, 5)"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Homework Part 2: Genre Classification" "[← Back to Index](index.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise: Genre Recognition"
] ]
}, },
{ {
...@@ -34,7 +41,7 @@ ...@@ -34,7 +41,7 @@
"source": [ "source": [
"1. Extract features from an audio signal.\n", "1. Extract features from an audio signal.\n",
"2. Train a genre classifier.\n", "2. Train a genre classifier.\n",
"3. Use the classifier to classify genre in a song." "3. Use the classifier to classify the genre in a song."
] ]
}, },
{ {
...@@ -59,10 +66,10 @@ ...@@ -59,10 +66,10 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"filename1 = 'brahms_hungarian_dance_5.mp3'\n", "filename_brahms = 'brahms_hungarian_dance_5.mp3'\n",
"url = \"http://audio.musicinformationretrieval.com/\" + filename1\n", "url = \"http://audio.musicinformationretrieval.com/\" + filename_brahms\n",
"if not os.path.exists(filename1):\n", "if not os.path.exists(filename_brahms):\n",
" urllib.urlretrieve(url, filename=filename1)" " urllib.urlretrieve(url, filename=filename_brahms)"
] ]
}, },
{ {
...@@ -91,7 +98,7 @@ ...@@ -91,7 +98,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"x1, fs1 = librosa.load(filename1, duration=120)" "x_brahms, fs_brahms = librosa.load(filename_brahms, duration=120)"
] ]
}, },
{ {
...@@ -109,7 +116,7 @@ ...@@ -109,7 +116,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"plt.plot?" "librosa.display.waveplot?"
] ]
}, },
{ {
...@@ -185,7 +192,15 @@ ...@@ -185,7 +192,15 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc1 = librosa.feature.mfcc(x1, sr=fs1, n_mfcc=12).T" "n_mfcc = 12\n",
"mfcc_brahms = librosa.feature.mfcc(x_brahms, sr=fs_brahms, n_mfcc=n_mfcc).T"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We transpose the result to accommodate scikit-learn which assumes that each row is one observation, and each column is one feature dimension:"
] ]
}, },
{ {
...@@ -196,7 +211,7 @@ ...@@ -196,7 +211,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc1.shape" "mfcc_brahms.shape"
] ]
}, },
{ {
...@@ -225,7 +240,7 @@ ...@@ -225,7 +240,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc1_scaled = scaler.fit_transform(mfcc1)" "mfcc_brahms_scaled = scaler.fit_transform(mfcc_brahms)"
] ]
}, },
{ {
...@@ -243,7 +258,7 @@ ...@@ -243,7 +258,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc1_scaled.mean(axis=0)" "mfcc_brahms_scaled.mean(axis=0)"
] ]
}, },
{ {
...@@ -254,14 +269,14 @@ ...@@ -254,14 +269,14 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc1_scaled.std(axis=0)" "mfcc_brahms_scaled.std(axis=0)"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**Repeat steps 1 and 2 for another audio file:**" "### Step 2b: Repeat steps 1 and 2 for another audio file."
] ]
}, },
{ {
...@@ -272,8 +287,8 @@ ...@@ -272,8 +287,8 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"filename2 = 'busta_rhymes_hits_for_days.mp3'\n", "filename_busta = 'busta_rhymes_hits_for_days.mp3'\n",
"url = \"http://audio.musicinformationretrieval.com/\" + filename2" "url = \"http://audio.musicinformationretrieval.com/\" + filename_busta"
] ]
}, },
{ {
...@@ -324,7 +339,8 @@ ...@@ -324,7 +339,8 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"# Your code here. Load the second audio file in the same manner as the first audio file." "# Your code here. Load the second audio file in the same manner as the first audio file.\n",
"# x_busta, fs_busta = "
] ]
}, },
{ {
...@@ -349,7 +365,7 @@ ...@@ -349,7 +365,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Plot the time-domain waveform and spectrogram of the second audio file. In what ways does the time-domain waveform look different than the first audio file? What differences in musical attributes might this reflect? What additional insights are gained from plotting the spectrogram? **Explain.**" "Plot the time-domain waveform and spectrogram of the second audio file. In what ways does the time-domain waveform look different than the first audio file? What differences in musical attributes might this reflect? What additional insights are gained from plotting the spectrogram? Explain."
] ]
}, },
{ {
...@@ -401,25 +417,30 @@ ...@@ -401,25 +417,30 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**[Please share your answer in this editable text cell.]**" "Extract MFCCs from the second audio file. Be sure to transpose the resulting matrix such that each row is one observation, i.e. one set of MFCCs. Also be sure that the shape and size of the resulting MFCC matrix is equivalent to that for the first audio file."
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "code",
"metadata": {}, "execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [ "source": [
"Extract MFCCs from the second audio file. Be sure to transpose the resulting matrix such that each row is one observation, i.e. one set of MFCCs. Also be sure that the shape and size of the resulting MFCC matrix is equivalent to that for the first audio file." "librosa.feature.mfcc?"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": { "metadata": {
"collapsed": false "collapsed": true
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"librosa.feature.mfcc?" "# Your code here:\n",
"# mfcc_busta ="
] ]
}, },
{ {
...@@ -430,7 +451,7 @@ ...@@ -430,7 +451,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc2.shape" "mfcc_busta.shape"
] ]
}, },
{ {
...@@ -451,6 +472,18 @@ ...@@ -451,6 +472,18 @@
"scaler.transform?" "scaler.transform?"
] ]
}, },
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Your code here:\n",
"# mfcc_busta_scaled ="
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
...@@ -466,7 +499,7 @@ ...@@ -466,7 +499,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc2_scaled.mean?" "mfcc_busta_scaled.mean?"
] ]
}, },
{ {
...@@ -477,7 +510,7 @@ ...@@ -477,7 +510,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"mfcc2_scaled.std?" "mfcc_busta_scaled.std?"
] ]
}, },
{ {
...@@ -502,7 +535,7 @@ ...@@ -502,7 +535,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"features = numpy.vstack((mfcc1_scaled, mfcc2_scaled))" "features = numpy.vstack((mfcc_brahms_scaled, mfcc_busta_scaled))"
] ]
}, },
{ {
...@@ -531,7 +564,7 @@ ...@@ -531,7 +564,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"labels = numpy.concatenate((numpy.zeros(len(mfcc1_scaled)), numpy.ones(len(mfcc2_scaled))))" "labels = numpy.concatenate((numpy.zeros(len(mfcc_brahms_scaled)), numpy.ones(len(mfcc_busta_scaled))))"
] ]
}, },
{ {
...@@ -604,7 +637,7 @@ ...@@ -604,7 +637,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"x1_test, fs1 = librosa.load(filename1, duration=10, offset=120)" "x_brahms_test, fs_brahms = librosa.load(filename_brahms, duration=10, offset=120)"
] ]
}, },
{ {
...@@ -615,7 +648,7 @@ ...@@ -615,7 +648,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"x2_test, fs2 = librosa.load(filename2, duration=10, offset=120)" "x_busta_test, fs_busta = librosa.load(filename_busta, duration=10, offset=120)"
] ]
}, },
{ {
...@@ -806,13 +839,6 @@ ...@@ -806,13 +839,6 @@
"# Your code here." "# Your code here."
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**[Explain your approach in this editable text cell.]**"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
...@@ -835,7 +861,7 @@ ...@@ -835,7 +861,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1 = pandas.DataFrame(mfcc1_test_scaled)" "df_brahms = pandas.DataFrame(mfcc_brahms_test_scaled)"
] ]
}, },
{ {
...@@ -846,7 +872,7 @@ ...@@ -846,7 +872,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1.shape" "df_brahms.shape"
] ]
}, },
{ {
...@@ -857,7 +883,7 @@ ...@@ -857,7 +883,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1.head()" "df_brahms.head()"
] ]
}, },
{ {
...@@ -868,7 +894,7 @@ ...@@ -868,7 +894,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df2 = pandas.DataFrame(mfcc2_test_scaled)" "df_busta = pandas.DataFrame(mfcc_busta_test_scaled)"
] ]
}, },
{ {
...@@ -886,7 +912,7 @@ ...@@ -886,7 +912,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1.corr()" "df_brahms.corr()"
] ]
}, },
{ {
...@@ -897,14 +923,7 @@ ...@@ -897,14 +923,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df2.corr()" "df_busta.corr()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**[Explain your answer in this editable text cell.]**"
] ]
}, },
{ {
...@@ -922,7 +941,7 @@ ...@@ -922,7 +941,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1.plot.scatter?" "df_brahms.plot.scatter?"
] ]
}, },
{ {
...@@ -940,7 +959,7 @@ ...@@ -940,7 +959,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df2.plot.scatter?" "df_busta.plot.scatter?"
] ]
}, },
{ {
...@@ -958,7 +977,7 @@ ...@@ -958,7 +977,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1[0].plot.hist()" "df_brahms[0].plot.hist()"
] ]
}, },
{ {
...@@ -969,7 +988,7 @@ ...@@ -969,7 +988,7 @@
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"df1[11].plot.hist()" "df_busta[11].plot.hist()"
] ]
}, },
{ {
...@@ -999,6 +1018,13 @@ ...@@ -999,6 +1018,13 @@
"source": [ "source": [
"Create a new genre classifier by repeating the steps above, but this time use different features. Consult the [librosa documentation on feature extraction](http://librosa.github.io/librosa/feature.html) for different choices of features. Which features work well? not well?" "Create a new genre classifier by repeating the steps above, but this time use different features. Consult the [librosa documentation on feature extraction](http://librosa.github.io/librosa/feature.html) for different choices of features. Which features work well? not well?"
] ]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[← Back to Index](index.html)"
]
} }
], ],
"metadata": { "metadata": {
......
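Review note on the scaling and stacking cells: the notebook fits the scaler on the Brahms MFCCs with `fit_transform`, and the `scaler.transform?` hint suggests the Busta Rhymes MFCCs are meant to be scaled with that same fitted scaler before `numpy.vstack` and the label vector are built. The sketch below walks through that flow on synthetic data, so it runs without audio files or librosa. The helpers `fit_standardizer` and `apply_standardizer` are hypothetical stand-ins written in plain NumPy; they are assumed to mirror a standard-scaling step (per-column mean removal and division by the population standard deviation), not the notebook's actual `scaler` object.

```python
import numpy as np

def fit_standardizer(features):
    """Return the per-column mean and population standard deviation (ddof=0)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return mean, std

def apply_standardizer(features, mean, std):
    """Scale features with statistics that were fitted elsewhere."""
    return (features - mean) / std

rng = np.random.default_rng(0)
n_mfcc = 12

# Synthetic stand-ins for the transposed MFCC matrices:
# each row is one frame (observation), each column is one coefficient.
mfcc_a = rng.normal(loc=5.0, scale=2.0, size=(300, n_mfcc))
mfcc_b = rng.normal(loc=-3.0, scale=4.0, size=(200, n_mfcc))

# Fit on the first signal only, then reuse the same statistics for the second.
mean, std = fit_standardizer(mfcc_a)
mfcc_a_scaled = apply_standardizer(mfcc_a, mean, std)
mfcc_b_scaled = apply_standardizer(mfcc_b, mean, std)

# Stack observations row-wise and build one label per row (0 = first, 1 = second).
features = np.vstack((mfcc_a_scaled, mfcc_b_scaled))
labels = np.concatenate((np.zeros(len(mfcc_a_scaled)), np.ones(len(mfcc_b_scaled))))
```

Because the statistics come from the first signal alone, only the first scaled matrix has zero mean and unit variance per column. The second one generally does not, which is presumably what the `mfcc_busta_scaled.mean?` and `mfcc_busta_scaled.std?` cells are meant to let students observe.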
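Review note on the download cell: `urllib.urlretrieve` exists only in Python 2. In Python 3 the function lives at `urllib.request.urlretrieve`, so the cell as written would raise an `AttributeError` there. A minimal sketch of a Python 3 version of the same download-if-missing pattern follows; the helper name `fetch_if_missing` is hypothetical and not part of the notebook.

```python
import os.path
import urllib.request

def fetch_if_missing(url, filename):
    """Download url to filename unless the file already exists; return filename."""
    if not os.path.exists(filename):
        # Python 3 location of the function the notebook calls as urllib.urlretrieve.
        urllib.request.urlretrieve(url, filename=filename)
    return filename
```

In the notebook this would be called as `fetch_if_missing(url, filename_brahms)`, with `url` built exactly as in the existing cell.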