genre recognition exercise

Commit cae5a9b0 authored Feb 13, 2017 by Steve Tjoa

Showing with 174 additions and 119 deletions

exercise_genre_recognition.html exercise_genre_recognition.html +92 -63

exercise_genre_recognition.ipynb exercise_genre_recognition.ipynb +82 -56
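
For orientation, the workflow that the renamed notebook variables follow (transpose the MFCC matrix, standardize it, stack both recordings, build labels) can be sketched roughly as below. This is a minimal illustration, not the notebook's code: the random arrays are hypothetical stand-ins for the output of librosa.feature.mfcc, the names mfcc_a and mfcc_b are invented for the example, and the manual mean/std arithmetic is an assumed equivalent of the notebook's scaler (fit on the first recording, then reused on the second).

```python
import numpy

rng = numpy.random.default_rng(0)
n_mfcc = 12

# Hypothetical stand-ins for MFCC matrices of shape (n_mfcc, n_frames).
# Transposing gives one row per observation and one column per feature.
mfcc_a = rng.normal(loc=5.0, scale=2.0, size=(n_mfcc, 400)).T
mfcc_b = rng.normal(loc=-3.0, scale=4.0, size=(n_mfcc, 300)).T

# Assumption: the scaling statistics come from the first recording only,
# and the same statistics are then applied to the second recording.
mean = mfcc_a.mean(axis=0)
std = mfcc_a.std(axis=0)
mfcc_a_scaled = (mfcc_a - mean) / std
mfcc_b_scaled = (mfcc_b - mean) / std

# Stack both recordings into one feature table, with label 0 for the
# first recording and label 1 for the second.
features = numpy.vstack((mfcc_a_scaled, mfcc_b_scaled))
labels = numpy.concatenate((numpy.zeros(len(mfcc_a_scaled)),
                            numpy.ones(len(mfcc_b_scaled))))

print(features.shape, labels.shape)
```

Only the first recording ends up with exactly zero mean and unit variance per column. The second is expressed relative to the first recording's statistics, which is what lets a classifier compare the two.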

--- a/exercise_genre_recognition.html
+++ b/exercise_genre_recognition.html
--- a/exercise_genre_recognition.ipynb
+++ b/exercise_genre_recognition.ipynb
@@ -11,14 +11,21 @@
    "%matplotlib inline\n",
    "import seaborn\n",
    "import numpy, scipy, matplotlib.pyplot as plt, sklearn, pandas, librosa, urllib, IPython.display, os.path\n",
-    "plt.rcParams['figure.figsize'] = (14,5)"
+    "plt.rcParams['figure.figsize'] = (14, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Homework Part 2: Genre Classification"
+    "[&larr; Back to Index](index.html)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Exercise: Genre Recognition"
   ]
  },
  {
@@ -34,7 +41,7 @@
   "source": [
    "1. Extract features from an audio signal.\n",
    "2. Train a genre classifier.\n",
-    "3. Use the classifier to classify genre in a song."
+    "3. Use the classifier to classify the genre in a song."
   ]
  },
  {
@@ -59,10 +66,10 @@
   },
   "outputs": [],
   "source": [
-    "filename1 = 'brahms_hungarian_dance_5.mp3'\n",
+    "filename_brahms = 'brahms_hungarian_dance_5.mp3'\n",
-    "url = \"http://audio.musicinformationretrieval.com/\" + filename1\n",
+    "url = \"http://audio.musicinformationretrieval.com/\" + filename_brahms\n",
-    "if not os.path.exists(filename1):\n",
+    "if not os.path.exists(filename_brahms):\n",
-    "    urllib.urlretrieve(url, filename=filename1)"
+    "    urllib.urlretrieve(url, filename=filename_brahms)"
   ]
  },
  {
@@ -91,7 +98,7 @@
   },
   "outputs": [],
   "source": [
-    "x1, fs1 = librosa.load(filename1, duration=120)"
+    "x_brahms, fs_brahms = librosa.load(filename_brahms, duration=120)"
   ]
  },
  {
@@ -109,7 +116,7 @@
   },
   "outputs": [],
   "source": [
-    "plt.plot?"
+    "librosa.display.waveplot?"
   ]
  },
  {
@@ -185,7 +192,15 @@
   },
   "outputs": [],
   "source": [
-    "mfcc1 = librosa.feature.mfcc(x1, sr=fs1, n_mfcc=12).T"
+    "n_mfcc = 12\n",
+    "mfcc_brahms = librosa.feature.mfcc(x_brahms, sr=fs_brahms, n_mfcc=n_mfcc).T"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We transpose the result to accommodate scikit-learn, which assumes that each row is one observation and each column is one feature dimension:"
   ]
  },
  {
@@ -196,7 +211,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc1.shape"
+    "mfcc_brahms.shape"
   ]
  },
  {
@@ -225,7 +240,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc1_scaled = scaler.fit_transform(mfcc1)"
+    "mfcc_brahms_scaled = scaler.fit_transform(mfcc_brahms)"
   ]
  },
  {
@@ -243,7 +258,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc1_scaled.mean(axis=0)"
+    "mfcc_brahms_scaled.mean(axis=0)"
   ]
  },
  {
@@ -254,14 +269,14 @@
   },
   "outputs": [],
   "source": [
-    "mfcc1_scaled.std(axis=0)"
+    "mfcc_brahms_scaled.std(axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "**Repeat steps 1 and 2 for another audio file:**"
+    "### Step 2b: Repeat steps 1 and 2 for another audio file."
   ]
  },
  {
@@ -272,8 +287,8 @@
   },
   "outputs": [],
   "source": [
-    "filename2 = 'busta_rhymes_hits_for_days.mp3'\n",
+    "filename_busta = 'busta_rhymes_hits_for_days.mp3'\n",
-    "url = \"http://audio.musicinformationretrieval.com/\" + filename2"
+    "url = \"http://audio.musicinformationretrieval.com/\" + filename_busta"
   ]
  },
  {
@@ -324,7 +339,8 @@
   },
   "outputs": [],
   "source": [
-    "# Your code here. Load the second audio file in the same manner as the first audio file."
+    "# Your code here. Load the second audio file in the same manner as the first audio file.\n",
+    "# x_busta, fs_busta = "
   ]
  },
  {
@@ -349,7 +365,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Plot the time-domain waveform and spectrogram of the second audio file. In what ways does the time-domain waveform look different than the first audio file? What differences in musical attributes might this reflect? What additional insights are gained from plotting the spectrogram? **Explain.**"
+    "Plot the time-domain waveform and spectrogram of the second audio file. In what ways does the time-domain waveform look different than the first audio file? What differences in musical attributes might this reflect? What additional insights are gained from plotting the spectrogram? Explain."
   ]
  },
  {
@@ -401,25 +417,30 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "**[Please share your answer in this editable text cell.]**"
+    "Extract MFCCs from the second audio file. Be sure to transpose the resulting matrix such that each row is one observation, i.e. one set of MFCCs. Also be sure that the shape and size of the resulting MFCC matrix is equivalent to that for the first audio file."
   ]
  },
  {
-   "cell_type": "markdown",
+   "cell_type": "code",
-   "metadata": {},
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
   "source": [
-    "Extract MFCCs from the second audio file. Be sure to transpose the resulting matrix such that each row is one observation, i.e. one set of MFCCs. Also be sure that the shape and size of the resulting MFCC matrix is equivalent to that for the first audio file."
+    "librosa.feature.mfcc?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
-    "collapsed": false
+    "collapsed": true
   },
   "outputs": [],
   "source": [
-    "librosa.feature.mfcc?"
+    "# Your code here:\n",
+    "# mfcc_busta ="
   ]
  },
  {
@@ -430,7 +451,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc2.shape"
+    "mfcc_busta.shape"
   ]
  },
  {
@@ -451,6 +472,18 @@
    "scaler.transform?"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# Your code here:\n",
+    "# mfcc_busta_scaled ="
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -466,7 +499,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc2_scaled.mean?"
+    "mfcc_busta_scaled.mean?"
   ]
  },
  {
@@ -477,7 +510,7 @@
   },
   "outputs": [],
   "source": [
-    "mfcc2_scaled.std?"
+    "mfcc_busta_scaled.std?"
   ]
  },
  {
@@ -502,7 +535,7 @@
   },
   "outputs": [],
   "source": [
-    "features = numpy.vstack((mfcc1_scaled, mfcc2_scaled))"
+    "features = numpy.vstack((mfcc_brahms_scaled, mfcc_busta_scaled))"
   ]
  },
  {
@@ -531,7 +564,7 @@
   },
   "outputs": [],
   "source": [
-    "labels = numpy.concatenate((numpy.zeros(len(mfcc1_scaled)), numpy.ones(len(mfcc2_scaled))))"
+    "labels = numpy.concatenate((numpy.zeros(len(mfcc_brahms_scaled)), numpy.ones(len(mfcc_busta_scaled))))"
   ]
  },
  {
@@ -604,7 +637,7 @@
   },
   "outputs": [],
   "source": [
-    "x1_test, fs1 = librosa.load(filename1, duration=10, offset=120)"
+    "x_brahms_test, fs_brahms = librosa.load(filename_brahms, duration=10, offset=120)"
   ]
  },
  {
@@ -615,7 +648,7 @@
   },
   "outputs": [],
   "source": [
-    "x2_test, fs2 = librosa.load(filename2, duration=10, offset=120)"
+    "x_busta_test, fs_busta = librosa.load(filename_busta, duration=10, offset=120)"
   ]
  },
  {
@@ -806,13 +839,6 @@
    "# Your code here."
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**[Explain your approach in this editable text cell.]**"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -835,7 +861,7 @@
   },
   "outputs": [],
   "source": [
-    "df1 = pandas.DataFrame(mfcc1_test_scaled)"
+    "df_brahms = pandas.DataFrame(mfcc_brahms_test_scaled)"
   ]
  },
  {
@@ -846,7 +872,7 @@
   },
   "outputs": [],
   "source": [
-    "df1.shape"
+    "df_brahms.shape"
   ]
  },
  {
@@ -857,7 +883,7 @@
   },
   "outputs": [],
   "source": [
-    "df1.head()"
+    "df_brahms.head()"
   ]
  },
  {
@@ -868,7 +894,7 @@
   },
   "outputs": [],
   "source": [
-    "df2 = pandas.DataFrame(mfcc2_test_scaled)"
+    "df_busta = pandas.DataFrame(mfcc_busta_test_scaled)"
   ]
  },
  {
@@ -886,7 +912,7 @@
   },
   "outputs": [],
   "source": [
-    "df1.corr()"
+    "df_brahms.corr()"
   ]
  },
  {
@@ -897,14 +923,7 @@
   },
   "outputs": [],
   "source": [
-    "df2.corr()"
+    "df_busta.corr()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**[Explain your answer in this editable text cell.]**"
   ]
  },
  {
@@ -922,7 +941,7 @@
   },
   "outputs": [],
   "source": [
-    "df1.plot.scatter?"
+    "df_brahms.plot.scatter?"
   ]
  },
  {
@@ -940,7 +959,7 @@
   },
   "outputs": [],
   "source": [
-    "df2.plot.scatter?"
+    "df_busta.plot.scatter?"
   ]
  },
  {
@@ -958,7 +977,7 @@
   },
   "outputs": [],
   "source": [
-    "df1[0].plot.hist()"
+    "df_brahms[0].plot.hist()"
   ]
  },
  {
@@ -969,7 +988,7 @@
   },
   "outputs": [],
   "source": [
-    "df1[11].plot.hist()"
+    "df_busta[11].plot.hist()"
   ]
  },
  {
@@ -999,6 +1018,13 @@
   "source": [
    "Create a new genre classifier by repeating the steps above, but this time use different features. Consult the [librosa documentation on feature extraction](http://librosa.github.io/librosa/feature.html) for different choices of features. Which features work well? not well?"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[&larr; Back to Index](index.html)"
+   ]
  }
 ],
 "metadata": {