Commit 07be62ec authored by Leigh Smith

First version of python corpus retrieval and feature extraction

parent 4f1fe9d1
{
"metadata": {
"name": "",
"signature": "sha256:98b053589d06e625718d1e1c0a2b65a02f9098813c4b80e061bb5732bb95b293"
"signature": "sha256:7ed94cc2b18891285f61c97615ead2cb2f265d2e6a8d305e44280affbff5a673"
},
"nbformat": 3,
"nbformat_minor": 0,
......@@ -19,7 +19,7 @@
"\n",
"As in Lab 1, extract features from each training sample in the kick and snare drum directories.\n",
"\n",
"1. Train a K-NN model using the kick and snare drum samples."
"Train a K-NN model using the kick and snare drum samples:"
]
},
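{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sketch below is only an illustration of scikit-learn's KNeighborsClassifier interface, not the lab solution: featuresKick and featuresSnare are placeholder arrays standing in for the features you will extract yourself."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"\n",
"# Placeholder feature arrays (one row per audio file, one column per feature);\n",
"# in the lab these will be the features extracted from the kick and snare samples.\n",
"featuresKick = np.array([[0.10, 140.0], [0.12, 150.0]])\n",
"featuresSnare = np.array([[0.55, 1900.0], [0.59, 2280.0]])\n",
"\n",
"# Stack the two classes and label them: 0 = kick, 1 = snare.\n",
"features = np.vstack((featuresKick, featuresSnare))\n",
"labels = np.concatenate((np.zeros(len(featuresKick)), np.ones(len(featuresSnare))))\n",
"\n",
"# Fit a 1-nearest-neighbour classifier and classify a new feature vector.\n",
"model = KNeighborsClassifier(n_neighbors = 1)\n",
"model.fit(features, labels)\n",
"print model.predict([[0.57, 2000.0]])"
],
"language": "python",
"metadata": {},
"outputs": []
},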
{
......@@ -59,14 +59,6 @@
"\n",
"Good luck!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
......
{
"metadata": {
"name": "",
"signature": "sha256:c5ceab4dd15cd4c672761897b42ca810c7e32f1ca641cc1b0ee141dc0df0a57f"
"signature": "sha256:efc8cbbea3ba5f9eea210026cb9f5e9a85fa6b79bca7393a29e27e7810eab0db"
},
"nbformat": 3,
"nbformat_minor": 0,
......@@ -15,38 +15,107 @@
"Section 2: Spectral Features & k-NN\n",
"------------------------------------\n",
"\n",
"My first audio classifier: introducing K-NN! We can now appreciate why we need additional intelligence in our systems - heuristics can't very far in the world of complex audio signals. We'll be using Netlab's implementation of the k-NN for our work here. It proves be a straight-forward and easy to use implementation. The steps and skills of working with one classifier will scale nicely to working with other, more complex classifiers. \n",
"My first audio classifier: introducing K-NN! We can now appreciate why we need additional intelligence in our systems - heuristics can't go very far in the world of complex audio signals. We'll be using scikit.learn's implementation of the k-NN for our work here. It proves be a straight-forward and easy to use implementation. The steps and skills of working with one classifier will scale nicely to working with other, more complex classifiers. \n",
"\n",
"We're also going to be using the new features in our arsenal: cherishing those \"spectral moments\" (centroid, bandwidth, skewness, kurtosis) and also examining other spectral statistics. \n",
" \n",
"### TRAINING DATA\n",
"\n",
"First off, we want to analyze and feature extract a small collection of audio samples - storing their feature data as our \"training data\". The below commands read all of the .wav files in a directory into a structure, snareFileList. \n",
"First off, we want to analyze and feature extract a small collection of audio samples - storing their feature data as our \"training data\". The commands below read all of the drum example .wav files from the MIR web site into an array, snareFileList. \n",
"\n",
"1. Use these commands to read in a list of filenames (samples) in a directory, replacing the path with the actual directory that the audio \\ drum samples are stored in.\n",
"\n",
" snareDirectory = ['/usr/ccrma/courses/mir2013/audio/drum samples/snares/'];\n",
" snareFileList = getFileNames(snareDirectory ,'wav')\n",
"\n",
" kickDirectory = ['/usr/ccrma/courses/mir2013/audio/drum samples/kicks/'];\n",
" kickFileList = getFileNames(kickDirectory ,'wav')\n",
"\n",
"2. To access the filenames contained in the cell array, use the brackets { } to get to the element that you want to access. \n",
"\n",
" For example, to access the text file name of the 1st file in the list, you would type:\n",
"\n",
" snareFileList{1}\n",
"First we define a function to retrieve a list of URLs from a text file."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib2\n",
"\n",
"def process_corpus(corpus_URL):\n",
" \"\"\"Read a list of files to process from the text file at corpusURL. Return a list of URLs\"\"\" \n",
" # Open and read each line\n",
" url_list_text_data = urllib2.urlopen(corpus_URL) # it's a file like object and works just like a file\n",
" for file_URL in url_list_text_data: # files are iterable\n",
" yield file_URL.rstrip()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use these commands to read in a list of filenames (samples) in a directory, replacing the path with the actual directory that the audio / drum samples are stored in."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"snares_URL = \"https://ccrma.stanford.edu/workshops/mir2014/SnareCorpus.txt\"\n",
"snare_file_list = [audio_file_URL for audio_file_URL in process_corpus(snares_URL)]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"kicks_URL = \"https://ccrma.stanford.edu/workshops/mir2014/KickCorpus.txt\"\n",
"kick_file_list = [audio_file_URL for audio_file_URL in process_corpus(kicks_URL)]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 12
},
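{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check (an optional step, not part of the original lab), we can confirm how many audio file URLs were retrieved for each corpus."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Count the URLs retrieved for each corpus.\n",
"print len(kick_file_list), len(snare_file_list)"
],
"language": "python",
"metadata": {},
"outputs": []
},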
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To access the filenames contained in the array, use the square brackets [ ] to get to the element that you want to access. \n",
"\n",
" When we feature extract a sample collection, we need to sequentially access audio files, segment them (or not), and feature extract them. Loading a lot of audio files into memory is not always a feasible or desirable operation, so you will create a loop which loads an audio file, feature extracts it, and closes the audio file. Note that the only information that we retain in memory are the features that are extracted.\n",
"For example, to access the text URL file name of the first file in the list, you would type:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"snare_file_list[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"'https://ccrma.stanford.edu/workshops/mir2014/audio/drum%20samples/snares/SNARE_01_01.WAV'"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When we feature extract a sample collection, we need to sequentially access audio files, segment them (or not), and feature extract them. Loading a lot of audio files into memory is not always a feasible or desirable operation, so you will create a loop which loads an audio file, feature extracts it, and closes the audio file. Note that the only information that we retain in memory are the features that are extracted.\n",
"\n",
"3. Create a loop which reads in an audio file, extracts the zero crossing rate, and some spectral statistics. The feature information for each audio file (the \"feature vector\") should be stored as a feature array, with columns being the features and rows for each file. \n",
"1. Create a loop which reads in an audio file, extracts the zero crossing rate, and some spectral statistics. You can use the \"in\" operator to retrieve each audio file URL from process_corpus(), as used above. The feature information for each audio file (the \"feature vector\") should be stored as a feature array, with columns being the features and rows for each file. \n",
" \n",
" Or in Matlab, for example:\n",
" for example:\n",
"\n",
" featuresSnare =\n",
"\n",
" 1.0e+003 *\n",
" \n",
" 0.5730 1.9183 2.9713 0.0004 0.0002\n",
" 0.4750 1.4834 2.4463 0.0004 0.0012\n",
" 0.5900 2.2857 3.1788 0.0003 0.0041\n",
......@@ -58,23 +127,84 @@
" 0.5490 2.0137 3.0342 0.0004 0.0016\n",
" 0.5900 2.2857 3.1788 0.0003 0.0012\n",
" \n",
" In your loop, here's how to read in your wav files, using a structure of file names:\n",
" [x,fs]=wavread([snareDirectory snareFileList{i}]); %note the use of brackets for snareFileList\n",
" \n",
" Here's an example of how to feature extract for the current audio file..\n",
" frameSize = 0.100 * fs; % 100ms\n",
" currentFrame = x(1:frameSize)\n",
" featuresSnare(i,1) = zcr(currentFrame);\n",
" [centroid, bandwidth, skew, kurtosis]=spectralMoments(currentFrame,fs,8192)\n",
" featuresSnare(i,2:5) = [centroid, bandwidth, skew, kurtosis];\n",
" \n",
" Within your loop, here's a reminder how to read in your wav files, using an array of audio file URLs:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from essentia.standard import MonoLoader\n",
"file_index = 0\n",
"audio = MonoLoader(filename = snare_file_list[file_index])()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "RuntimeError",
"evalue": "Error while configuring MonoLoader: AudioLoader: Could not open file \"https://ccrma.stanford.edu/workshops/mir2014/audio/drum%20samples/snares/SNARE_01_01.WAV\"",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-17-fdc46de435f1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0messentia\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstandard\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mMonoLoader\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mfile_index\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0maudio\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mMonoLoader\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfilename\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msnare_file_list\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mfile_index\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m/usr/local/lib/python2.7/site-packages/essentia/standard.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, **kwargs)\u001b[0m\n\u001b[1;32m 41\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 42\u001b[0m \u001b[0;31m# configure the algorithm\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 43\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconfigure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 44\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 45\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mconfigure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python2.7/site-packages/essentia/standard.py\u001b[0m in \u001b[0;36mconfigure\u001b[0;34m(self, **kwargs)\u001b[0m\n\u001b[1;32m 56\u001b[0m \u001b[0mkwargs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mconvertedVal\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 57\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 58\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__configure__\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 59\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 60\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mcompute\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mRuntimeError\u001b[0m: Error while configuring MonoLoader: AudioLoader: Could not open file \"https://ccrma.stanford.edu/workshops/mir2014/audio/drum%20samples/snares/SNARE_01_01.WAV\""
]
}
],
"prompt_number": 17
},
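{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the error above shows, essentia's MonoLoader cannot open a remote URL directly. One possible workaround (a sketch, not part of the original lab text) is to first download each file to a local temporary path, for example with urllib.urlretrieve, and pass that local filename to MonoLoader."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib\n",
"from essentia.standard import MonoLoader\n",
"\n",
"# Workaround sketch: download the remote .wav file to a local temporary file,\n",
"# then load that local copy with MonoLoader.\n",
"local_filename, headers = urllib.urlretrieve(snare_file_list[file_index])\n",
"audio = MonoLoader(filename = local_filename)()"
],
"language": "python",
"metadata": {},
"outputs": []
},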
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an example of how to feature extract the first from for the current audio file..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
" frameSize = 0.100 * sample_rate # 100ms\n",
" currentFrame = audio[0 : frameSize]\n",
" featuresSnare[i, 0] = zcr(currentFrame)\n",
" [centroid, bandwidth, skew, kurtosis] = spectralMoments(currentFrame, sample_rate, 8192)\n",
" featuresSnare[i, 1:4] = [centroid, bandwidth, skew, kurtosis]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'sample_rate' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-18-4b1c17fe2ac8>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mframeSize\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0.100\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0msample_rate\u001b[0m \u001b[0;31m# 100ms\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mcurrentFrame\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0maudio\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m \u001b[0;34m:\u001b[0m \u001b[0mframeSize\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mfeaturesSnare\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mzcr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcurrentFrame\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mcentroid\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbandwidth\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mskew\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkurtosis\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mspectralMoments\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcurrentFrame\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msample_rate\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m8192\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mfeaturesSnare\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mcentroid\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbandwidth\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mskew\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkurtosis\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNameError\u001b[0m: name 'sample_rate' is not defined"
]
}
],
"prompt_number": 18
},
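{
"cell_type": "markdown",
"metadata": {},
"source": [
"To tie the pieces above together, here is a minimal sketch of one possible feature-extraction loop for the snare corpus. It assumes the download workaround shown earlier, and it assumes hypothetical helper functions zcr() and spectralMoments() (as named in the snippet above) that you will implement or replace with your own feature extractors."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib\n",
"import numpy as np\n",
"from essentia.standard import MonoLoader\n",
"\n",
"# Sketch only: assumes zcr() and spectralMoments() are defined elsewhere.\n",
"sample_rate = 44100  # MonoLoader resamples to 44100 Hz by default\n",
"featuresSnare = np.zeros((len(snare_file_list), 5))\n",
"\n",
"for i, file_URL in enumerate(snare_file_list):\n",
"    local_filename, headers = urllib.urlretrieve(file_URL)  # download, then load locally\n",
"    audio = MonoLoader(filename = local_filename)()\n",
"    frameSize = int(0.100 * sample_rate)  # analyse the first 100 ms\n",
"    currentFrame = audio[0 : frameSize]\n",
"    featuresSnare[i, 0] = zcr(currentFrame)\n",
"    centroid, bandwidth, skew, kurtosis = spectralMoments(currentFrame, sample_rate, 8192)\n",
"    featuresSnare[i, 1:5] = [centroid, bandwidth, skew, kurtosis]"
],
"language": "python",
"metadata": {},
"outputs": []
},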
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4. First, extract all of the feature data for the kick drums and store it in a feature array. (For my example, above, I'd put it in \"featuresKick\")\n",
"\n",
"5. Next, extract all of the feature data for the snares, storing them in a different array. \n",
"Again, the kick and snare features should be separated in two different arrays!\n",
" \n",
" OK, no more help. The rest is up to you! \n",
"\n",
"OK, no more help. The rest is up to you!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Building Models\n",
"\n",
"1. Examine the feature array for the various snare samples. What do you notice? \n",
......
{
"metadata": {
"name": "",
"signature": "sha256:eb40e34116f231d0e2616b21dbd3eaac2494c326f9b61a67dc99f2bd3b9cb7ba"
"signature": "sha256:6fdc1ab1e764720916a430a54d66054f9d7b5acbd257f42c0315df27b32abc92"
},
"nbformat": 3,
"nbformat_minor": 0,
......@@ -126,14 +126,6 @@
"\n",
" Which output contains more noise? Why?\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
......