In [1]:
%matplotlib inline
import seaborn
import numpy, scipy, matplotlib.pyplot as plt, pandas, librosa

Using Audio in IPython

Audio Libraries

We will mainly use three libraries for audio acquisition and playback:

1. IPython.display.Audio

Introduced in IPython 2.0, IPython.display.Audio lets you play audio directly in an IPython notebook.

2. librosa

librosa is a Python package for music and audio processing by Brian McFee. A large portion was ported from Dan Ellis's Matlab audio processing examples.

3. essentia.standard

Essentia is deprecated in the MIR workshop.

Essentia is an open-source library for audio analysis and music information retrieval from the Music Technology Group at Universitat Pompeu Fabra. Although Essentia is written in C++, we will use the Python bindings for Essentia.

Retrieving Audio

To download a file onto your local machine (or Vagrant box) in Python 2, you can use urllib.urlretrieve (in Python 3, the same function lives at urllib.request.urlretrieve):

In [2]:
import urllib
urllib.urlretrieve(
    'http://audio.musicinformationretrieval.com/simpleLoop.wav', 
    filename='simpleLoop.wav'
)
Out[2]:
('simpleLoop.wav', <httplib.HTTPMessage instance at 0x115fb2d40>)
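
When a notebook is re-run often, it can help to skip the download if the file is already present. The following is a minimal sketch, not part of the workshop material: download_if_missing is a hypothetical helper name, and the try/except import is only there so the same code runs on Python 2 and Python 3.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve          # Python 2


def download_if_missing(url, filename):
    """Download url to filename unless that file already exists.

    Hypothetical helper (not part of urllib). Returns the local filename.
    """
    if not os.path.exists(filename):
        urlretrieve(url, filename=filename)
    return filename


# Usage (needs network access the first time it is called):
# download_if_missing(
#     'http://audio.musicinformationretrieval.com/simpleLoop.wav',
#     'simpleLoop.wav'
# )
```

The existence check only tests for the file name; it does not verify that an earlier download completed.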

To check that the file downloaded successfully, list the files in the working directory:

In [3]:
%ls *.wav
125_bounce.wav             bach_s2_m1_perlman_02.wav  conga_groove.wav           prelude_cmaj_10s.wav
58bpm.wav                  bach_s2_m1_perlman_03.wav  noise1.wav                 simpleLoop.wav
bach_s2_m1_perlman_00.wav  bach_s2_m1_perlman_04.wav  noise2.wav                 simple_loop.wav
bach_s2_m1_perlman_01.wav  c_strum.wav                prelude_cmaj.wav           zigeunerweisen.wav

If you only want to listen to, and not manipulate, a remote audio file, use IPython.display.Audio instead. (See Playing Audio.)

Reading Audio

librosa.load

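librosa.load returns both the audio array and the sample rate (i.e. sampling frequency). By default it converts the signal to mono and resamples it to 22050 Hz; pass sr=None to keep the file's native sample rate: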

In [4]:
x, fs = librosa.load('simpleLoop.wav')
print(x.shape)
print(fs)
(66150,)
22050
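
The two printed values are enough to recover the duration of the clip: the number of samples divided by the sample rate. A small sketch of that arithmetic, using the numbers from the output above (sample_to_time is a hypothetical helper name):

```python
n_samples = 66150   # x.shape[0] from the output above
fs = 22050          # sample rate returned by librosa.load, in Hz

# Duration in seconds = number of samples / samples per second.
duration = n_samples / float(fs)

# Time between two consecutive samples, in seconds.
sample_period = 1.0 / fs


def sample_to_time(n, fs):
    """Convert a sample index to a time in seconds (hypothetical helper)."""
    return n / float(fs)


print(duration)
print(sample_to_time(22050, fs))
```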

Plot the audio array:

In [5]:
plt.plot(x)
Out[5]:
[<matplotlib.lines.Line2D at 0x116294050>]

essentia.standard.MonoLoader

MonoLoader reads an audio file and, if necessary, downmixes it to a single channel, which is the usual case in this workshop. It also resamples the audio to a sampling frequency of your choice (default: 44100 Hz):

In [6]:
# Essentia is deprecated in the MIR workshop.
# from essentia.standard import MonoLoader
# audio = MonoLoader(filename='simpleLoop.wav')()
# audio.shape
# N = len(audio)
# t = numpy.arange(0, N)/44100.0
# plt.plot(t, audio)
# plt.xlabel('Time (seconds)')

For more control over the audio acquisition process, you may want to use AudioLoader instead.

Playing Audio

IPython.display.Audio

Using IPython.display.Audio, you can play a local audio file or a remote audio file:

In [7]:
from IPython.display import Audio
# load a remote WAV file
Audio('https://ccrma.stanford.edu/workshops/mir2014/audio/CongaGroove-mono.wav')
Out[7]:
In [8]:
# load a local WAV file
Audio('simpleLoop.wav')  
Out[8]:

Audio can also accept a NumPy array:

In [9]:
fs = 44100 # sampling frequency
T = 1.5    # seconds
t = numpy.linspace(0, T, int(T*fs), endpoint=False) # time variable
x = numpy.sin(2*numpy.pi*440*t)                # pure sine wave at 440 Hz

# load a NumPy array
Audio(x, rate=fs)
Out[9]:
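
Because Audio accepts any NumPy array, synthesized signals can be combined before playback. The sketch below wraps the sine generation from the cell above in a function and mixes two tones; make_tone is a hypothetical helper name, and the 660 Hz second tone (a 3:2 ratio above 440 Hz) is chosen only for illustration.

```python
import numpy


def make_tone(freq, duration, fs, amplitude=1.0):
    """Return a sine wave as a 1-D NumPy array (hypothetical helper)."""
    t = numpy.linspace(0, duration, int(duration * fs), endpoint=False)
    return amplitude * numpy.sin(2 * numpy.pi * freq * t)


fs = 44100  # sampling frequency in Hz

# Mix a 440 Hz tone with a 660 Hz tone (frequency ratio 3:2).
mix = make_tone(440, 1.5, fs) + make_tone(660, 1.5, fs)

# The sum of two unit-amplitude sines can reach magnitude 2,
# so rescale it so that the peak magnitude is 1.0.
mix = mix / numpy.max(numpy.abs(mix))

# In a notebook, play it with: Audio(mix, rate=fs)
```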

SoX

To play or record audio from the command line, we recommend SoX (included in the stanford-mir Vagrant box).

$ rec test.wav

$ play test.wav

Visualizing Audio

plt.plot is the simplest way to display a time-domain signal:

In [10]:
T = 0.001    # seconds
fs = 44100   # sampling frequency
t = numpy.linspace(0, T, int(T*fs), endpoint=False) # time variable
x = numpy.sin(2*numpy.pi*3000*t)

# Plot a sine wave
plt.plot(t, x)
plt.xlabel('Time (seconds)')
Out[10]:
<matplotlib.text.Text at 0x1166e1e90>
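
It can be useful to check how many samples and cycles such a plot contains before reading it. A short sketch of the arithmetic for the values used above; the variable names n_samples, n_cycles, samples_per_cycle, and nyquist are introduced here for illustration:

```python
T = 0.001    # duration in seconds
fs = 44100   # sampling frequency in Hz
f = 3000     # frequency of the sine wave in Hz

# Number of samples in the plotted signal.
n_samples = int(T * fs)

# Number of full cycles of the sine wave that fit in T seconds.
n_cycles = f * T

# Number of samples that describe one cycle.
samples_per_cycle = fs / float(f)

# Highest frequency representable at this sampling frequency.
nyquist = fs / 2.0

print(n_samples)
print(n_cycles)
print(samples_per_cycle)
```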

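specgram is a Matplotlib function that computes and displays a spectrogram. NFFT is the window length in samples, Fs is the sampling frequency in Hz, and noverlap is the number of samples shared by consecutive windows.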

In [11]:
# Compute and plot a spectrogram
S, freqs, bins, im = plt.specgram(x, NFFT=1024, Fs=fs, noverlap=512)
plt.xlabel('Time (seconds)')
plt.ylabel('Frequency (Hz)')
Out[11]:
<matplotlib.text.Text at 0x117232790>
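
The window length and overlap fix the time and frequency resolution of the spectrogram. The sketch below works through that arithmetic for the parameter values used above. It is plain Python rather than a Matplotlib call; n_full_frames is a hypothetical helper that counts only complete windows, so it says nothing about how a plotting library treats a signal shorter than one window (such as the 44-sample sine above).

```python
fs = 44100       # sampling frequency in Hz
NFFT = 1024      # window length in samples
noverlap = 512   # samples shared by consecutive windows

# Step between the starts of consecutive windows, in samples.
hop = NFFT - noverlap

# Spacing between adjacent frequency bins, in Hz.
freq_resolution = fs / float(NFFT)

# Number of bins in a one-sided spectrum of a real-valued signal.
n_freq_bins = NFFT // 2 + 1

# Duration of one window and of one hop, in seconds.
window_duration = NFFT / float(fs)
hop_duration = hop / float(fs)


def n_full_frames(n_samples):
    """Count the complete windows that fit in n_samples (hypothetical helper)."""
    if n_samples < NFFT:
        return 0
    return 1 + (n_samples - NFFT) // hop


print(freq_resolution)
print(n_full_frames(66150))  # a 1.5-second signal at 44100 Hz
```

A longer window narrows the bin spacing but lengthens each window in time, which is the usual trade-off when choosing NFFT.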

Writing Audio

librosa.output.write_wav

librosa.output.write_wav saves a NumPy array to a WAV file. (The librosa.output module was removed in librosa 0.8.0; the librosa documentation recommends soundfile.write for newer versions.)

In [12]:
noise = 0.1*numpy.random.randn(44100)

# Write an array to a wav file
librosa.output.write_wav('noise2.wav', noise, 44100)
%ls *.wav
125_bounce.wav             bach_s2_m1_perlman_02.wav  conga_groove.wav           prelude_cmaj_10s.wav
58bpm.wav                  bach_s2_m1_perlman_03.wav  noise1.wav                 simpleLoop.wav
bach_s2_m1_perlman_00.wav  bach_s2_m1_perlman_04.wav  noise2.wav                 simple_loop.wav
bach_s2_m1_perlman_01.wav  c_strum.wav                prelude_cmaj.wav           zigeunerweisen.wav
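
If librosa.output is not available in your installation, a WAV file can also be written with Python's standard wave module. This is a minimal sketch under stated assumptions, not the workshop's method: write_wav_16bit and the file name noise_stdlib.wav are hypothetical, the signal is assumed to be mono with values in [-1, 1], and the samples are stored as 16-bit integers rather than floats.

```python
import random
import struct
import wave


def write_wav_16bit(filename, samples, fs):
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range values
        # Scale to the 16-bit range and pack as little-endian signed short.
        frames += struct.pack('<h', int(round(s * 32767)))
    w = wave.open(filename, 'wb')
    try:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 2 bytes = 16 bits per sample
        w.setframerate(fs)
        w.writeframes(bytes(frames))
    finally:
        w.close()


# One second of Gaussian noise with standard deviation 0.1.
random.seed(0)
noise = [random.gauss(0.0, 0.1) for _ in range(44100)]
write_wav_16bit('noise_stdlib.wav', noise, 44100)
```

The resulting file can be played with Audio('noise_stdlib.wav') in the same way as the other local files.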