Commit 05569141 authored by Steve Tjoa

real time spectrogram

parent 598cd6da
......@@ -12027,6 +12027,7 @@ div#notebook {
<ol>
<li><a href="pca.html">Principal Component Analysis</a> (<a href="pca.ipynb">ipynb</a>)</li>
<li><a href="nmf.html">Nonnegative Matrix Factorization</a> (<a href="nmf.ipynb">ipynb</a>)</li>
<li><a href="nmf_audio_mosaic.html">NMF Audio Mosaicing</a> (<a href="nmf_audio_mosaic.ipynb">ipynb</a>)</li>
<li><a href="hpss.html">Harmonic-Percussive Source Separation</a> (<a href="hpss.ipynb">ipynb</a>)</li>
</ol>
......@@ -12046,6 +12047,7 @@ div#notebook {
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<ol>
<li><a href="realtime_spectrogram.html">Real-time Spectrogram</a> (<a href="realtime_spectrogram.ipynb">ipynb</a>)</li>
<li><a href="thx_logo_theme.html">THX Logo Theme</a> (<a href="thx_logo_theme.ipynb">ipynb</a>)</li>
</ol>
......
......@@ -204,6 +204,7 @@
"source": [
"1. [Principal Component Analysis](pca.html) ([ipynb](pca.ipynb))\n",
"1. [Nonnegative Matrix Factorization](nmf.html) ([ipynb](nmf.ipynb))\n",
"1. [NMF Audio Mosaicing](nmf_audio_mosaic.html) ([ipynb](nmf_audio_mosaic.ipynb))\n",
"1. [Harmonic-Percussive Source Separation](hpss.html) ([ipynb](hpss.ipynb))"
]
},
......@@ -218,6 +219,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"1. [Real-time Spectrogram](realtime_spectrogram.html) ([ipynb](realtime_spectrogram.ipynb))\n",
"1. [THX Logo Theme](thx_logo_theme.html) ([ipynb](thx_logo_theme.ipynb))"
]
}
......
This source diff could not be displayed because it is too large. You can view the blob instead.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[&larr; Back to Index](index.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Real-time Spectrogram"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"hide_input"
]
},
"outputs": [
{
"data": {
"image/jpeg": "\n",
"text/html": [
"\n",
" <iframe\n",
" width=\"600\"\n",
" height=\"360\"\n",
" src=\"https://www.youtube.com/embed/ydu12NfZd90\"\n",
" frameborder=\"0\"\n",
" allowfullscreen\n",
" ></iframe>\n",
" "
],
"text/plain": [
"<IPython.lib.display.YouTubeVideo at 0x104fbc908>"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import IPython.display as ipd\n",
"ipd.YouTubeVideo('ydu12NfZd90', width=600, height=360)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is how you can create a real-time spectrogram in your terminal using [PyAudio](https://people.csail.mit.edu/hubert/pyaudio/)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the example in action, run the script in this repo, `realtime_spectrogram.py`:\n",
"\n",
" python3 realtime_spectrogram.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The basic idea is simple. For every new audio buffer,\n",
"\n",
"1. Take an FFT, `x_fft`, of the audio buffer.\n",
"2. Compute a `melspectrum` from the `x_fft`.\n",
"3. Print a string, `s`, where `s[i]` is `'*'` wherever `melspectrum[i]` is above a threshold."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From here, you can extend this basic example to do more sophisticated real-time processing, for example with machine learning models."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[&larr; Back to Index](index.html)"
]
}
],
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
"""
musicinformationretrieval.com/realtime_spectrogram.py
PyAudio example: display a live spectrogram in the terminal.
PyAudio example: display a live log-spectrogram in the terminal.
For more examples using PyAudio:
https://github.com/mwickert/scikit-dsp-comm/blob/master/sk_dsp_comm/pyaudio_helper.py
......@@ -11,30 +11,46 @@ import numpy
import pyaudio
import time
# Define global variables.
CHANNELS = 1
RATE = 44100
FRAMES_PER_BUFFER = 1000
N_FFT = 4096
SCREEN_WIDTH = 178
ENERGY_THRESHOLD = 0.4
BIN_LO = N_FFT*librosa.note_to_hz('C2')/RATE
BIN_HI = N_FFT*librosa.note_to_hz('C7')/RATE
FREQ_BINS = numpy.geomspace(BIN_LO, BIN_HI, SCREEN_WIDTH).astype('int')
# Choose the frequency range of your log-spectrogram.
F_LO = librosa.note_to_hz('C2')
F_HI = librosa.note_to_hz('C9')
M = librosa.filters.mel(sr=RATE, n_fft=N_FFT, n_mels=SCREEN_WIDTH, fmin=F_LO, fmax=F_HI)
p = pyaudio.PyAudio()
def generate_string_from_audio(audio_data):
    """
    This function takes one audio buffer as a numpy array and returns a
    string to be printed to the terminal.
    """
    # Compute real FFT.
    x_fft = numpy.fft.rfft(audio_data, n=N_FFT)
    X = numpy.log10(1 + 0.1*abs(x_fft))
    # Compute mel spectrum.
    melspectrum = M.dot(abs(x_fft))
    # Initialize output characters to display.
    char_list = [' ']*SCREEN_WIDTH
    for i in range(len(FREQ_BINS)):
        b = FREQ_BINS[i]
        if X[b] > 0.3:
    for i in range(SCREEN_WIDTH):
        # If there is energy in this frequency bin, display an asterisk.
        if melspectrum[i] > ENERGY_THRESHOLD:
            char_list[i] = '*'
        # Draw frequency axis guidelines.
        elif i % 30 == 29:
            char_list[i] = '|'
    # Return string.
    return ''.join(char_list)
def callback(in_data, frame_count, time_info, status):
......@@ -45,8 +61,8 @@ def callback(in_data, frame_count, time_info, status):
stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                output=True,
                input=True,   # Do record input.
                output=False, # Do not play back output.
                frames_per_buffer=FRAMES_PER_BUFFER,
                stream_callback=callback)
......
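The three steps listed in the notebook (FFT, band spectrum, thresholded string) can be tried without a microphone. What follows is a minimal sketch under stated assumptions, not the script from this commit: `make_band_matrix` is a hypothetical stand-in for `librosa.filters.mel` that simply averages FFT bins in log-spaced bands, `SCREEN_WIDTH` is narrowed to 60 for readability, and the 440 Hz test tone and its amplitude are invented for illustration.

```python
import numpy

RATE = 44100
N_FFT = 4096
SCREEN_WIDTH = 60        # assumption: narrower than the 178 used in the script
ENERGY_THRESHOLD = 0.4


def make_band_matrix(f_lo, f_hi, n_bands, n_fft=N_FFT, rate=RATE):
    """Hypothetical stand-in for a mel filterbank: each row averages the
    rFFT bins that fall inside one log-spaced frequency band."""
    freqs = numpy.fft.rfftfreq(n_fft, d=1.0 / rate)
    edges = numpy.geomspace(f_lo, f_hi, n_bands + 1)
    weights = numpy.zeros((n_bands, len(freqs)))
    for i in range(n_bands):
        in_band = (freqs >= edges[i]) & (freqs < edges[i + 1])
        if not in_band.any():
            # Band is narrower than one FFT bin: use the bin nearest its center.
            center = numpy.sqrt(edges[i] * edges[i + 1])
            in_band[numpy.argmin(numpy.abs(freqs - center))] = True
        weights[i, in_band] = 1.0 / in_band.sum()
    return weights


# Roughly C2 to C9, matching the range chosen in the script.
M = make_band_matrix(65.4, 8372.0, SCREEN_WIDTH)


def generate_string_from_audio(audio_data, threshold=ENERGY_THRESHOLD):
    # Step 1: real FFT of the buffer, zero-padded to N_FFT samples.
    x_fft = numpy.fft.rfft(audio_data, n=N_FFT)
    # Step 2: collapse the FFT magnitudes into SCREEN_WIDTH bands.
    spectrum = M.dot(abs(x_fft))
    # Step 3: one character per band, '*' where the band exceeds the threshold.
    return ''.join('*' if e > threshold else ' ' for e in spectrum)


# A quiet 440 Hz tone, one buffer long (1000 samples, as in FRAMES_PER_BUFFER).
t = numpy.arange(1000) / RATE
tone = 0.01 * numpy.sin(2 * numpy.pi * 440 * t)
print(repr(generate_string_from_audio(tone)))
```

Because the threshold is an absolute magnitude, the number of asterisks depends on input level: a louder tone leaks energy into neighboring bands and lights up more of the line, so `ENERGY_THRESHOLD` likely needs tuning for a given microphone.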