"import numpy, scipy, scipy.spatial, matplotlib.pyplot as plt\n",
"plt.rcParams['figure.figsize'] = (14, 2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[← Back to Index](index.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dynamic Time Warping "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(Credit for some of the ideas in this notebook come from [FastDTW](https://github.com/slaypni/fastdtw/blob/master/fastdtw.py).)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dynamic time warping (DTW) is an algorithm used to align two sequences of similar content but possibly different lengths. For example, you might want to align two different performances of the same musical work. These two signals, $x$ and $y$, may have similar sequences of chord progressions and instrumentations, but there may be timing deviations between the two.\n",
"\n",
"Given two sequences, $x[n], n \\in \\{0, ..., N_x - 1\\}$, and $y[n], n \\in \\{0, ..., N_y - 1\\}$, DTW produces a set of index pairs $\\{ (i, j) ... \\}$ such that $x[i]$ and $y[j]$ are similar."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create two arrays, $x$ and $y$, of lengths $N_x$ and $N_y$, respectively."
"In this simple example, there is only one value or \"feature\" at each time index. However, in practice, you can use sequences of *vectors*, e.g. spectrograms, chromagrams, or MFCC-grams."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But for now, we will add an empty dimension to the array because DTW expects a vector at each time index -- in this simple example, a \"one-dimensional vector\", i.e. a scalar."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(8, 1)\n",
"(9, 1)\n"
]
}
],
"source": [
"x = scipy.expand_dims(x, axis=1)\n",
"y = scipy.expand_dims(y, axis=1)\n",
"print x.shape\n",
"print y.shape"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[3]\n",
" [3]\n",
" [1]\n",
" [4]\n",
" [6]\n",
" [1]\n",
" [5]\n",
" [5]]\n"
]
}
],
"source": [
"print x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1: Compute pairwise distances"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compute a matrix of pairwise distances between elements of $x$ and $y$."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(8, 9)\n"
]
}
],
"source": [
"S = scipy.spatial.distance_matrix(x, y)\n",
"print S.shape"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1. 1. 2. 0. 0. 1. 2. 1. 2.]\n",
" [ 1. 1. 2. 0. 0. 1. 2. 1. 2.]\n",
" [ 3. 1. 0. 2. 2. 3. 0. 3. 4.]\n",
" [ 0. 2. 3. 1. 1. 0. 3. 0. 1.]\n",
" [ 2. 4. 5. 3. 3. 2. 5. 2. 1.]\n",
" [ 3. 1. 0. 2. 2. 3. 0. 3. 4.]\n",
" [ 1. 3. 4. 2. 2. 1. 4. 1. 0.]\n",
" [ 1. 3. 4. 2. 2. 1. 4. 1. 0.]]\n"
]
}
],
"source": [
"print S"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Element $S[i, j]$ indicates the distance between $x[i]$ and $y[j]$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The time complexity of this operation is $O(N_x N_y)$. The space complexity is $O(N_x N_y)$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2: Compute cumulative distances"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The basic idea of DTW is to find a path from the top left to the bottom right of matrix $S$ such that the sum of distances along the path is minimized. After all, you want to find pairs of time indices $(i, j)$ such that $x[i]$ and $y[j]$ is always low."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will compute a matrix of cumulative path distances, $D$, such that for any pair of time indices $(i, j)$, $D[i, j]$ is the sum of the best path from $(0, 0)$ to $(i, j)$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's compute the sum of the best path from $(0, 0)$ to $(N_x-1, 0)$."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 2. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 5. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 5. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 7. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 10. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 11. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 12. 0. 0. 0. 0. 0. 0. 0. 0.]]\n"
]
}
],
"source": [
"D = scipy.zeros_like(S)\n",
"D[0, 0] = S[0, 0]\n",
"for i in range(1, Nx):\n",
" D[i, 0] = D[i-1, 0] + S[i, 0]\n",
"print D"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, compute the sum of the best path from $(0, 0)$ to $(0, N_y)$."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1. 2. 4. 4. 4. 5. 7. 8. 10.]\n",
" [ 2. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 5. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 5. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 7. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 10. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 11. 0. 0. 0. 0. 0. 0. 0. 0.]\n",
" [ 12. 0. 0. 0. 0. 0. 0. 0. 0.]]\n"
]
}
],
"source": [
"for j in range(1, len(y)):\n",
" D[0, j] = D[0, j-1] + S[0, j]\n",
"print D"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, compute the sums of the best paths to any other pair of coordinates, $(i, j)$.\n",
"\n",
"The path constraint is that, at $(i, j)$, the valid steps are $(i+1, j)$, $(i, j+1)$, and $(i+1, j+1)$. In other words, the alignment always moves forward in time for at least one of the signals. It never goes backward in time. (You can adapt this to your application.)"
"Here is a function that combines all the steps above. (Note that this is a naive version of the DTW algorithm. There are many optimizations that can reduce the time and space complexity below $O(N_x N_y)$.)"