Creating Tempic Integrations

A polytempic accelerando

Early American Experimental Music

American composer Charles Ives (1874-1954) wrote many experimental pieces where he would selectively disregard certain 'rules' of music theory, and explore the sonic results. In an essay on his experimental music, Ives asked this question:

If you can play a tune in one key, why can't a feller, if he feels like, play one in two keys?

Ives' early polytonal experiments include pieces with a melody in one musical key and accompaniment in another key, and pieces with separate voices, each in separate keys. These experimental pieces were not published or performed until long after he wrote them, but Ives' ideas did appear in his less systematic compositions. More importantly his ideas influenced a generation of younger composers.

Among these younger composers was Henry Cowell (1897-1965), who in 1930 published New Musical Resources, a collection of largely unexplored musical ideas. The book extends the notion of multiple simultaneous musical keys (polytonal music) to multiple simultaneous tempos (polytempic music).

Can we make music with multiple simultaneous tempos? Certainly. But Cowell took the idea a step further. Assume we have multiple simultaneous tempos and these tempos are accelerating or decelerating relative to each other. In New Musical Resources, Cowell called this practice sliding tempo, and he noted that this would not be easy to do. In 1930, he writes:

For practical purposes, care would have to be exercised in the use of sliding tempo, in order to control the relation between tones in a sliding part with those in another part being played at the same time: a composer would have to know in other words, what tones in a part with rising tempo would be struck simultaneously with other tones in a part of, say, fixed tempo, and this from considerations of harmony. There would usually be no absolute coincidence, but the tones which would be struck at approximately the same time could be calculated.

Phase Music

Can we calculate the times of coincidence like Cowell suggests? How? Can a musician even perform polytempic music with tempo accelerations or decelerations? Over the course of the 20th century, musicians explored this idea in what is now called phase music (wikipedia). One early example is Piano Phase (1967) by Steve Reich (wikipedia, vimeo).

Through experimentation, Steve Reich found that if the music is reasonably simple, two performers can make synchronized tempo adjustments relative to each other well enough to yield compelling results. In Piano Phase two pianos both play the same repeating 12-note musical pattern. One musician plays at a slightly faster rate than the other. It is possible for a musician to play this, but there is a limit to the complexity of the patterns that a musician can play and a limit to the number of simultaneously changing tempi a musician can comprehend. Piano Phase approaches the limit of what a musician can perform. Can we not use electronic sequencers or digital audio tools to synthesize complex phase music with many simultaneous tempic transitions?

Most phase music (including Piano Phase) layers identical copies of musical content playing at slightly different speeds. The layers begin in phase with each other and gradually drift apart. As the layers drift out of phase, our listening attention is drawn to variations caused by the shifting phase relationships between the layers. In the example below, the two layers start out playing together at 90 beats-per-minute (BPM); at \(t=0\) the top layer abruptly starts playing at 100 BPM.

Abrupt tempo change

If we wait long enough, eventually the layers will re-align and the whole pattern will start from the beginning again. But what if we want to control exactly when the layers re-synchronize? Can we specify precisely both the time and the phase of sync points without abrupt changes in tempo? How can we sequence patterns like the one below exactly? This is the central question of Tempic Integrations.

Continuous tempo acceleration from 90 BPM to 120 BPM

Tempic Integrations

If we want to specify the exact number of beats that elapse between the beginning of a tempo acceleration and the end of the acceleration, and we want to hear this precisely, we need a mathematical model for shaping our tempo acceleration curves. The goal is to compose and audition music where:

  • Swarms of an arbitrary number of simultaneous tempi coexist.
  • Each player within the swarm can continuously accelerate or decelerate individually, but also as a member of a cohesive whole.
  • Each musical line can converge and diverge at explicit points. At each point of convergence the phase of the meter within the tempo can be set.

We start by defining a single tempo transition. Consider the following example from the image above:

  • Assume we have 2 snare drum players. Both begin playing the same beat at 90 BPM in common time.
  • One performer gradually accelerates relative to the other. We want to define a continuous tempo curve such that one drummer accelerates to 120 BPM.
  • So far, we can easily accomplish this with a simple linear tempo acceleration. However, we want the tempo transition to complete exactly when both drummers are on a down-beat, so the combined effect is a 3 over 4 rhythmic pattern. Linear acceleration results in the transition completing at an arbitrary phase.
  • We want the accelerating drummer to reach the new tempo after exactly 20 beats.
  • We also want the acceleration to complete in exactly 16 beats of the original tempo, so that the drummer playing a constant tempo and the accelerating drummer are playing together.

We are interested in both the number of beats elapsed in the static tempo and in the changing tempo, as well as the absolute tempo. If we think of the number of beats elapsed as our position, and the tempo as our rate, we see how this resembles a physics problem. If we have a function that describes our tempo (or rate), we can integrate that function, and the result will tell us our number of beats elapsed (or position). Given the above considerations, our tempo curve is defined in terms of 5 constants:

  • Time \(t_0=0\), when the tempo transition begins
  • A known time, \(t_1\), when the tempo transition ends
  • A known starting tempo: \(\dot{x}_0\)
  • A known finishing tempo: \(\dot{x}_1\)
  • The number of beats elapsed in the changing tempo between \(t_0\) and \(t_1\): \(x_1\)

The tension of the tempo curve determines how many beats elapse during the transition period. The curve is fully determined by a starting acceleration \(a_0\) and a constant rate of change of acceleration \(a_1\), so we define the curve in terms of linearly changing acceleration. Using Newtonian notation we can describe our tempo acceleration as:

\begin{equation} \ddot{x}_1 = a_0 + a_1t_1 \end{equation}

Integrating linear acceleration (above) yields a quadratic velocity curve (below). The velocity curve describes the tempo (in beats per minute) with respect to time.

\begin{equation} \dot{x}_1 = \dot{x}_0 + a_0t_1 + \frac{a_1t_1^2}{2} \end{equation}

Integrating velocity (above) gives us a function describing position (the number of beats elapsed with respect to time, below).

\begin{equation} x_1 = x_0 + \dot{x}_0t_1 + \frac{a_0t_1^2}{2} + \frac{a_1t_1^3}{6} \end{equation}

With equations for \(\dot{x}_1\) and \(x_1\), we can solve for our two unknowns, \(a_0\) and \(a_1\). First we solve both equations for \(a_1\):

\begin{equation} a_1 = \frac{-2}{t_1^2}(\dot{x}_0-\dot{x}_1 + a_0t_1) = \frac{-6}{t_1^3}(\dot{x}_0t_1-x_1 + \frac{a_0t_1^2}{2}) \end{equation}

Assuming \(t_1 \neq 0\), we solve this system of equations for \(a_0\):

\begin{equation} a_0=\frac{6x_1-2t_1(\dot{x}_1+2\dot{x}_0)}{t_1^2} \end{equation}

Evaluating for \(a_0\) with our constants gives us our starting acceleration. Once we have \(a_0\), we can solve our velocity equation to find the value of \(a_1\). Plugging \(a_0\) and \(a_1\) into the velocity equation gives us a velocity curve that produces a continuous BPM curve over the course of the tempo transition.

Note: We must specify the same time units for input variables like \(t_1\) and \(\dot{x}_1\). I prefer minutes for \(t_1\) and beats per minute for \(\dot{x}_1\) over seconds and beats per second.
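As a rough sketch of how this might be computed (the helper function names are my own, and the example values come from the snare drum scenario above; this code is not part of the original derivation):

def tempo_curve_coefficients(t1, v0, v1, x1):
    # t1: transition duration (minutes), v0/v1: start/end tempo (BPM),
    # x1: beats elapsed in the changing tempo. Returns (a0, a1).
    a0 = (6 * x1 - 2 * t1 * (v1 + 2 * v0)) / t1 ** 2
    # Solve the velocity equation for a1: v1 = v0 + a0*t1 + a1*t1**2/2
    a1 = 2 * (v1 - v0 - a0 * t1) / t1 ** 2
    return a0, a1

# 90 -> 120 BPM, 20 beats elapse in the accelerating part,
# lasting 16 beats of the original tempo (16/90 of a minute)
t1 = 16 / 90
a0, a1 = tempo_curve_coefficients(t1, 90, 120, 20)

def tempo(t):          # velocity: BPM at time t
    return 90 + a0 * t + a1 * t ** 2 / 2

def beats_elapsed(t):  # position: beats elapsed at time t
    return 90 * t + a0 * t ** 2 / 2 + a1 * t ** 3 / 6

print(tempo(t1))          # 120.0 (up to rounding)
print(beats_elapsed(t1))  # 20.0 (up to rounding)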

Calculating Fibonacci Numbers with Matrices and Linear Algebra

In this post we solve the Fibonacci sequence using linear algebra. The Fibonacci sequence is described by a second-order difference equation, which is a particular type of sequence.

Sequences

A sequence is a (possibly infinite) set of numbers with a defined order.

\[ a_n = \frac{1}{n}, \textit{ for } n = 0, 1, 2, ..., \]

Difference Equations

A difference equation is a type of sequence. A difference equation expresses the value of the sequence relative to the value of another point in the sequence.

\[ x_{n+1} = 1.5x_n, \textit{ for } n = 0, 1, 2, ..., \]

To evaluate this equation at \(n=3\) we need to know the initial value \(x_0\), and evaluate \(n=1, 2, 3\) in succession. For example, if \(x_0=100\):

\begin{align} x_0 & && = 100 \\ x_1 & = 1.5 \times 100 && = 150\\ x_2 & = 1.5 \times 150 && = 225\\ x_3 & = 1.5 \times 225 && = 337.5 \end{align}

In the example above, we evaluate the equation recursively. It would be better to be able to evaluate the equation at \(n = 100\) without performing \(100\) calculations. Note that we can also write the equation above in this form:

\[ x_3 = 100 \times 1.5 \times 1.5 \times 1.5 = 337.5 \]

Solving a difference equation means writing the same equation such that it can be evaluated at a given \(n\) without first evaluating it at \(n = 1, 2, \ldots\) The solution to the equation above is:

\[ x_{n} = 1.5^{n} \times x_0, \textit{ for } n = 0, 1, 2, ..., \]
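A quick numerical check of the closed form (a throwaway snippet, not part of the original post):

x_0 = 100

# recursive evaluation
x = x_0
for n in range(3):
    x = 1.5 * x
print(x)               # 337.5

# closed-form evaluation
print(1.5 ** 3 * x_0)  # 337.5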

Second Order Difference Equations

The previous example is a first order difference equation because it expresses the value of \(x_{n+1}\) as a function of one other sample, \(x_n\). A second order difference equation is a function of two other samples, for example:

\[ x_{n+2} = 1.5x_n + 1.4x_{n+1}, \textit{ for } n = 0, 1, 2, ..., \]

Fibonacci Numbers

The easiest way to describe the Fibonacci sequence is with a second order difference equation. Assume \(F_0 = 0\) and \(F_1 = 1\)

\[ F_{k+2} = F_{k+1} + F_{k} \]

The first eight Fibonacci numbers are:

\[ F_k=0,1,1,2,3,5,8,13,... \]

We can solve the sequence with linear algebra. Let us start our sequence with a vector, \(u_0=\begin{bmatrix}1\\0\end{bmatrix}\). Find the next vector in our sequence, \(u_1\) like this:

\[ u_{1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

More generally:

\[ u_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} u_k \]

Note that \(u_0\) contains \(\begin{bmatrix}F_1\\F_0\end{bmatrix}\).

The key to solving the sequence is in the eigenvalues and eigenvectors of \(\begin{bmatrix}1&1\\1&0\end{bmatrix}\). We can write \(u_3\) as

\[ u_3 = \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix} = \begin{bmatrix}3 \\ 2\end{bmatrix} = \begin{bmatrix}F_4 \\ F_3\end{bmatrix} \]

import numpy as np
fib = np.matrix([[1,1],[1,0]])
u_0 = np.matrix([[1],[0]])

fib * fib * fib * u_0 # The * operator is matrix multiplication for np.matrix (but not np.array)
matrix([[3],
        [2]])

Eigenvector Solution

The general case of the previous example is:

\[ \begin{align} u_k &= \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix}^k\begin{bmatrix}1 \\ 0\end{bmatrix}\\ u_k &= A^ku_0 \end{align} \]

We can use eigenvectors and eigenvalues to solve the matrix power (see Eigenvalues and Eigenvectors). First, find the eigenvalues of \(A\):

\[ \begin{align} det\begin{vmatrix} 1 - \lambda & 1\\ 1 & - \lambda \end{vmatrix} & = 0\\ -\lambda(1-\lambda) - 1 & = 0\\ \lambda{}^2 - \lambda - 1 & = 0 \end{align} \]

Solve using the quadratic formula \(\frac{-b \pm \sqrt{b^2-4ac}}{2a} \implies \frac{1 \pm \sqrt{5}}{2} \approx \frac{1 \pm 2.236}{2}\), which gives the eigenvalues for our matrix: \[ \begin{align} \lambda_1 & \approx 1.618 \\ \lambda_2 & \approx -0.618 \end{align} \]

Now we plug our constant eigenvalues (\(\lambda_1\) and \(\lambda_2\)) into \((A-\lambda I)\vec{x}=0\) to find the eigenvectors.

\[ \begin{bmatrix}1-\lambda_1 & 1 \\ 1 & -\lambda_1 \end{bmatrix}\vec{x}_1 = 0 \implies \vec{x}_1 = \begin{bmatrix}\lambda_1 \\ 1\end{bmatrix} \]

Note that this solution is not necessarily intuitive. This is not a valid solution for any arbitrary value of \(\lambda\): our values, \(\lambda_1\) and \(\lambda_2\), are valid solutions. We can use the same process to solve for \(\lambda_2\), revealing eigenvectors \(\begin{bmatrix}\lambda_1 \\1\end{bmatrix}\) and \(\begin{bmatrix}\lambda_2 \\1\end{bmatrix}\). Now create the eigenvector matrix, \(S\), and eigenvalue matrix, \(\Lambda\).

\[ \begin{align} S &= \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}\\ \Lambda &= \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix} \end{align} \]

Remember that \(A^k = S\Lambda^kS^{-1}\). To proceed, invert \(S\). Remember this shortcut for inverting a \(2 \times 2\) matrix.

\[ \begin{align} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} =&& \frac{1}{ad-cb} &\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \\ S^{-1} =&& &\begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}^{-1} \\ S^{-1} =&& \frac{1}{\lambda_1-\lambda_2} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}\\ S^{-1} =&& \tfrac{1}{\sqrt{5}} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \end{align} \]

Using the components \((A, \Lambda, S, S^{-1},u_0)\):

\[ \begin{align} u_k &= A^ku_0 \\ u_k &= S\Lambda^kS^{-1}u_0 \\ u_k &= S\Lambda^k\frac{1}{\sqrt{5}}\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \frac{1}{\sqrt{5}}S\Lambda^k\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix}^k \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \begin{bmatrix}1\\0\end{bmatrix} \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 \\ -1 \end{bmatrix}\\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k \\ -\lambda_2^k\end{bmatrix} \end{align} \]

Verify Solution in Python

import numpy as np

lmbd_1 = (1+np.sqrt(5))/2
lmbd_2 = (1-np.sqrt(5))/2

a       = np.matrix([[1,      1],      [1,  0]])
s       = np.matrix([[lmbd_1, lmbd_2], [1,  1]])
s_inv   = np.matrix([[1,     -lmbd_2], [-1, lmbd_1]]) * (1/np.sqrt(5))
eig_val = np.matrix([[lmbd_1, 0],      [0,  lmbd_2]])
u_0     = np.matrix([[1],              [0]])

def eig_k(k):
    # Raise the diagonal eigenvalue matrix to the k-th power
    return eig_val ** k

def u_k(k):
    # u_k = S * Lambda^k * S^-1 * u_0
    return s * eig_k(k) * s_inv * u_0

def fib(n):
    # F_n is the second component of u_n; round away floating point error
    f_n = u_k(n)[1,0]
    return int(round(f_n))

[fib(n) for n in range(15)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]

Eigenvalues and Eigenvectors

All \(n\) by \(n\) matrices have \(n\) eigenvalues (counted with multiplicity).

Assume:

  • \(\lambda\) is our eigenvalue (it is also a scalar)
  • \(\vec{x}\) is our eigenvector
  • \(A\) is an \(n\) by \(n\) matrix.
\begin{align*} A\vec{x} &= \lambda\vec{x}\\ A\vec{x} - \lambda\vec{x} &= 0\\ (A - \lambda I)\vec{x} &= 0 \end{align*}

Eigenvectors are defined as non-zero, so we are not interested in the case when \(\vec{x} = 0\).

\(\vec{x}\) is some non-zero vector in the nullspace of \((A - \lambda{}I)\). If \((A - \lambda{}I)\) has vectors other than \(0\) in the nullspace, it must be singular.

We can find the singular matrices with \[det(A-\lambda I) = 0\]

Example of Finding the Eigenvectors

Start by finding the eigenvalues for \(A=\begin{bmatrix}3 & 1 \\1 & 3\end{bmatrix}\)

\(A\) is set up so that the eigenvalues will be real numbers.

\[ \begin{align*} 0 &= det(A-\lambda{}I)\\ 0 &= \begin{vmatrix} 3-\lambda & 1 \\ 1 & 3 - \lambda \end{vmatrix}\\ 0 &=(3-\lambda{})^2-1\\ 0 &= 9 - 6\lambda{} + \lambda{}^2 - 1\\ 0 &= \lambda{}^2 - 6\lambda + 8\\ 0 &= (\lambda - 4)(\lambda - 2) \end{align*} \]

So \(\lambda{}_1=4\) and \(\lambda{}_2=2\). Now we can plug each \(\lambda\) into \((A-\lambda{}I)\):

\begin{align*} \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} &= \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \\ \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} &= \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \\ \end{align*}

And solve \((A-\lambda{}I)\vec{x}=0\) for each:

\begin{align*} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\vec{x}_2=0 &\implies \vec{x}_2= \begin{bmatrix}1\\-1\end{bmatrix}\\ \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\vec{x}_1=0 &\implies \vec{x}_1= \begin{bmatrix}1\\1\end{bmatrix} \end{align*}
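We can verify this example numerically (a quick check with numpy; np.linalg.eig returns unit-length eigenvectors, so the columns are scaled versions of the vectors above, and the ordering is not guaranteed):

import numpy as np

a = np.array([[3, 1], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(a)
print(eigenvalues)    # e.g. [4. 2.]
print(eigenvectors)   # columns proportional to [1, 1] and [1, -1]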

Example of a Degenerate Matrix

Notice that the eigenvectors in the first example are independent. Not all matrices have independent eigenvectors.

\[ A = \begin{bmatrix}3 & 1 \\ 0 & 3\end{bmatrix} \]

We can read the eigenvalues directly off a triangular matrix. To see how, try solving \(det(A-\lambda{}I)=0\):

\begin{equation*} \begin{vmatrix} 3-\lambda & 1 \\ 0 & 3-\lambda \end{vmatrix}=0 \end{equation*}

Remember, the determinant of a triangular matrix is the product down the diagonal.

\((3-\lambda )(3-\lambda )=0 \implies \lambda{}_1 = 3, \lambda{}_2 = 3\)

Repeated eigenvalues are not a problem. The problem comes when we try to solve \((A-\lambda{}I)\vec{x}=0\)

\begin{equation*} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \vec{x} = 0 \end{equation*}

There is only one independent solution: \(\vec{x}=\begin{bmatrix}1\\0\end{bmatrix}\). We cannot diagonalize \(A\).
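One way to see the shortage numerically (a small sketch, not from the original): the rank of \(A - 3I\) is 1, so its nullspace is one-dimensional and there is only one independent eigenvector.

import numpy as np

a = np.array([[3, 1], [0, 3]])
print(np.linalg.matrix_rank(a - 3 * np.eye(2)))  # 1 -> nullspace has dimension 2 - 1 = 1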

Diagonalization \(S^{-1}AS= \Lambda{}\)

Assume we take our \(n\) linearly independent eigenvectors of \(A\) and place them in the columns of a matrix \(S\):

\begin{equation*} S = \begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} \end{equation*}

What happens when we take \(AS\)?

\begin{equation*} AS = A\begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} \end{equation*}

Remember that each column of \(AS\) is a linear combination of the columns of \(A\). Because \(\vec{x}_1\) is an eigenvector, the first column of \(AS\) is going to be \(\lambda{}_1\vec{x}_1\), and the subsequent columns follow the same pattern.

Now we want to factor the \(\lambda{}\)s out of \(AS\).

\begin{equation*} AS = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \vec{x}_1 && \cdots && \vec{x}_n \\ & & \end{bmatrix} \begin{bmatrix} \lambda{}_1 & & \\ & \ddots & \\ & & \lambda{}_n \end{bmatrix} \end{equation*}

We will call this last diagonal matrix \(\Lambda\), and we can now say \(AS=S\Lambda\), which gives us the following two equations:

\[S^{-1}AS = \Lambda\] \[S\Lambda{}S^{-1}=A\]

Remember: We can only invert \(S\) if we have \(n\) independent eigenvectors

Which gives us the most sought after equation:

\[ A^2= S\Lambda{}S^{-1}S\Lambda{}S^{-1}=S\Lambda{}^2S^{-1} \implies A^k=S\Lambda{}^kS^{-1} \]

Theorem: \(A^k \rightarrow 0\) as \(k \rightarrow \infty\) if all \(|\lambda{}_i| < 1\)
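A quick numerical sanity check of \(A^k=S\Lambda{}^kS^{-1}\), using the symmetric example from earlier (a sketch with numpy; np.linalg.eig returns the eigenvalues as a vector and the eigenvectors as the columns of \(S\)):

import numpy as np

a = np.array([[3, 1], [1, 3]])
lam, s = np.linalg.eig(a)                  # eigenvalues and eigenvector matrix S
k = 5
a_k = s @ np.diag(lam ** k) @ np.linalg.inv(s)
print(np.allclose(a_k, np.linalg.matrix_power(a, k)))  # True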

Eigenvalues, Eigenvectors of \(A^2\)

If we have \(A\) with eigenvalues \(\lambda{}_1 \ldots \lambda{}_n\):

  • The eigenvalues of \(A^2\) are \((\lambda{}_1)^2 \ldots (\lambda{}_n)^2\)
  • The eigenvectors of \(A^2\) are the same as the eigenvectors of \(A\)

Said another way:

If \(A\vec{x} = \lambda \vec{x}\) then \(A^2\vec{x} = \lambda A\vec{x} = \lambda{}^2\vec{x}\)

Facts

  • The sum of the \(n\) eigenvalues is equal to the trace of the matrix (the sum down the diagonal).
  • The product of the eigenvalues is equal to the determinant if the matrix has \(n\) distinct eigenvalues.
  • A triangular matrix, \(U\), has its eigenvalues along the diagonal - they are the pivots.
  • \(A\) has one or more eigenvalues \(\lambda =0\) exactly when \(A\) is singular
  • We can multiply eigenvectors by any non-zero constant, and \(A\vec{x} = \lambda \vec{x}\) will remain true
  • Symmetric matrices always have real eigenvalues
  • Elimination changes eigenvalues

Facts About Diagonalizing

  • A matrix with fewer than \(n\) independent eigenvectors cannot be diagonalized
  • Any matrix that has no repeated eigenvalues can be diagonalized
  • \(A\) is sure to have \(n\) independent eigenvectors (and be diagonalizable) if all the \(\lambda{}\)'s are different (no repeated \(\lambda{}\)s).
  • If \(A\) does have repeated \(\lambda\)s, it might still be diagonalizable.

Calculate the Nullspace of a Matrix

We want to find the nullspace of \(A\)

\begin{equation} A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{bmatrix} \end{equation}

\(A\) and \(U\) have the same nullspace. Elimination brings us to \(U\)

\begin{equation*} A \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & -1 & -2 & -3 \end{bmatrix} \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} =U \end{equation*}

We identify our pivot columns and our free columns. The pivot columns (columns 1 and 2, corresponding to \(x_1\) and \(x_2\)) contain our pivots. The other columns (corresponding to \(x_3\) and \(x_4\)) are free. Now we have two choices: back substitution, or solving by row reduced echelon form.

Back Substitution

For each free variable, \(x_n\):

  1. Set that variable to 1
  2. Set all the other free variables to 0
  3. Solve the remaining equations with back substitution

The upper triangular matrix above, \(U\), becomes:

\begin{align} x_1 &+& x_2 &+& x_3 &+& x_4 &= 0 \\ && x_2 &+& 2x_3 &+& 3x_4 &= 0 \\ && && && 0 &= 0 \end{align}

Do back substitution with \(x_3 = 1\) and \(x_4 = 0\):

\begin{align} 0 &= x_2 + 2 &\implies& x_2 = -2 \\ 0 &= x_1 -2 + 1 &\implies& x_1 = 1 \end{align} \begin{equation*} A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0 \end{equation*}

Do back substitution with \(x_3 = 0\) and \(x_4 = 1\):

\begin{align} 0 &= x_2 + 3 &\implies& x_2 = -3 \\ 0 &= x_1 -3 + 1 &\implies& x_1 = 2 \end{align} \begin{equation*} A \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = 0 \end{equation*}

One way to write the complete solution:

\begin{equation*} x_3 \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_3 + 2x_4 \\ -2x_3 - 3x_4 \\ x_3 \\ x_4 \end{bmatrix} \end{equation*}

Row Reduced Echelon Form

However, it is quicker to "complete" elimination, and find the solution from rref. \(A\), \(U\), and \(R\) have the same nullspace. Our example matrix, \(A\), is very close to rref already (the pivots are already 1).

\begin{equation*} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \implies \begin{bmatrix} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{equation*}

Notice how if we set \(x_3 = 1\) and \(x_4 = 0\) we can read the values \(x_1\) and \(x_2\) from the third column (if we reverse the sign)

\(A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0\)

You cannot do back substitution on a matrix in rref. This yields incorrect results.
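If sympy is available, we can check both the rref and the special solutions directly (a verification sketch, not part of the original walkthrough):

from sympy import Matrix

a = Matrix([[1, 1, 1, 1],
            [1, 2, 3, 4],
            [4, 3, 2, 1]])

print(a.rref())       # rref matches the matrix above; the pivot columns are (0, 1)
print(a.nullspace())  # the two special solutions: [1, -2, 1, 0] and [2, -3, 0, 1]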

Projections and Projection Matrices

As a simple example, we project \(\vec{b}\) onto the line through \(\vec{a}\). First, calculate \(\hat{x}\) and \(\vec{p}\) from the error vector \(\vec{e}\).

\begin{align} 0 &= \vec{e} \bullet{}\vec{a} && \vec{e} \perp \vec{a} \text{ so the dot product is zero}\\ 0 &= (\vec{b}-\hat{x}\vec{a})\bullet{}\vec{a} && \text{notice } \vec{e} = \vec{b}-\vec{p} = \vec{b}-\hat{x}\vec{a}\\ 0 &= \vec{b}\bullet \vec{a} - \vec{a} \hat{x}\bullet \vec{a} && \text{distribute } \vec{a}\\ 0 &= (\vec{b} \bullet \vec{a}) - \hat{x} (\vec{a} \bullet \vec{a}) && \text{notice } \hat{x} \text{ is a scalar, so the last term can be } \hat{x} \vec{a} \bullet \vec{a}\\ \hat{x} (\vec{a} \bullet \vec{a}) &= (\vec{b} \bullet \vec{a})\\ \hat{x} &= \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \implies \vec{p} = \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \vec{a} && \text{because } \vec{p} = \hat{x}\vec{a} \end{align}

We can summarize the projection with a matrix: \(\vec{p} = P\vec{b}\). \(P\) is easy to find if we write the equation the other way around. Notice that with vectors the order is inconsequential, because \(\vec{a} \bullet \vec{b} = \vec{b} \bullet \vec{a}\)

\begin{align} 0 &= \vec{a} \bullet \vec{e} && \vec{a}, \vec{e} \text{ on opposite sides}\\ 0 &= (\vec{a} \bullet \vec{b}) - \hat{x}(\vec{a} \bullet \vec{a})\\ \hat{x}(\vec{a} \bullet \vec{a}) &= (\vec{a} \bullet \vec{b})\\ \hat{x} &= \dfrac{\vec{a} \bullet \vec{b}}{\vec{a} \bullet \vec{a}} = \dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}} \end{align}

With \(\hat{x}\) in this form, we can calculate the projection matrix, \(P\), from \(\vec{p} = P\vec{b}\)

\begin{align} \vec{p} &= \vec{a}\dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}}\\ \vec{p} &= \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}}\vec{b} \implies P = \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}} && \text{because } \vec{p}= P\vec{b} \end{align}
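A concrete check of the line projection (a small numpy sketch with arbitrarily chosen vectors, not from the original text):

import numpy as np

a = np.array([[1.0], [2.0], [2.0]])   # column vector a
b = np.array([[3.0], [0.0], [3.0]])

P = (a @ a.T) / (a.T @ a)             # projection matrix onto the line through a
p = P @ b
print(p.ravel())                      # [1. 2. 2.]
print(np.allclose(a.T @ (b - p), 0))  # True: the error e = b - p is perpendicular to a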

Project Onto a Plane in \(\mathbb{R}^3\)

Note that \(\vec{a}^T\vec{a}\) in the denominator evaluates to a scalar. If we want to project onto a plane, the denominator becomes \(A^TA\), which evaluates to a matrix.

  • Division by a matrix has no meaning. You cannot divide by a matrix.
  • In arithmetic, we undo multiplication with division.
  • To undo the effects of a matrix multiplication, we multiply by the inverse of the matrix: \(A^{-1}A\vec{b}=\vec{b}\)
  • We must re-arrange the equation so that we use inverse matrices instead of division.

We can think of projecting onto a plane as projecting onto multiple vectors. To project \(\vec{b}\) onto \(A\), we are looking for the vector \(\hat{x}\), such that \(\vec{p}=A\hat{x}\), where \(\vec{p}\) is the point on the plane closest to \(\vec{b}\). The first step is to find the vector \(\hat{x}\).

Like the first example, we define the error vector, \(\vec{e}\), as the vector that goes from the plane to \(\vec{b}\)

\begin{equation*} \vec{e} = \vec{b} - A\hat{x} \end{equation*}

Assume \(A\) is a matrix made of two vectors, \(a_1\) and \(a_2\) in \(\mathbb{R}^3\):

\begin{equation} A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{bmatrix} \end{equation}

Our error vector, \(\vec{e}\), will be perpendicular to both \(a_1\) and \(a_2\). We want to find the closest point on \(A\) to \(\vec{b}\). Setting \(\vec{e}\) perpendicular to each \(\vec{a}_n\) gives \(\vec{a}_n^T\vec{e} = 0\):

\begin{align} 0 &= \vec{a}_1^T(\vec{b} - A\hat{x})\\ 0 &= \vec{a}_2^T(\vec{b} - A\hat{x}) \end{align}

There is a simple way to write the equation that captures all components at once:

\(A^T(\vec{b}-A\hat{x}) = 0\)

Which can be written as

\(A^T\vec{b} = A^TA\hat{x}\)

Which can be solved for \(\hat{x}\)

\((A^TA)^{-1}A^T\vec{b} = \hat{x}\)

Now that we have the \(\hat{x}\) vector, we can find the projection matrix. Remember that \(\vec{p} = P\vec{b}\). If we arrange the equation above correctly, it gives us \(P\).

\(\vec{p} = A\hat{x}\)

Substitute \(\hat{x}\)

\(\vec{p} = A(A^TA)^{-1}A^T\vec{b}\)

Now the equation is in the form \(\vec{p} = P\vec{b}\), so

\(P = A(A^TA)^{-1}A^T\)
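And the same idea for a plane (a numpy sketch; the example columns are arbitrary, and we assume the columns of \(A\) are independent so that \(A^TA\) is invertible):

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])            # two independent columns spanning a plane in R^3
b = np.array([[6.0], [0.0], [0.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T  # projection matrix onto the column space of A
p = P @ b
print(p.ravel())                      # [ 5.  2. -1.]: the closest point in the plane to b
print(np.allclose(A.T @ (b - p), 0))  # True: the error is perpendicular to both columns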

Getting Npm, Anaconda, Jupyter, and Brew to Play Together

I chose to use Anaconda to manage my Python installation instead of the OSX stock Python 2.7 or brew'd Python. This makes it easy to switch between Python 2 and Python 3, and it works well with Project Jupyter.

I would like to default to Python 3.4.3, because that works best with IPython notebooks in Project Jupyter. But npm packages often use node-gyp to build, and node-gyp is not compatible with Python 3. When I tried to install the npm fsevents package, I got the following error:

gyp ERR! configure error 
gyp ERR! stack Error: Python executable "python" is v3.4.3, which is not supported by gyp.
gyp ERR! stack You can pass the --python switch to point to Python >= v2.5.0 & < 3.0.0.
gyp ERR! stack     at failPythonVersion (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:108:14)
...

It is possible to specify a Python version when installing: npm install --python=python2.7. We can also configure npm to use a specific version of Python (stackoverflow).

$ npm config set python python2.7

This solved the node-gyp problem, but created another problem. By default, npm modules installed with the -g flag are installed to /usr/local. You can check where global npm modules are stored with

$ npm config get prefix

I usually just use sudo when installing global npm modules in OSX. However, for some reason when I sudo npm install -g with npm configured to use Python 2.7, Python ran as a different user without admin privileges, and failed. (Does anyone know why?)

It is safe to use sudo when npm installing, but I would prefer to sudo as little as possible. To solve the problem, I configured npm to install global packages to a directory that does not require admin privileges.

It is not difficult to configure npm to use a custom path for storing global modules (stackoverflow).

$ npm config set prefix '~/.npm-packages'

Then add the following to .bashrc (Linux) or .bash_profile (OSX):

export PATH="$PATH:$HOME/.npm-packages/bin"

This allows us to npm install -g without sudo. It also allows Anaconda to manage python installations, giving us Python 3.4.3 by default in the terminal.