Modern Wanderlust in Games

It used to be that when I started a new RPG, I got an amazing feeling of excitement as I began to explore the world: the thrill of entering a new area and wondering what secrets might be hidden, waiting to be discovered.

As I got older, I experienced this less and less; the novelty of exploring a virtual world seemed to wear off. Then in 2011 I (along with the rest of the world) found Minecraft. The experience was fresh and exciting: a great implementation of a different approach to generating a virtual world. Something about knowing that the world is infinite, and unlike the world anyone else is playing, made exploration feel meaningful again.

Once I began to internalize the patterns of Minecraft's terrain, the feeling of excitement began to ebb. It was still exciting to find diamonds, but the random terrain began to feel repetitive and similar. I wanted to find something not random in the world - something that felt intelligent. What if there were a way to stumble on the ruins of other users' civilizations, decayed after many years of abandonment?

Could a hybrid approach to world building be the next step in the evolution of wanderlust in games? I started writing this game for mobile devices, but my skills at the time were not enough to implement the networking required to simulate the dynamic world I imagined. Then I found Meteor, a JavaScript framework that makes it very easy to synchronize client data. I re-implemented the game, migrating from the Moai game engine to Meteor, with the front-end written entirely in a custom canvas-based game engine. Here is my lightning talk at a 2013 Meteor meetup in San Francisco.

There were some pretty awesome features in the last Meteor version. Users could write their own custom servers, and characters could move between servers with a complex three-way authentication handshake. But I wasn't following Ken Levine's excellent game development advice: "Stop trying to be smart, and start trying not to be dumb". The engine and features were too complex, and it didn't make sense for me to be maintaining my own JavaScript game engine. So in early 2017 I started re-writing the game again. This time the server is written in Go, and the client is built on top of the PIXI.js canvas engine. Both decisions have already paid off.

In one way or another I have been working on this game since the first mobile version in 2012. It's nearly always in some part of my mind -- even when I am too busy to be actively developing. Whenever I have a spare moment my mind wanders back to the most recent development challenge. There's some inexplicable force that drives me to keep at it. I envision a dynamic and evolving world in the cloud. A Dwarf Fortress for the next generation. A Minecraft for the future. A limitless RPG that can fill others with the same wanderlust I found in the RPGs of my childhood.

Try the current version at PixelAether.com.

Creating Tempic Integrations

A polytempic accelerando


Early American Experimental Music

American composer Charles Ives (1874-1954) wrote many experimental pieces where he would selectively disregard certain 'rules' of music theory, and explore the sonic results. In an essay on his experimental music, Ives asked this question:

If you can play a tune in one key, why can't a feller, if he feels like, play one in two keys?

Ives' early polytonal experiments include pieces with a melody in one musical key and accompaniment in another key, and pieces with separate voices, each in separate keys. These experimental pieces were not published or performed until long after he wrote them, but Ives' ideas did appear in his less systematic compositions. More importantly, his ideas influenced a generation of younger composers.

Among these younger composers was Henry Cowell (1897-1965), who in 1930 published New Musical Resources, a collection of largely unexplored musical ideas. The book extends the notion of multiple simultaneous musical keys (polytonal music) to multiple simultaneous tempos (polytempic music).

Can we make music with multiple simultaneous tempos? Certainly. But Cowell took the idea a step further. Assume we have multiple simultaneous tempos and these tempos are accelerating or decelerating relative to each other. In New Musical Resources, Cowell called this practice sliding tempo, and he noted that this would not be easy to do. In 1930, he writes:

For practical purposes, care would have to be exercised in the use of sliding tempo, in order to control the relation between tones in a sliding part with those in another part being played at the same time: a composer would have to know in other words, what tones in a part with rising tempo would be struck simultaneously with other tones in a part of, say, fixed tempo, and this from considerations of harmony. There would usually be no absolute coincidence, but the tones which would be struck at approximately the same time could be calculated.

Phase Music

Can we calculate the times of coincidence like Cowell suggests? How? Can a musician even perform polytempic music with tempo accelerations or decelerations? Over the course of the 20th century, musicians explored this idea in what is now called phase music (wikipedia). One early example is Piano Phase (1967) by Steve Reich (wikipedia, vimeo).

Through experimentation, Steve Reich found that if the music is reasonably simple, two performers can make synchronized tempo adjustments relative to each other well enough to yield compelling results. In Piano Phase, two pianos play the same repeating 12-note musical pattern. One musician plays at a slightly faster rate than the other. It is possible for a musician to play this, but there is a limit to the complexity of the patterns that a musician can play and a limit to the number of simultaneously changing tempi a musician can comprehend. Piano Phase approaches the limit of what a musician can perform. Can we not use electronic sequencers or digital audio tools to synthesize complex phase music with many simultaneous tempic transitions?

Most phase music (including Piano Phase) layers identical copies of musical content playing at slightly different speeds. The layers begin in phase with each other and gradually drift out of phase. As the layers drift apart, our listening attention is drawn to variations caused by the shifting phase relationships between the layers. In the example below, the two layers start out playing together at 90 beats per minute (BPM); at \(t=0\) the top layer abruptly starts playing at 100 BPM.

Abrupt tempo change


If we wait long enough, eventually the layers will re-align, and the whole pattern will start from the beginning again. But what if we want to control exactly when the layers re-synchronize? Can we specify precisely both the time and the phase of sync points without abrupt changes in tempo? How can we sequence patterns like the one below exactly? This is the central question of Tempic Integrations.

Continuous tempo acceleration from 90 BPM to 120 BPM


Tempic Integrations

If we want to specify the exact number of beats that elapse between the beginning and the end of a tempo acceleration, and we want to audition the result precisely, we need a mathematical model for shaping our tempo acceleration curves. The goal is to compose and audition music where:

  • Swarms of an arbitrary number of simultaneous tempi coexist.
  • Each player within the swarm can continuously accelerate or decelerate individually, but also as a member of a cohesive whole.
  • Each musical line can converge and diverge at explicit points. At each point of convergence the phase of the meter within the tempo can be set.

We start by defining a single tempo transition. Consider the following example from the image above:

  • Assume we have 2 snare drum players. Both begin playing the same beat at 90 BPM in common time.
  • One performer gradually accelerates relative to the other. We want to define a continuous tempo curve such that one drummer accelerates to 120 BPM.
  • So far, we can easily accomplish this with a simple linear tempo acceleration. However, we want the tempo transition to complete exactly when both drummers are on a down-beat, so the combined effect is a 3 over 4 rhythmic pattern. Linear acceleration results in the transition completing at an arbitrary phase.
  • We want the accelerating drummer to reach the new tempo after exactly 20 beats.
  • We also want the acceleration to complete in exactly 16 beats of the original tempo, so that the drummer playing a constant tempo and the accelerating drummer are playing together.

We are interested in the number of beats elapsed in both the static tempo and the changing tempo, as well as in the absolute tempo. If we think of the number of beats elapsed as our position, and the tempo as our rate, we see how this resembles a physics problem. If we have a function that describes our tempo (or rate), we can integrate that function, and the result will tell us the number of beats elapsed (or position). Given the above considerations, our tempo curve is defined in terms of 5 constants:

  • Time \(t_0=0\), when the tempo transition begins
  • A known time, \(t_1\), when the tempo transition ends
  • A known starting tempo: \(\dot{x}_0\)
  • A known finishing tempo: \(\dot{x}_1\)
  • The number of beats elapsed in the changing tempo between \(t_0\) and \(t_1\): \(x_1\)

The shape of the tempo curve determines how many beats elapse during the transition period. To make the curve well-defined, we assume the acceleration changes linearly: it starts at some acceleration \(a_0\) and changes at a constant rate \(a_1\). Using Newtonian notation we can describe our tempo acceleration as:

\[\begin{equation} \ddot{x}_1 = a_0 + a_1t_1 \end{equation}\]

Integrating linear acceleration (above) yields a quadratic velocity curve (below). The velocity curve describes the tempo (in beats per minute) with respect to time.

\[\begin{equation} \dot{x}_1 = \dot{x}_0 + a_0t_1 + \frac{a_1t_1^2}{2} \end{equation}\]

Integrating velocity (above) gives us a function describing position (the number of beats elapsed with respect to time, below).

\[\begin{equation} x_1 = x_0 + \dot{x}_0t_1 + \frac{a_0t_1^2}{2} + \frac{a_1t_1^3}{6} \end{equation}\]

With equations for \(\dot{x}_1\) and \(x_1\), we can solve for our two unknowns, \(a_0\) and \(a_1\). First we solve both equations for \(a_1\):

\[\begin{equation} a_1 = \frac{-2}{t_1^2}(\dot{x}_0-\dot{x}_1 + a_0t_1) = \frac{-6}{t_1^3}(\dot{x}_0t_1-x_1 + \frac{a_0t_1^2}{2}) \end{equation}\]

Assuming \(t_1 \neq 0\), we solve this system of equations for \(a_0\):

\[\begin{equation} a_0=\frac{6x_1-2t_1(\dot{x}_1+2\dot{x}_0)}{t_1^2} \end{equation}\]

Evaluating for \(a_0\) with our constants gives us our starting acceleration. Once we have \(a_0\) we can solve our velocity equation to find the value of \(a_1\). Plugging \(a_0\) and \(a_1\) into the velocity equation gives us a continuous BPM curve over the course of the tempo transition.

Note: We must specify the same time units for input variables like \(t_1\) and \(\dot{x}_1\). I prefer minutes for \(t_1\) and beats per minute for \(\dot{x}_1\) over seconds and beats per second.
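To make this concrete, here is a minimal Python sketch of the solution (the function names solve_accelerations, tempo, and beats are my own, chosen for illustration). It works through the example above: a transition from 90 BPM to 120 BPM that spans \(x_1 = 20\) beats of the accelerating part and 16 beats of the steady 90 BPM part, so \(t_1 = 16/90\) minutes.

def solve_accelerations(t1, v0, v1, x1):
    # Solve for a_0 and a_1 given the end time t1 (minutes), the starting
    # and finishing tempos v0 and v1 (BPM), and the beats elapsed x1.
    a0 = (6 * x1 - 2 * t1 * (v1 + 2 * v0)) / t1 ** 2
    a1 = (-2 / t1 ** 2) * (v0 - v1 + a0 * t1)
    return a0, a1

def tempo(t, v0, a0, a1):
    # The quadratic velocity curve: tempo in BPM at time t
    return v0 + a0 * t + a1 * t ** 2 / 2

def beats(t, v0, a0, a1):
    # The integral of the tempo curve: beats elapsed at time t
    return v0 * t + a0 * t ** 2 / 2 + a1 * t ** 3 / 6

t1 = 16 / 90                                  # 16 beats at 90 BPM, in minutes
a0, a1 = solve_accelerations(t1, 90, 120, 20)
print(tempo(t1, 90, a0, a1))                  # ~120.0, the finishing tempo
print(beats(t1, 90, a0, a1))                  # ~20.0, beats elapsed in the accelerating part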

Calculating Fibonacci Numbers with Matrices and Linear Algebra

In this post we solve the Fibonacci sequence using linear algebra. The Fibonacci equation is a second-order difference equation, which is a particular type of sequence.

Sequences

A sequence is a (possibly infinite) set of numbers with a defined order.

\[ a_n = \frac{1}{n}, \textit{ for } n = 1, 2, 3, ..., \]

Difference Equations

A difference equation is a type of sequence that expresses each value relative to the value of another point in the sequence.

\[ x_{n+1} = 1.5x_n, \textit{ for } n = 0, 1, 2, ..., \]

To evaluate this equation at \(n=3\) we need to know the initial value \(x_0\), and evaluate \(n=1, 2, 3\) in succession. For example, if \(x_0=100\):

\[\begin{align} x_0 & && = 100 \\ x_1 & = 1.5 \times 100 && = 150\\ x_2 & = 1.5 \times 150 && = 225\\ x_3 & = 1.5 \times 225 && = 337.5 \end{align}\]

In the example above, we evaluate the equation recursively. It would be better to be able to evaluate the equation at \(n = 100\) without performing \(100\) calculations. Note that we can also write the equation above in this form:

\[ x_3 = 100 \times 1.5 \times 1.5 \times 1.5 = 337.5 \]

Solving a difference equation means writing the same equation such that it can be evaluated at a given \(n\) without first evaluating \(n = 1, 2, \dots\). The solution to the equation above is:

\[ x_{n} = 1.5^{n} \times x_0, \textit{ for } n = 0, 1, 2, ..., \]
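As a quick sanity check, here is a tiny Python sketch (the helper names are illustrative only) comparing recursive evaluation with the closed-form solution:

def recursive(n, x0=100):
    # Evaluate x_{n+1} = 1.5 * x_n one step at a time
    x = x0
    for _ in range(n):
        x = 1.5 * x
    return x

def closed_form(n, x0=100):
    # Evaluate the solved form x_n = 1.5^n * x_0 directly
    return 1.5 ** n * x0

print(recursive(3), closed_form(3))   # 337.5 337.5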

Second Order Difference Equations

The previous example is a first order difference equation because it expresses the value of \(x_n\) as a function of one other sample, \(x_{n-1}\). A second order difference equation is a function of two other samples, for example

\[ x_{n+2} = 1.5x_n + 1.4x_{n+1}, \textit{ for } n = 0, 1, 2, ..., \]

Fibonacci Numbers

The easiest way to describe the Fibonacci sequence is with a second order difference equation. Assume \(F_0 = 0\) and \(F_1 = 1\)

\[ F_{k+2} = F_{k+1} + F_{k} \]

The first eight Fibonacci numbers are:

\[ F_k=0,1,1,2,3,5,8,13,... \]

We can solve the sequence with linear algebra. Let us start our sequence with a vector, \(u_0=\begin{bmatrix}1\\0\end{bmatrix}\). Find the next vector in our sequence, \(u_1\) like this:

\[ u_{1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

More generally:

\[ u_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} u_k \]

Note that \(u_0\) contains \(\begin{bmatrix}F_1\\F_0\end{bmatrix}\).

The key to solving the sequence is in the eigenvalues and eigenvectors of \(\begin{bmatrix}1&1\\1&0\end{bmatrix}\). We can write \(u_3\) as

\[ u_3 = \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix} = \begin{bmatrix}3 \\ 2\end{bmatrix} = \begin{bmatrix}F_4 \\ F_3\end{bmatrix} \]

import numpy as np
fib = np.matrix([[1,1],[1,0]])
u_0 = np.matrix([[1],[0]])

fib * fib * fib * u_0 # The * operator is matrix multiplication for np.matrix (but not np.array)
matrix([[3],
        [2]])

Eigenvector Solution

The general case of the previous example is:

\[ \begin{align} u_k &= \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix}^k\begin{bmatrix}1 \\ 0\end{bmatrix}\\ u_k &= A^ku_0 \end{align} \]

We can use eigenvectors and eigenvalues to solve the matrix power (see Eigenvalues and Eigenvectors). First, find the eigenvalues of \(A\):

\[ \begin{align} det\begin{vmatrix} 1 - \lambda & 1\\ 1 & - \lambda \end{vmatrix} & = 0\\ -\lambda(1-\lambda) - 1 & = 0\\ \lambda{}^2 - \lambda - 1 & = 0 \end{align} \]

Solve using the quadratic formula \(\frac{-b \pm \sqrt{b^2-4ac}}{2a} \implies \frac{1 \pm \sqrt{5}}{2} \approx \frac{1 \pm 2.236}{2}\), which gives the eigenvalues for our matrix: \[ \begin{align} \lambda_1 & \approx &1.618 \\ \lambda_2 & \approx &-0.618 \end{align} \]

Now we plug our constant eigenvalues (\(\lambda_1\) and \(\lambda_2\)) into \((A-\lambda I)\vec{x}=0\) to find the eigenvectors.

\[ \begin{bmatrix}1-\lambda_1 & 1 \\ 1 & -\lambda_1 \end{bmatrix}\vec{x}_1 = 0 \implies \vec{x}_1 = \begin{bmatrix}\lambda_1 \\ 1\end{bmatrix} \]

Note that this solution is not necessarily intuitive. It is not valid for any arbitrary value of \(\lambda\): only our eigenvalues, \(\lambda_1\) and \(\lambda_2\), are valid solutions. We can use the same process to solve for \(\lambda_2\), revealing eigenvectors \(\begin{bmatrix}\lambda_1 \\1\end{bmatrix}\) and \(\begin{bmatrix}\lambda_2 \\1\end{bmatrix}\). Now create the eigenvector matrix \(S\) and the eigenvalue matrix \(\Lambda\).

\[ \begin{align} S &= \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}\\ \Lambda &= \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix} \end{align} \]

Remember that \(A^k = S\Lambda^kS^{-1}\). To proceed, invert \(S\). Remember this shortcut for inverting a \(2 \times 2\) matrix.

\[ \begin{align} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} =&& \frac{1}{ad-cb} &\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \\ S^{-1} =&& &\begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}^{-1} \\ S^{-1} =&& \frac{1}{\lambda_1-\lambda_2} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}\\ S^{-1} =&& \tfrac{1}{\sqrt{5}} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \end{align} \]

Using the components \((A, \Lambda, S, S^{-1},u_0)\):

\[ \begin{align} u_k &= A^ku_0 \\ u_k &= S\Lambda^kS^{-1}u_0 \\ u_k &= S\Lambda^k\frac{1}{\sqrt{5}}\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \frac{1}{\sqrt{5}}S\Lambda^k\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix}^k \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \begin{bmatrix}1\\0\end{bmatrix} \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 \\ -1 \end{bmatrix}\\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k \\ -\lambda_2^k\end{bmatrix} \end{align} \]

Verify Solution in Python

import numpy as np

lmbd_1 = (1+np.sqrt(5))/2
lmbd_2 = (1-np.sqrt(5))/2

a       = np.matrix([[1,      1],      [1,  0]])
s       = np.matrix([[lmbd_1, lmbd_2], [1,  1]])
s_inv   = np.matrix([[1,     -lmbd_2], [-1, lmbd_1]]) * (1/np.sqrt(5))
eig_val = np.matrix([[lmbd_1, 0],      [0,  lmbd_2]])
u_0     = np.matrix([[1],              [0]])

def eig_k(k):
    # Lambda^k: raise the diagonal eigenvalue matrix to the k-th power
    return eig_val ** k

def u_k(k):
    # u_k = S * Lambda^k * S^-1 * u_0, which evaluates to [[F_{k+1}], [F_k]]
    return s * eig_k(k) * s_inv * u_0

def fib(n):
    # F_n is the bottom entry of u_n; round away floating point error
    f_n = u_k(n)[1,0]
    return int(round(f_n))

[fib(n) for n in range(15)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]

Eigenvalues and Eigenvectors

All \(n\) by \(n\) matrices have \(n\) eigenvalues.

Assume:

  • \(\lambda\) is our eigenvalue (it is also a scalar)
  • \(\vec{x}\) is our eigenvector
  • \(A\) is an \(n\) by \(n\) matrix.
\[\begin{align*} A\vec{x} &= \lambda\vec{x}\\ A\vec{x} - \lambda\vec{x} &= 0\\ (A - \lambda I)\vec{x} &= 0 \end{align*}\]

Eigenvectors are defined as non-zero, so we are not interested in the case when \(\vec{x} = 0\).

\(\vec{x}\) is some non-zero vector in the nullspace of \((A - \lambda{}I)\). If \((A - \lambda{}I)\) has vectors other than \(0\) in the nullspace, it must be singular.

We can find the singular matrices with \[det(A-\lambda I) = 0\]

Example of Finding the Eigenvectors

Start by finding the eigenvalues for \(A=\begin{bmatrix}3 & 1 \\1 & 3\end{bmatrix}\)

\(A\) is set up so that the eigenvalues will be real numbers.

\[ \begin{align*} 0 &= det(A-\lambda{}I)\\ 0 &= \begin{vmatrix} 3-\lambda & 1 \\ 1 & 3 - \lambda \end{vmatrix}\\ 0 &=(3-\lambda{})^2-1\\ 0 &= 9 - 6\lambda{} + \lambda{}^2 - 1\\ 0 &= \lambda{}^2 - 6\lambda + 8\\ 0 &= (\lambda - 4)(\lambda - 2) \end{align*} \]

So \(\lambda{}_1=4\) and \(\lambda{}_2=2\). Now we can plug each \(\lambda\) into \((A-\lambda{}I)\)

\[\begin{align*} \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} &= \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \\ \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} &= \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \\ \end{align*}\]

And solve for \((A-\lambda{}I)\vec{x}=0\)

\[\begin{align*} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\vec{x}_2=0 &\implies \vec{x}_2= \begin{bmatrix}1\\-1\end{bmatrix}\\ \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\vec{x}_1=0 &\implies \vec{x}_1= \begin{bmatrix}1\\1\end{bmatrix} \end{align*}\]
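As a quick numerical check (assuming NumPy is available; np.linalg.eig returns unit-length eigenvectors, so its columns are scaled versions of the vectors above):

import numpy as np

a = np.array([[3., 1.], [1., 3.]])
eigenvalues, eigenvectors = np.linalg.eig(a)   # columns of eigenvectors pair with eigenvalues
print(eigenvalues)                             # 4 and 2 (order is not guaranteed)
for lam, x in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(a @ x, lam * x))         # True for each eigenvalue/eigenvector pair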

Example of a Degenerate Matrix

Notice that the eigenvectors in the first example are independent. Not all matrices have independent eigenvectors.

\[ A = \begin{bmatrix}3 & 1 \\ 0 & 3\end{bmatrix} \]

We can read the eigenvalues directly off a triangular matrix. To see how, try finding \(det(A-\lambda{}I)=0\):

\[\begin{equation*} \begin{vmatrix} 3-\lambda & 1 \\ 0 & 3-\lambda \end{vmatrix}=0 \end{equation*}\]

Remember, the determinant of a triangular matrix is the product down the diagonal.

\((3-\lambda )(3-\lambda )=0 \implies \lambda{}_1 = 3, \lambda{}_2 = 3\)

Repeated eigenvalues are not a problem. The problem comes when we try to solve \((A-\lambda{}I)\vec{x}=0\)

\[\begin{equation*} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \vec{x} = 0 \end{equation*}\]

There is only one independent solution: \(\vec{x}=\begin{bmatrix}1\\0\end{bmatrix}\). We cannot diagonalize \(A\).
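A symbolic check of this degenerate case (this sketch assumes SymPy, which is not used elsewhere in these notes):

from sympy import Matrix

a = Matrix([[3, 1], [0, 3]])
print(a.eigenvects())         # eigenvalue 3 with multiplicity 2, but only one eigenvector [1, 0]
print(a.is_diagonalizable())  # False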

Diagonalization \(S^{-1}AS= \Lambda{}\)

Assume we take our \(n\) linearly independent eigenvectors of \(A\) and place them in the columns of a matrix \(S\):

\[\begin{equation*} S = \begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} \end{equation*}\]

What happens when we take \(AS\)?

\[\begin{equation*} AS = A\begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} \end{equation*}\]

Remember that each column of \(AS\) is a linear combination of the columns of \(A\). Because \(\vec{x}_1\) is an eigenvector, the first column of \(AS\) is \(A\vec{x}_1 = \lambda{}_1\vec{x}_1\), and the subsequent columns follow the same pattern.

Now we want to factor out \(\lambda{}\) from \(AS\).

\[\begin{equation*} AS = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \vec{x}_1 && \cdots && \vec{x}_n \\ & & \end{bmatrix} \begin{bmatrix} \lambda{}_1 & & \\ & \ddots & \\ & & \lambda{}_n \end{bmatrix} \end{equation*}\]

We will call this last diagonal matrix \(\Lambda\), and we can now say \(AS=S\Lambda\), which gives us the following two equations:

\[S^{-1}AS = \Lambda\] \[S\Lambda{}S^{-1}=A\]

Remember: we can only invert \(S\) if we have \(n\) independent eigenvectors.

Which gives us the most sought-after equation:

\[ A^2= S\Lambda{}S^{-1}S\Lambda{}S^{-1}=S\Lambda{}^2S^{-1} \implies A^k=S\Lambda{}^kS^{-1} \]

Theorem: \(A^k \rightarrow 0\) as \(k \rightarrow \infty\) if all \(|\lambda{}_i| < 1\)
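Here is a small numerical sketch of \(A^k = S\Lambda{}^kS^{-1}\), using the matrix from the eigenvector example above (assuming NumPy):

import numpy as np

a = np.array([[3., 1.], [1., 3.]])
eigenvalues, s = np.linalg.eig(a)                      # columns of s are the eigenvectors
lam_5 = np.diag(eigenvalues ** 5)                      # Lambda^5
a_5 = s @ lam_5 @ np.linalg.inv(s)                     # S Lambda^5 S^-1
print(np.allclose(a_5, np.linalg.matrix_power(a, 5)))  # True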

Eigenvalues, Eigenvectors of \(A^2\)

If we have \(A\) with eigenvalues \(\lambda{}_1 \ldots \lambda{}_n\):

  • The eigenvalues of \(A^2\) are \((\lambda{}_1)^2 \ldots (\lambda{}_n)^2\)
  • The eigenvectors of \(A^2\) are the same as the eigenvectors of \(A\)

Said another way:

If \(A\vec{x} = \lambda \vec{x}\) then \(A^2\vec{x} = \lambda A\vec{x} = \lambda{}^2\vec{x}\)
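A quick numerical check of this fact (assuming NumPy, and sorting because the eigenvalue order is not guaranteed):

import numpy as np

a = np.array([[3., 1.], [1., 3.]])
print(np.sort(np.linalg.eigvals(a)) ** 2)   # the squared eigenvalues of A
print(np.sort(np.linalg.eigvals(a @ a)))    # the eigenvalues of A^2: the same values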

Facts

  • The sum of the \(n\) eigenvalues is equal to the trace of the matrix (the sum down the diagonal).
  • The product of the eigenvalues is equal to the determinant if the matrix has \(n\) distinct eigenvalues.
  • A triangular matrix, \(U\), has eigenvalues along the diagonal - they are the pivots.
  • \(A\) has one or more eigenvalues \(\lambda =0\) exactly when \(A\) is singular
  • We can multiply eigenvectors by any non-zero constant, and \(A\vec{x} = \lambda \vec{x}\) will remain true
  • Symmetric matrices always have real eigenvalues
  • Elimination changes eigenvalues

Facts About Diagonalizing

  • A matrix with fewer than \(n\) eigenvectors cannot be diagonalized
  • Any matrix that has no repeated eigenvalues can be diagonalized
  • \(A\) is sure to have \(n\) independent eigenvectors (and be diagonalizable) if all the \(\lambda{}\)'s are different (no repeated \(\lambda{}\)s).
  • If \(A\) does not have \(n\) distinct \(\lambda\)s, it might still be diagonalizable.

Calculate the Nullspace of a Matrix

We want to find the nullspace of \(A\)

\[\begin{equation} A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{bmatrix} \end{equation}\]

\(A\) and \(U\) have the same nullspace. Elimination brings us to \(U\)

\[\begin{equation*} A \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & -1 & -2 & -3 \end{bmatrix} \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} =U \end{equation*}\]

We identify our pivot columns and our free columns. The pivot columns, \(x_1\) and \(x_2\), contain our pivots. The other columns, \(x_3\) and \(x_4\), are free. Now we have two choices: back substitution, or solving by row reduced echelon form.

Back Substitution

For each free column, \(x_n\):

  1. Set that column to 1
  2. Set all the other free columns to 0
  3. Solve all equations for 0 with back substitution

The upper triangular matrix above, \(U\), becomes:

\[\begin{align} x_1 &+& x_2 &+& x_3 &+& x_4 &= 0 \\ && x_2 &+& 2x_3 &+& 3x_4 &= 0 \\ && && && 0 &= 0 \end{align}\]

Do back substitution with \(x_3 = 1\) and \(x_4 = 0\):

\[\begin{align} 0 &= x_2 + 2 &\implies& x_2 = -2 \\ 0 &= x_1 -2 + 1 &\implies& x_1 = 1 \end{align}\] \[\begin{equation*} A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0 \end{equation*}\]

Do back substitution with \(x_3 = 0\) and \(x_4 = 1\):

\[\begin{align} 0 &= x_2 + 3 &\implies& x_2 = -3 \\ 0 &= x_1 -3 + 1 &\implies& x_1 = 2 \end{align}\] \[\begin{equation*} A \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = 0 \end{equation*}\]

One way to write the complete solution:

\[\begin{equation*} x_3 \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_3 + 2x_4 \\ -2x_3 - 3x_4 \\ x_3 \\ x_4 \end{bmatrix} \end{equation*}\]

Row Reduced Echelon Form

However, it is quicker to "complete" elimination and find the solution from the row reduced echelon form (rref). \(A\), \(U\), and \(R\) have the same nullspace. \(U\) is already very close to rref (the pivots are already 1).

\[\begin{equation*} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \implies \begin{bmatrix} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{equation*}\]

Notice how, if we set \(x_3 = 1\) and \(x_4 = 0\), we can read the values of \(x_1\) and \(x_2\) from the third column (if we reverse the sign).

\(A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0\)

You cannot do back substitution on a matrix in rref. This yields incorrect results.
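Both special solutions (and the rref matrix above) can be checked symbolically. This sketch assumes SymPy, which is not otherwise used in these notes:

from sympy import Matrix

a = Matrix([[1, 1, 1, 1],
            [1, 2, 3, 4],
            [4, 3, 2, 1]])

print(a.rref()[0])    # the row reduced echelon form R shown above
print(a.nullspace())  # the special solutions [1, -2, 1, 0] and [2, -3, 0, 1]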

Projections and Projection Matrices

As a simple example, we project \(\vec{b}\) onto a line through \(\vec{a}\). First, calculate \(\vec{e}\) and \(\vec{p}\).

\[\begin{align} 0 &= \vec{e} \bullet{}\vec{a} && \vec{e} \perp \vec{a} \text{ so the dot product is zero}\\ 0 &= (\vec{b}-\hat{x}\vec{a})\bullet{}\vec{a} && \text{notice } \vec{e} = \vec{b}-\vec{p} = \vec{b}-\hat{x}\vec{a}\\ 0 &= \vec{b}\bullet \vec{a} - \vec{a} \hat{x}\bullet \vec{a} && \text{distribute } \vec{a}\\ 0 &= (\vec{b} \bullet \vec{a}) - \hat{x} (\vec{a} \bullet \vec{a}) && \text{notice } \hat{x} \text{ is a scalar, so the last term can be } \hat{x} \vec{a} \bullet \vec{a}\\ \hat{x} (\vec{a} \bullet \vec{a}) &= (\vec{b} \bullet \vec{a})\\ \hat{x} &= \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \implies \vec{p} = \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \vec{a} && \text{because } \vec{p} = \hat{x}\vec{a} \end{align}\]

We can summarize the projection with a matrix \(P\) such that \(\vec{p} = P\vec{b}\). \(P\) is easy to find if we write the equation the other way around. Notice that with the dot product the order is inconsequential, because \(\vec{a} \bullet \vec{b} = \vec{b} \bullet \vec{a}\)

\[\begin{align} 0 &= \vec{a} \bullet \vec{e} && \vec{a}, \vec{e} \text{ on opposite sides}\\ 0 &= (\vec{a} \bullet \vec{b}) - \hat{x}(\vec{a} \bullet \vec{a})\\ \hat{x}(\vec{a} \bullet \vec{a}) &= (\vec{a} \bullet \vec{b})\\ \hat{x} &= \dfrac{\vec{a} \bullet \vec{b}}{\vec{a} \bullet \vec{a}} = \dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}} \end{align}\]

With \(\hat{x}\) in this form, we can calculate the projection matrix, \(P\), from \(\vec{p} = P\vec{b}\)

\[\begin{align} \vec{p} &= \vec{a}\dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}}\\ \vec{p} &= \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}}\vec{b} \implies P = \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}} && \text{because } \vec{p}= P\vec{b} \end{align}\]
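A small numerical sketch of the line-projection formula (assuming NumPy; the vectors \(\vec{a}\) and \(\vec{b}\) below are arbitrary examples, not taken from the text):

import numpy as np

a = np.array([[1.], [2.], [2.]])   # column vector defining the line we project onto
b = np.array([[3.], [4.], [5.]])   # the vector being projected

P = (a @ a.T) / (a.T @ a)          # P = a a^T / (a^T a)
p = P @ b                          # the projection of b onto the line through a
e = b - p                          # the error vector
print((a.T @ e).item())            # ~0.0: e is perpendicular to a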

Project Onto a Plane in \(\mathbb{R}^3\)

Note that \(\vec{a}^T\vec{a}\) in the denominator evaluates to a scalar. If we want to project onto a plane, the denominator becomes \(A^TA\), which evaluates to a matrix.

  • Division by a matrix has no meaning. You cannot divide by a matrix.
  • In arithmetic, we undo multiplication with division.
  • To undo the effects of a matrix multiplication, we multiply by the inverse of the matrix. \(A^{-1}A\vec{b}=\vec{b}\)
  • We must re-arrange the equation so that we use inverse matrices instead of division.

We can think of projecting onto a plane as projecting onto multiple vectors. To project \(\vec{b}\) onto \(A\), we are looking for the vector \(\hat{x}\), such that \(\vec{p}=A\hat{x}\), where \(\vec{p}\) is the point on the plane closest to \(\vec{b}\). The first step is to find the vector \(\hat{x}\).

As in the first example, we define the error vector, \(\vec{e}\), as the vector that goes from the plane to \(\vec{b}\)

\[\begin{equation*} \vec{e} = \vec{b} - A\hat{x} \end{equation*}\]

Assume \(A\) is a matrix made of two vectors, \(a_1\) and \(a_2\) in \(\mathbb{R}^3\):

\[\begin{equation} A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{bmatrix} \end{equation}\]

Our error vector, \(\vec{e}\), will be perpendicular to both \(\vec{a}_1\) and \(\vec{a}_2\). We want to find the closest point on the plane spanned by the columns of \(A\) to \(\vec{b}\). Setting \(\vec{e}\) perpendicular to each \(\vec{a}_n\) gives \(0 = \vec{a}_n^T\vec{e}\)

\[\begin{align} 0 &= \vec{a}_1^T(\vec{b} - A\hat{x})\\ 0 &= \vec{a}_2^T(\vec{b} - A\hat{x}) \end{align}\]

There is a simple way to write the equation that captures all components at once.

\(A^T(\vec{b}-A\hat{x}) = 0\)

Which can be written as

\(A^T\vec{b} = A^TA\hat{x}\)

Which can be solved for \(\hat{x}\)

\((A^TA)^{-1}A^T\vec{b} = \hat{x}\)

Now that we have the \(\hat{x}\) vector, we can find the projection matrix. Remember that \(\vec{p} = P\vec{b}\). If we can arrange the equation above correctly, it gives us \(P\).

\(\vec{p} = A\hat{x}\)

Substitute \(\hat{x}\)

\(\vec{p} = A(A^TA)^{-1}A^T\vec{b}\)

Now the equation is in the form \(\vec{p} = P\vec{b}\), so

\(P = A(A^TA)^{-1}A^T\)
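And the plane case in NumPy (the columns of \(A\) below are arbitrary example vectors in \(\mathbb{R}^3\)):

import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [0., 2.]])                    # the two columns span a plane in R^3
b = np.array([[6.], [0.], [0.]])

x_hat = np.linalg.inv(A.T @ A) @ A.T @ b    # x_hat = (A^T A)^-1 A^T b
P = A @ np.linalg.inv(A.T @ A) @ A.T        # P = A (A^T A)^-1 A^T
print(np.allclose(A @ x_hat, P @ b))        # True: p = A x_hat = P b
print(np.allclose(A.T @ (b - P @ b), 0))    # True: the error is perpendicular to both columns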