Calculating Fibonacci Numbers with Matrices and Linear Algebra

In this post we solve the Fibonacci sequence using linear algebra. The Fibonacci numbers are defined by a second-order difference equation, which in turn defines a particular type of sequence.

Sequences

A sequence is an ordered (possibly infinite) list of numbers. For example:

\[ a_n = \frac{1}{n}, \textit{ for } n = 1, 2, 3, ..., \]

Difference Equations

A difference equation defines a sequence: it expresses each value of the sequence relative to the value of another point in the sequence. For example:

\[ x_{n+1} = 1.5x_n, \textit{ for } n = 0, 1, 2, ..., \]

To evaluate this equation at \(n=3\) we need to know the initial value \(x_0\), and evaluate \(n=1, 2, 3\) in succession. For example, if \(x_0=100\):

\[\begin{align} x_0 & && = 100 \\ x_1 & = 1.5 \times 100 && = 150\\ x_2 & = 1.5 \times 150 && = 225\\ x_3 & = 1.5 \times 225 && = 337.5 \end{align}\]

In the example above, we evaluate the equation recursively. It would be better to be able to evaluate the equation at \(n = 100\) without performing \(100\) calculations. Note that we can also write the calculation above in this form:

\[ x_3 = 100 \times 1.5 \times 1.5 \times 1.5 = 337.5 \]

Solving a difference equation means rewriting the equation such that it can be evaluated at a given \(n\) without first evaluating \(n = 1, 2, \ldots\) The solution to the equation above is:

\[ x_{n} = 1.5^{n} \times x_0, \textit{ for } n = 0, 1, 2, ..., \]
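
A minimal sketch in Python (the function names are ours, for illustration) makes the difference concrete; the closed form skips the intermediate steps:

x_0 = 100

def x_recursive(n):
    # evaluate x_1, x_2, ..., x_n in succession
    x = x_0
    for _ in range(n):
        x = 1.5 * x
    return x

def x_closed(n):
    # evaluate the solved equation directly at n
    return 1.5 ** n * x_0

x_recursive(3), x_closed(3)
(337.5, 337.5)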

Second Order Difference Equations

The previous example is a first order difference equation because it expresses the value of \(x_n\) as a function of one other sample, \(x_{n-1}\). A second order difference equation is a function of two other samples, for example

\[ x_{n+2} = 1.5x_n + 1.4x_{n+1}, \textit{ for } n = 0, 1, 2, ..., \]

Fibonacci Numbers

The easiest way to describe the Fibonacci sequence is with a second-order difference equation. Assume \(F_0 = 0\) and \(F_1 = 1\):

\[ F_{k+2} = F_{k+1} + F_{k} \]

The first eight Fibonacci numbers are:

\[ F_k=0,1,1,2,3,5,8,13,... \]

We can solve the sequence with linear algebra. Let us start our sequence with a vector, \(u_0=\begin{bmatrix}1\\0\end{bmatrix}\), and find the next vector in our sequence, \(u_1\), like this:

\[ u_{1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

More generally:

\[ u_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} u_k \]

Note that \(u_0\) contains \(\begin{bmatrix}F_1\\F_0\end{bmatrix}\).

The key to solving the sequence is in the eigenvalues and eigenvectors of \(\begin{bmatrix}1&1\\1&0\end{bmatrix}\). We can write \(u_3\) as

\[ u_3 = \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix} = \begin{bmatrix}3 \\ 2\end{bmatrix} = \begin{bmatrix}F_4 \\ F_3\end{bmatrix} \]

import numpy as np
fib = np.matrix([[1,1],[1,0]])  # the step matrix A
u_0 = np.matrix([[1],[0]])      # u_0 contains [F_1, F_0]

fib * fib * fib * u_0  # the * operator is matrix multiplication for np.matrix (use @ for np.array)
matrix([[3],
        [2]])

Eigenvector Solution

The general case of the previous example is:

\[ \begin{align} u_k &= \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix}^k\begin{bmatrix}1 \\ 0\end{bmatrix}\\ u_k &= A^ku_0 \end{align} \]

We can use eigenvectors and eigenvalues to solve the matrix power (see Eigenvalues and Eigenvectors). First, find the eigenvalues of \(A\):

\[ \begin{align} \begin{vmatrix} 1 - \lambda & 1\\ 1 & - \lambda \end{vmatrix} & = 0\\ -\lambda(1-\lambda) - 1 & = 0\\ \lambda{}^2 - \lambda - 1 & = 0 \end{align} \]

Solve using the quadratic formula \(\frac{-b \pm \sqrt{b^2-4ac}}{2a} \implies \frac{1 \pm \sqrt{5}}{2} \approx \frac{1 \pm 2.236}{2}\), which gives the eigenvalues for our matrix: \[ \begin{align} \lambda_1 & \approx 1.618 \\ \lambda_2 & \approx -0.618 \end{align} \]

Now we plug our constant eigenvalues (\(\lambda_1\) and \(\lambda_2\)) into \((A-\lambda I)\vec{x}=0\) to find the eigenvectors.

\[ \begin{bmatrix}1-\lambda_1 & 1 \\ 1 & -\lambda_1 \end{bmatrix}\vec{x}_1 = 0 \implies \vec{x}_1 = \begin{bmatrix}\lambda_1 \\ 1\end{bmatrix} \]

Note that this solution is not necessarily intuitive: \(\begin{bmatrix}\lambda \\ 1\end{bmatrix}\) is not a valid solution for an arbitrary value of \(\lambda\); it works because our values, \(\lambda_1\) and \(\lambda_2\), satisfy \(\lambda^2 - \lambda - 1 = 0\). We can use the same process to solve for \(\lambda_2\), revealing eigenvectors \(\begin{bmatrix}\lambda_1 \\1\end{bmatrix}\) and \(\begin{bmatrix}\lambda_2 \\1\end{bmatrix}\). Now create the eigenvector matrix, \(S\), and eigenvalue matrix, \(\Lambda\).

\[ \begin{align} S &= \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}\\ \Lambda &= \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix} \end{align} \]

Remember that \(A^k = S\Lambda^kS^{-1}\). To proceed, invert \(S\). Remember this shortcut for inverting a \(2 \times 2\) matrix.

\[ \begin{align} \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} =&& \frac{1}{ad-cb} &\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \\ S^{-1} =&& &\begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix}^{-1} \\ S^{-1} =&& \frac{1}{\lambda_1-\lambda_2} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}\\ S^{-1} =&& \tfrac{1}{\sqrt{5}} &\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \end{align} \]

Using the components \((A, \Lambda, S, S^{-1},u_0)\):

\[ \begin{align} u_k &= A^ku_0 \\ u_k &= S\Lambda^kS^{-1}u_0 \\ u_k &= S\Lambda^k\frac{1}{\sqrt{5}}\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \frac{1}{\sqrt{5}}S\Lambda^k\begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix}u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{bmatrix}^k \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} u_0 \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 & -\lambda_2\\ -1 & \lambda_1 \end{bmatrix} \begin{bmatrix}1\\0\end{bmatrix} \\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k & 0 \\ 0 & \lambda_2^k\end{bmatrix} \begin{bmatrix}1 \\ -1 \end{bmatrix}\\ u_k &= \tfrac{1}{\sqrt{5}} \begin{bmatrix}\lambda_1 & \lambda_2 \\ 1 & 1\end{bmatrix} \begin{bmatrix}\lambda_1^k \\ -\lambda_2^k\end{bmatrix} \end{align} \]
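
Reading off the second component of \(u_k = \begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\) gives a closed-form expression for the \(k\)-th Fibonacci number (this is Binet's formula):

\[ F_k = \frac{\lambda_1^k - \lambda_2^k}{\sqrt{5}} \]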

Verify Solution in Python

import numpy as np

lmbd_1 = (1+np.sqrt(5))/2
lmbd_2 = (1-np.sqrt(5))/2

a       = np.matrix([[1,      1],      [1,  0]])
s       = np.matrix([[lmbd_1, lmbd_2], [1,  1]])
s_inv   = np.matrix([[1,     -lmbd_2], [-1, lmbd_1]]) * (1/np.sqrt(5))
eig_val = np.matrix([[lmbd_1, 0],      [0,  lmbd_2]])
u_0     = np.matrix([[1],              [0]])

def eig_k(k):
    # Lambda^k: for np.matrix, ** is matrix power (here it just powers the diagonal)
    return eig_val ** k

def u_k(k):
    # u_k = S Lambda^k S^{-1} u_0
    return s * eig_k(k) * s_inv * u_0

def fib(n):
    # F_n is the second component of u_n; round away floating-point error
    f_n = u_k(n)[1,0]
    return int(round(f_n))

[fib(n) for n in range(15)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
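
One caveat: the eigenvalue solution runs in floating point, so the rounding in fib must eventually fail once \(F_n\) outgrows the precision of a double. A quick sketch of a check against exact integer iteration (the exact cutoff depends on the platform):

def fib_exact(n):
    # iterate with Python's arbitrary-precision integers
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# the first n where the floating-point solution drifts from the exact value
next(n for n in range(1000) if fib(n) != fib_exact(n))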

Eigenvalues and Eigenvectors

All \(n\) by \(n\) matrices have \(n\) eigenvalues, counted with multiplicity (some may be complex).

Assume:

  • \(\lambda\) is our eigenvalue (it is also a scalar)
  • \(\vec{x}\) is our eigenvector
  • \(A\) is an \(n\) by \(n\) matrix.

\[\begin{align*} A\vec{x} &= \lambda\vec{x}\\ A\vec{x} - \lambda\vec{x} &= 0\\ (A - \lambda I)\vec{x} &= 0 \end{align*}\]

Eigenvectors are defined as non-zero, so we are not interested in the case when \(\vec{x} = 0\).

\(\vec{x}\) is some non-zero vector in the nullspace of \((A - \lambda{}I)\). If \((A - \lambda{}I)\) has vectors other than \(0\) in the nullspace, it must be singular.

We can find the singular matrices with \[det(A-\lambda I) = 0\]

Example of Finding the Eigenvectors

Start by finding the eigenvalues for \(A=\begin{bmatrix}3 & 1 \\1 & 3\end{bmatrix}\)

\(A\) is set up (it is symmetric) so that the eigenvalues will be real numbers.

\[ \begin{align*} 0 &= det(A-\lambda{}I)\\ 0 &= \begin{vmatrix} 3-\lambda & 1 \\ 1 & 3 - \lambda \end{vmatrix}\\ 0 &=(3-\lambda{})^2-1\\ 0 &= 9 - 6\lambda{} + \lambda{}^2 - 1\\ 0 &= \lambda{}^2 - 6\lambda + 8\\ 0 &= (\lambda - 4)(\lambda - 2) \end{align*} \]

So \(\lambda{}_1=4\) and \(\lambda{}_2=2\). Now we can plug each \(\lambda\) into \((A-\lambda{}I)\):

\[\begin{align*} \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} &= \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \\ \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} &= \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \\ \end{align*}\]

And solve \((A-\lambda{}I)\vec{x}=0\) for each \(\lambda\):

\[\begin{align*} \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\vec{x}_1=0 &\implies \vec{x}_1= \begin{bmatrix}1\\1\end{bmatrix}\\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\vec{x}_2=0 &\implies \vec{x}_2= \begin{bmatrix}1\\-1\end{bmatrix} \end{align*}\]
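
We can spot-check this with numpy. np.linalg.eig returns unit-length eigenvectors in no guaranteed order, so expect scaled versions of ours:

import numpy as np

vals, vecs = np.linalg.eig(np.array([[3, 1], [1, 3]]))
vals  # the eigenvalues 4 and 2
vecs  # columns proportional to [1, 1] and [1, -1], scaled to unit length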

Example of a Degenerate Matrix

Notice that the eigenvectors in the first example are independent. Not all matrices have independent eigenvectors.

\[ A = \begin{bmatrix}3 & 1 \\ 0 & 3\end{bmatrix} \]

We can read the eigenvalues directly off a triangular matrix. To see how, try finding \(det(A-\lambda{}I)=0\):

\[\begin{equation*} \begin{vmatrix} 3-\lambda & 1 \\ 0 & 3-\lambda \end{vmatrix}=0 \end{equation*}\]

Remember, the determinant of a triangular matrix is the product down the diagonal.

\((3-\lambda )(3-\lambda )=0 \implies \lambda{}_1 = 3, \lambda{}_2 = 3\)

Repeated eigenvalues are not a problem. The problem comes when we try to solve \((A-\lambda{}I)\vec{x}=0\)

\[\begin{equation*} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \vec{x} = 0 \end{equation*}\]

There is only one independent solution: \(\vec{x}=\begin{bmatrix}1\\0\end{bmatrix}\). We cannot diagonalize \(A\).
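
A quick numpy sketch shows the shortage: eig still returns two eigenvector columns, but they are (numerically) parallel:

import numpy as np

vals, vecs = np.linalg.eig(np.array([[3, 1], [0, 3]]))
vals  # the repeated eigenvalue: 3 and 3
vecs  # both columns are essentially [1, 0]; there is no second independent eigenvector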

Diagonalization \(S^{-1}AS= \Lambda{}\)

Assume we take our \(n\) linearly independent eigenvectors of \(A\) and place them in the columns of \(S\):

\[\begin{equation*} S = \begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} \end{equation*}\]

What happens when we take \(AS\)?

\[\begin{equation*} AS = A\begin{bmatrix} & & \\ \vec{x}_1 & \cdots & \vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} \end{equation*}\]

Remember that each column of \(AS\) is a linear combination of the columns of \(A\). Because \(\vec{x}_1\) is an eigenvector, the first column of \(AS\) is going to be \(\lambda{}_1\vec{x}_1\), and the subsequent columns follow the same pattern.

Now we want to factor the \(\lambda{}\)'s out of \(AS\).

\[\begin{equation*} AS = \begin{bmatrix} & & \\ \lambda{}_1\vec{x}_1 & \cdots & \lambda{}_n\vec{x}_n \\ & & \end{bmatrix} = \begin{bmatrix} & & \\ \vec{x}_1 && \cdots && \vec{x}_n \\ & & \end{bmatrix} \begin{bmatrix} \lambda{}_1 & & \\ & \ddots & \\ & & \lambda{}_n \end{bmatrix} \end{equation*}\]

We will call this last diagonal matrix \(\Lambda\), and we can now say \(AS=S\Lambda\), which gives us the following two equations:

\[S^{-1}AS = \Lambda\] \[S\Lambda{}S^{-1}=A\]

Remember: We can only invert \(S\) if we have \(n\) independent eigenvectors.

Which gives us the most sought-after equation:

\[ A^2= S\Lambda{}S^{-1}S\Lambda{}S^{-1}=S\Lambda{}^2S^{-1} \implies A^k=S\Lambda{}^kS^{-1} \]

Theorem: \(A^k \rightarrow 0\) as \(k \rightarrow \infty\) if all \(|\lambda{}_i| < 1\)

Eigenvalues, Eigenvectors of \(A^2\)

If we have \(A\) with eigenvalues \(\lambda{}_1 \ldots \lambda{}_n\):

  • The eigenvalues of \(A^2\) are \((\lambda{}_1)^2 \ldots (\lambda{}_n)^2\)
  • The eigenvectors of \(A^2\) are the same as the eigenvectors of \(A\)

Said another way:

If \(A\vec{x} = \lambda \vec{x}\) then \(A^2\vec{x} = \lambda A\vec{x} = \lambda{}^2\vec{x}\)

Facts

  • The sum of the \(n\) eigenvalues is equal to the trace of the matrix (the sum down the diagonal).
  • The product of the \(n\) eigenvalues (counted with multiplicity) is equal to the determinant (see the sketch after this list).
  • A triangular matrix, \(U\), has its eigenvalues along the diagonal - they are the pivots.
  • \(A\) has one or more eigenvalues \(\lambda = 0\) exactly when \(A\) is singular.
  • We can multiply eigenvectors by any non-zero constant, and \(A\vec{x} = \lambda \vec{x}\) will remain true.
  • Symmetric matrices always have real eigenvalues.
  • Elimination changes eigenvalues.
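
A minimal numpy sketch checking the trace and determinant facts on the symmetric example from earlier:

import numpy as np

a = np.array([[3, 1], [1, 3]])
vals = np.linalg.eigvals(a)
vals.sum(), np.trace(a)        # both 6: the sum of the eigenvalues equals the trace
vals.prod(), np.linalg.det(a)  # both 8 (up to rounding): the product equals the determinant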

Facts About Diagonalizing

  • A matrix with fewer than \(n\) independent eigenvectors cannot be diagonalized.
  • \(A\) is sure to have \(n\) independent eigenvectors (and be diagonalizable) if all the \(\lambda{}\)'s are different (no repeated \(\lambda{}\)s).
  • If \(A\) has repeated \(\lambda\)s, it may or may not be diagonalizable.

Calculate the Nullspace of a Matrix

We want to find the nullspace of \(A\)

\[\begin{equation} A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{bmatrix} \end{equation}\]

\(A\) and \(U\) have the same nullspace. Elimination brings us to \(U\)

\[\begin{equation*} A \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & -1 & -2 & -3 \end{bmatrix} \implies \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} =U \end{equation*}\]

We identify our pivot columns and our free columns. The pivot columns, corresponding to \(x_1\) and \(x_2\), contain our pivots. The other columns, corresponding to \(x_3\) and \(x_4\), are free. Now we have two choices: back substitution, or solving by row reduced echelon form.

Back Substitution

For each free variable, \(x_n\):

  1. Set that variable to 1
  2. Set all the other free variables to 0
  3. Solve all equations for 0 with back substitution

The upper triangular matrix above, \(U\), becomes:

\[\begin{align} x_1 &+& x_2 &+& x_3 &+& x_4 &= 0 \\ && x_2 &+& 2x_3 &+& 3x_4 &= 0 \\ && && && 0 &= 0 \end{align}\]

Do back substitution with \(x_3 = 1\) and \(x_4 = 0\):

\[\begin{align} 0 &= x_2 + 2 &\implies& x_2 = -2 \\ 0 &= x_1 -2 + 1 &\implies& x_1 = 1 \end{align}\]

\[\begin{equation*} A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0 \end{equation*}\]

Do back substitution with \(x_3 = 0\) and \(x_4 = 1\):

\[\begin{align} 0 &= x_2 + 3 &\implies& x_2 = -3 \\ 0 &= x_1 -3 + 1 &\implies& x_1 = 2 \end{align}\]

\[\begin{equation*} A \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = 0 \end{equation*}\]

One way to write the complete solution:

\[\begin{equation*} x_3 \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_3 + 2x_4 \\ -2x_3 - 3x_4 \\ x_3 \\ x_4 \end{bmatrix} \end{equation*}\]
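
As a cross-check, sympy (assuming it is installed) computes the same special solutions:

from sympy import Matrix

A = Matrix([[1, 1, 1, 1],
            [1, 2, 3, 4],
            [4, 3, 2, 1]])
A.nullspace()  # two basis vectors, matching [1, -2, 1, 0] and [2, -3, 0, 1]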

Row Reduced Echelon Form

However, it is quicker to “complete” elimination, and find the solution from rref. \(A\), \(U\), and \(R\) have the same nullspace. Our example matrix, \(A\), is very close to rref already (the pivots are already 1).

\[\begin{equation*} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \implies \begin{bmatrix} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{equation*}\]

Notice how, if we set \(x_3 = 1\) and \(x_4 = 0\), we can read the values of \(x_1\) and \(x_2\) from the third column (if we reverse the sign):

\(A \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0\end{bmatrix} = 0\)
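
sympy can compute the rref directly as well (a sketch, again assuming sympy):

from sympy import Matrix

A = Matrix([[1, 1, 1, 1],
            [1, 2, 3, 4],
            [4, 3, 2, 1]])
A.rref()  # the reduced matrix above, plus the pivot column indices (0, 1)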

You cannot do back substitution on a matrix in rref; this yields incorrect results.

Projections and Projection Matrices

As a simple example, we project \(\vec{b}\) onto a line through \(\vec{a}\). First, use the error vector \(\vec{e} = \vec{b} - \vec{p}\) to calculate \(\hat{x}\) and \(\vec{p}\).

\[\begin{align} 0 &= \vec{e} \bullet{}\vec{a} && \vec{e} \perp \vec{a} \text{ so the dot product is zero}\\ 0 &= (\vec{b}-\hat{x}\vec{a})\bullet{}\vec{a} && \text{notice } \vec{e} = \vec{b}-\vec{p} = \vec{b}-\hat{x}\vec{a}\\ 0 &= \vec{b}\bullet \vec{a} - \hat{x}\vec{a} \bullet \vec{a} && \text{distribute } \vec{a}\\ 0 &= (\vec{b} \bullet \vec{a}) - \hat{x} (\vec{a} \bullet \vec{a}) && \hat{x} \text{ is a scalar, so it factors out of the dot product}\\ \hat{x} (\vec{a} \bullet \vec{a}) &= (\vec{b} \bullet \vec{a})\\ \hat{x} &= \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \implies \vec{p} = \dfrac{\vec{b} \bullet \vec{a}}{\vec{a} \bullet \vec{a}} \vec{a} && \text{because } \vec{p} = \hat{x}\vec{a} \end{align}\]

We can summarize the projection with a matrix: \(\vec{p} = P\vec{b}\). \(P\) is easy to find if we write the equation the other way around. Notice that with dot products the order is inconsequential, because \(\vec{a} \bullet \vec{b} = \vec{b} \bullet \vec{a}\)

\[\begin{align} 0 &= \vec{a} \bullet \vec{e} && \vec{a}, \vec{e} \text{ on opposite sides}\\ 0 &= (\vec{a} \bullet \vec{b}) - \hat{x}(\vec{a} \bullet \vec{a})\\ \hat{x}(\vec{a} \bullet \vec{a}) &= (\vec{a} \bullet \vec{b})\\ \hat{x} &= \dfrac{\vec{a} \bullet \vec{b}}{\vec{a} \bullet \vec{a}} = \dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}} \end{align}\]

With \(\hat{x}\) in this form, we can calculate the projection matrix, \(P\), from \(\vec{p} = P\vec{b}\)

\[\begin{align} \vec{p} &= \vec{a}\dfrac{\vec{a}^T\vec{b}}{\vec{a}^T\vec{a}}\\ \vec{p} &= \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}}\vec{b} \implies P = \dfrac{\vec{a}\vec{a}^T}{\vec{a}^T\vec{a}} && \text{because } \vec{p}= P\vec{b} \end{align}\]
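
A minimal numpy sketch of the line projection (the vectors here are arbitrary examples):

import numpy as np

a = np.array([1.0, 2.0, 2.0])  # direction of the line (arbitrary example)
b = np.array([3.0, 0.0, 0.0])  # the vector we project
P = np.outer(a, a) / (a @ a)   # P = a a^T / (a^T a)
p = P @ b                      # the projection of b onto the line through a
a @ (b - p)                    # the error e = b - p is perpendicular to a: ~0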

Project Onto a Plane in \(\mathbb{R}^3\)

Note that \(\vec{a}^T\vec{a}\) in the denominator evaluates to a scalar. If we want to project onto a plane, the denominator becomes \(A^TA\), which evaluates to a matrix.

  • Division by a matrix has no meaning. You cannot divide by a matrix.
  • In arithmetic, we undo multiplication with division.
  • To undo the effects of a matrix multiplication, we multiply by the inverse of the matrix. \(A^{-1}A\vec{b}=\vec{b}\)
  • We must re-arrange the equation so that we use inverse matrices instead of division.

We can think of projecting onto a plane as projecting onto multiple vectors. To project \(\vec{b}\) onto \(A\), we are looking for the vector \(\hat{x}\), such that \(\vec{p}=A\hat{x}\), where \(\vec{p}\) is the point on the plane closest to \(\vec{b}\). The first step is to find the vector \(\hat{x}\).

As in the first example, we define the error vector, \(\vec{e}\), as the vector that goes from the plane to \(\vec{b}\)

\[\begin{equation*} \vec{e} = \vec{b} - A\hat{x} \end{equation*}\]

Assume \(A\) is a matrix made of two vectors, \(a_1\) and \(a_2\) in \(\mathbb{R}^3\):

\[\begin{equation} A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{bmatrix} \end{equation}\]

Our error vector, \(\vec{e}\), will be perpendicular to both \(a_1\) and \(a_2\). We want to find the closest point on the plane to \(\vec{b}\). Set \(\vec{e}\) perpendicular to each column: \(0 = \vec{a}_n^T\vec{e}\)

\[\begin{align} 0 &= \vec{a}_1^T(\vec{b} - A\hat{x})\\ 0 &= \vec{a}_2^T(\vec{b} - A\hat{x}) \end{align}\]

There is a simple way to write the equation that captures both components at once.

\(A^T(\vec{b}-A\hat{x}) = 0\)

Which can be written as

\(A^T\vec{b} = A^TA\hat{x}\)

Which can be solved for \(\hat{x}\)

\((A^TA)^{-1}A^T\vec{b} = \hat{x}\)

Now that we have the \(\hat{x}\) vector, we can find the projection matrix. Remember that \(\vec{p} = P\vec{b}\). If we can arrange the equation above correctly, it gives us \(P\).

\(\vec{p} = A\hat{x}\)

Substitute \(\hat{x}\)

\(\vec{p} = A(A^TA)^{-1}A^T\vec{b}\)

Now the equation is in the form \(\vec{p} = P\vec{b}\), so

\(P = A(A^TA)^{-1}A^T\)
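
A numpy sketch of the whole derivation, with an arbitrary example \(A\) and \(\vec{b}\):

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])  # two independent columns spanning a plane in R^3
b = np.array([6.0, 0.0, 0.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)  # solve A^T A x_hat = A^T b
P = A @ np.linalg.inv(A.T @ A) @ A.T       # P = A (A^T A)^{-1} A^T
p = A @ x_hat                              # p = A x_hat, identical to P @ b
A.T @ (b - p)                              # ~[0, 0]: the error is perpendicular to both columns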

Getting Npm, Anaconda, Jupyter, and Brew to Play Together

I chose to use Anaconda to manage my Python installation over the OSX stock Python 2.7 or brew’d Python. This makes it easy to switch between Python 2 and Python 3, and it works well with Project Jupyter.

I would like to default to Python 3.4.3, because that works best with IPython notebooks in Project Jupyter. But npm packages often use node-gyp to build, and node-gyp is not compatible with Python 3. When I tried to install the npm fsevents package, I got the following error:

gyp ERR! configure error 
gyp ERR! stack Error: Python executable "python" is v3.4.3, which is not supported by gyp.
gyp ERR! stack You can pass the --python switch to point to Python >= v2.5.0 & < 3.0.0.
gyp ERR! stack     at failPythonVersion (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:108:14)
...

It is possible to specify a Python version when installing: npm install --python=python2.7. We can also configure npm to use a specific version of Python (stackoverflow):

$ npm config set python python2.7

This solved the node-gyp problem, but created another problem. By default, npm modules installed with the -g flag are installed to /usr/local. You can check where global npm modules are stored with

$ npm config get prefix

I usually just use sudo when installing global npm modules in OSX. However, for some reason, when I sudo npm install -g with npm configured to use Python 2.7, Python ran as a different user without admin privileges, and failed. (Does anyone know why?)

It is safe to use sudo when npm installing, but I would prefer to sudo as little as possible. To solve the problem, I configured npm to install global packages to a directory that does not require admin privileges.

It is not difficult to configure npm to use a custom path for storing global modules (stackoverflow).

$ npm config set prefix '~/.npm-packages'

Then add the following to .bashrc (linux) or .bash_profile (osx):

export PATH="$PATH:$HOME/.npm-packages/bin"

This allows us to npm install -g without sudo. It also allows Anaconda to manage python installations, giving us Python 3.4.3 by default in the terminal.

Praise for The Name of the Wind

I love escaping into a fantasy world. At best, reading a fantasy novel is a vacation, a waking dream, a story you become a part of. After I finished reading through all 5 books in A Song of Ice and Fire for a second time, I started looking to scratch the Fantasy/SciFi itch. However, I often find fantasy novels frustrating. For example…

  • The Hobbit is amazing, but I’ve read it 18 times. LOTR is acceptable, but I finished it recently.
  • Neal Stephenson is pretty good, but his stories are about caricatures and ideas, not characters and relationships.
  • The Belgariad series was enjoyable, but felt juvenile - every character is obviously either good or bad. Nothing bad ever happens to any of the good characters. The obvious good guys “win” every single encounter.
  • I started Eye of the World on three separate occasions. I could tolerate the campiness, but it moved just a little too slowly, and I lost interest every time.

I started reading Wizard’s First Rule, the first book in The Sword of Truth series. But after the first ~5 chapters I gave up, frustrated.

In the best stories, I believe the characters and their actions as much as I believe my own personal relationships. A convincing cast pulls me in to the story better than anything else.

However, most fantasy stories do not have convincing characters. The most offensively unconvincing characters are love interests.

Let me outline the plot at the beginning of Wizard’s First Rule

Our hero lives on the outskirts of a small remote village. While investigating the mysterious sudden murder of his parents, he notices a beautiful female traveler walking in the wilderness alone. He also notices four sinister men who are surreptitiously stalking the traveler. Our hero meets the girl and helps her avoid, and eventually defeat, her stalkers.

It turns out that the girl is actually a highly intelligent and powerful magician (not quite powerful enough to defeat the four men by herself), and while she maintains a tough exterior, she has been through traumatizing experiences, and occasionally needs the dependable kindness and emotional support that only our hero can provide. Our hero finds her quite attractive, but comes to consider her his friend, and having a kind and understanding disposition would never try to take advantage of her occasional weakened state.

I am supposed to relate to the main character. This is preying on the fantasy that an attractive and powerful girl will see what a good, wise friend I can be, open up to me for emotional and physical support, eventually leading to romantic, and finally sexual interactions… She comes to me (and only me) and opens herself to me because of my humility, wisdom, patience, and overall goodness. This story that plays out in the first few chapters of Wizard’s First Rule is unrealistic and aggravating. It’s an angle that’s overused and abused. I suspect it’s also alienating to non-male readers.

Why is this same story played out in so many other fantasy novels? As I was approaching the last chapter of Hyperion, I thought the story would finish inoffensively, but then the conclusion of the entire story depended completely on an improbable and unconvincing love story not unlike the one in Wizard’s First Rule.

I concede that not every story needs to be written for all genders. They are fantasy novels after all. Romance novels (for example) are often written and marketed to a female audience. I imagine unrealistic male love interests are common in romance novels.

I’ve been reading The Name of the Wind by Patrick Rothfuss. It’s wonderful. It’s imperfect. It’s like being in the other world. It’s like being drunk. While I’m reading, the current world ceases to exist. I stop noticing words on the page, and the world in the story materializes in front of me. Best of all, this is a novel that gets the love story right. How? Well, the relationship is dysfunctional, doomed, and convincing. The protagonist is in his early teens, awkward, crushed, and incompetent. There are specific things he does that I relate to doing when I was his age, or in my first relationship. Awkwardness, competing interests, and ignored instincts all get in the way of the relationship. Perhaps it’s because I’ve been there that I relate to it so strongly. Perhaps nearly everyone has.

I am optimistic there are more fantasy novels like this one out there waiting to be read.