3.6. Chapter 6: Theoretical Linear Algebra#

Before Class:

During and After Class:

  • Take notes (on paper or a tablet computer)

  • Complete this notebook and submit your answers via Gradescope

import sys
if "google.colab" in sys.modules:
    # If running on colab, install ipympl
    !pip install -q ipympl

    # Enable 3rd party output/widgets
    from google.colab import output
    output.enable_custom_widget_manager()

# % means a "magic command" that is special
# for Jupyter notebooks.
# This specific command enables interactive plots
%matplotlib ipympl
# Here are our standard imports
import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg as la

3.6.1. Our First Eigendecomposition#

Calculate the eigendecomposition of

\[\begin{split} {\bf A}_0 = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 0 \\ 2 & -4 & 2 \end{bmatrix} \end{split}\]

This example is from pg. 302 - 303 of Savov (2020).

We start by writing the definition of an eigenvalue:

\[ {\bf A}_0 {\bf w}_i = \lambda_i {\bf w}_i \]
\[ \left({\bf A}_0 - \lambda_i {\bf I} \right) {\bf w}_i = {\bf 0} \]
\[ \text{det}\left({\bf A}_0 - \lambda {\bf I} \right) = 0 \]
\[\begin{split} \text{det}\left( \begin{bmatrix} 1 - \lambda & 2 & 0 \\ 0 & 3 - \lambda & 0 \\ 2 & -4 & 2 - \lambda \end{bmatrix} \right) = 0 \end{split}\]

Now let’s compute the determinant:

\[\begin{split} \begin{align*} \text{det}\left( \begin{bmatrix} 1 - \lambda & 2 & 0 \\ 0 & 3 - \lambda & 0 \\ 2 & -4 & 2 - \lambda \end{bmatrix} \right) &= (1 - \lambda) ~ \text{det} \left(\begin{bmatrix} 3 - \lambda & 0 \\ -4 & 2 - \lambda \end{bmatrix} \right) - 2 ~ \text{det} \left(\begin{bmatrix} 0 & 0 \\ 2 & 2 - \lambda \end{bmatrix}\right) + 0 ~ \text{det} \left(\begin{bmatrix} 0 & 3 - \lambda \\ 2 & -4 \end{bmatrix}\right) \\ &= (1-\lambda) \left( (3 - \lambda)(2 - \lambda) - 0 \cdot (-4) \right) - 2 \left(0 \cdot (2 - \lambda) - 0 \cdot 2 \right) + 0 \left( 0 \cdot (-4) - 2 (3 - \lambda) \right) \\ & = (1-\lambda) (3 - \lambda) (2 - \lambda) \end{align*} \end{split}\]

This gives the characteristic polynomial:

\[ \text{det}\left({\bf A}_0 - \lambda {\bf I} \right) = 0 = (1-\lambda) (3 - \lambda) (2 - \lambda) \]

By inspection, we see that the roots of the characteristic equation are \(\lambda = 3\), \(\lambda = 2\), and \(\lambda = 1\) (written from largest to smallest by convention). These are the eigenvalues. (Yes, this is a textbook example, hence the math is easy.)
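As a quick numerical check (a small sketch; the coefficient list below comes from expanding \((1-\lambda)(3-\lambda)(2-\lambda) = -\lambda^3 + 6\lambda^2 - 11\lambda + 6\)), we can recover the roots with np.roots:

# Roots of the characteristic polynomial -lambda^3 + 6 lambda^2 - 11 lambda + 6 = 0
print(np.roots([-1, 6, -11, 6]))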

Next, we need to calculate the eigenvector that corresponds to each eigenvalue. To do this, we need to calculate the null space of \(({\bf A}_0 - \lambda_i {\bf I})\) for each \(i\).

To calculate the null space, we need to convert \({\bf A}_0 - \lambda_i {\bf I}\) into reduced row echelon form. You can do this by hand of course, which would be excellent practice. We’ll use sympy to speed up this example.

from sympy import Matrix

def rref_eigensystem(eigval=1):
    ''' Calculate the reduced row echelon form of
    A0 - eigval * I
    for this specific example, i.e., A0 is hard coded
    into this function for simplicity

    Argument:
        eigval: eigenvalue

    '''

    # Define A0 - eigval * I
    A_ = Matrix([[1 - eigval, 2, 0], [0, 3 - eigval, 0], [2, -4, 2 - eigval]])
    # print("A - eigval I =\n", A_)

    # Calculate the reduced row echelon form
    # rref() returns a tuple: (RREF matrix, tuple of pivot column indices)
    rref_ = A_.rref()
    print("rref =")
    print(rref_)

rref_eigensystem(1)

Now, we need to go from RREF to the null space. Let’s use the RREF to write a system of linear equations:

\[\begin{split} \begin{bmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ \end{bmatrix} ~ \begin{bmatrix} x_1 \\ x_2 \\ t \end{bmatrix} = {\bf 0} \end{split}\]

Notice in the RREF matrix, there is no pivot (1 on the diagonal) for the column corresponding to \(x_3\). Thus, we replaced \(x_3\) with \(t\) to denote it is a free variable. From this linear system, we obtain:

\[ x_1 + \frac{t}{2} = 0, \qquad x_2 = 0, \qquad 0 = 0 \]

With a little algebra, we obtain:

\[ x_1 = \frac{-t}{2}, \quad x_2 = 0, \quad x_3 = t \]

Thus, after rescaling the free parameter \(t\), the solution to the linear system is:

\[\begin{split} {\bf x} = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} t, \quad \forall t \in \mathbb{R} \end{split}\]

Thus,

\[\begin{split} \text{null}({\bf A}_0 - 1 {\bf I}) = \text{span}\left( \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right) \end{split}\]

Let’s verify this with Python.

def null_eigen(eigval=1):
    ''' Calculate the null space of
    A0 - eigval * I
    for this specific example, i.e., A0 is hard coded
    into this function for simplicity

    Argument:
        eigval: eigenvalue

    '''
    
    A_ = np.array([[1 - eigval, 2, 0], [0, 3 - eigval, 0], [2, -4, 2 - eigval]])

    print("null space =\n", la.null_space(A_))

null_eigen(1)

Let’s compare this to our by-hand answer.

w3 = np.array([-1, 0, 2])
print("eigenvector 3 (unscaled) =", w3)

w3 = w3 / la.norm(w3)

print("eigenvector 3 (scaled) =", w3)

As homework, calculate the remaining two eigenvectors on paper and then verify with Python. In other words, repeat the procedure above. You may use Python to calculate the RREFs, but you should calculate the null spaces on paper.

# Verify the first eigenvector here (\lambda = 3)
# Verify the second eigenvector here (\lambda = 2)

3.6.2. More Eigendecompositions#

Calculate the eigendecomposition of the following matrices by hand and then verify with Python. Please complete the entire eigendecomposition calculation on paper, including finding the eigenvectors (RREF, null space). This is practice for the quiz.

\[\begin{split} {\bf A}_1 = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} \end{split}\]
def print_eigen(A):
    ''' Calculate eigendecomposition and print information

    '''

    print("A =\n", A)
    eigenvalues, eigenvectors = la.eig(A)
    print("eigenvalues = ", eigenvalues)
    print("eigenvectors =\n", eigenvectors)
    print("rank =", np.linalg.matrix_rank(A))
    print("condition number =", np.linalg.cond(A))

A = np.array([[2, 0], [0, -3]])
print_eigen(A)
\[\begin{split} {\bf A}_2 = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \end{split}\]
A2 = np.array([[1, 0], [1, 0]])
print_eigen(A2)
\[\begin{split} {\bf A}_3 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} \end{split}\]
A3 = np.array([[2, 1], [0, 2]])
print_eigen(A3)

3.6.3. Matrix Power#

Let’s say we want to compute

\[ {\bf A}^n = {\bf A A} ... {\bf A } \]

This requires \(n-1\) matrix multiplications, which is tedious for large \(n\).

Recall, we decompose a diagonalizable matrix into its eigenvectors and eigenvalues.

\[ {\bf A} = {\bf Q} {\bf \Lambda} {\bf Q}^{-1} \]

Here \({\bf Q}\) is the matrix of right eigenvectors and \({\bf \Lambda} = \text{diag}(\lambda_1, \lambda_2, ...)\). If \(\bf A\) is symmetric, then \({\bf Q}^T\) = \({\bf Q}^{-1}\).
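Here is a small sketch of that last claim, using an arbitrary symmetric matrix chosen for illustration (the names S_sym, lam_sym, and Q_sym exist only for this example):

# For a symmetric matrix, the eigenvectors are orthonormal, so Q^T Q = I
S_sym = np.array([[4.0, 1.0], [1.0, 2.0]])
lam_sym, Q_sym = la.eigh(S_sym)
print("Q.T @ Q =\n", Q_sym.T @ Q_sym)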

Now let’s substitute this into the matrix power:

\[ {\bf A}^n = \left({\bf Q} {\bf \Lambda} {\bf Q}^{-1} \right)^n = \left({\bf Q} {\bf \Lambda} {\bf Q}^{-1} \right) \left({\bf Q} {\bf \Lambda} {\bf Q}^{-1} \right) ... \left({\bf Q} {\bf \Lambda} {\bf Q}^{-1} \right) \]

Finally, we can simplify as \({\bf Q}^{-1} {\bf Q} = {\bf I}\)

\[ {\bf A}^n = {\bf Q} {\bf \Lambda}^n {\bf Q}^{-1} \]

This is easy to compute because \({\bf \Lambda}\) is a diagonal matrix, thus:

\[\begin{split} {\bf \Lambda}^n = \begin{bmatrix} (\lambda_1)^n & 0 & \dots \\ 0 & (\lambda_2)^n & \dots \\ \vdots & \vdots & \ddots \end{bmatrix} \end{split}\]
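Before working the exercise below by hand, here is a small numerical sketch of \({\bf A}^n = {\bf Q} {\bf \Lambda}^n {\bf Q}^{-1}\), using an arbitrary symmetric (hence diagonalizable) matrix B chosen for illustration:

# Hypothetical diagonalizable matrix for illustration
B = np.array([[2.0, 1.0], [1.0, 3.0]])
n = 4

# Eigendecomposition approach: B^n = Q Lambda^n Q^{-1}
lam_B, Q_B = la.eig(B)
Bn_eig = np.real(Q_B @ np.diag(lam_B**n) @ la.inv(Q_B))

# Repeated multiplication for comparison
Bn_direct = np.linalg.matrix_power(B, n)

print("Q Lambda^n Q^{-1} =\n", Bn_eig)
print("repeated multiplication =\n", Bn_direct)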

Calculate

\[\begin{split} \left( {\bf A}_3 \right)^3 = \left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} \right)^3 \end{split}\]

with pencil and paper two ways: simple matrix multiplication and with the eigendecomposition approach. (Hint: you already calculated the eigendecomposition.) Then, verify your calculation with Python.

# Add your solution here

3.6.4. Transformations as Projections onto the Eigenbasis#

Recall, we can represent a linear transformation \(T({\bf x})\) as the matrix-vector product \(\bf A x\). Let’s interpret the linear transformation \(\bf A\) using an eigendecomposition. This requires \({\bf A}\) to be a normal matrix, so that its eigenvectors form an orthonormal basis.

\[ \bf{A} = \sum_i \lambda_i {\bf w}_i {\bf w}_i^T \]

Here, \(\lambda_i\) are the eigenvalues and \({\bf w}_i\) are the corresponding eigenvectors.

\[ \bf{A x} = \sum_i \lambda_i \left( {\bf w}_i {\bf w}_i^T {\bf x} \right) \]

Also recall that dot products are commutative, i.e., \({\bf w}_i^T {\bf x} = {\bf x}^T {\bf w}_i = {\bf x} \cdot {\bf w}_i\).

\[ \bf{A x} = \sum_i \lambda_i \left( {\bf x} \cdot {\bf w}_i \right) {\bf w}_i \]

Recall that \({\bf w}_i\) are unit vectors, i.e., \(||{\bf w}_i|| = 1\) and rearrange:

\[ \bf{A x} = \sum_i \lambda_i \left( \frac{{\bf x} \cdot {\bf w}_i}{|| {\bf w}_i ||^2} {\bf w}_i \right) \]

This means we can interpret the matrix multiplication \(\bf{A x}\) as projecting \({\bf x}\) onto each eigenvector \({\bf w}_i\) and scaling by the eigenvalue \(\lambda_i\):

\[ \bf{A x} = \sum_i \lambda_i \left( \Pi_{{\bf w}_i} {\bf x} \right) \]

Let’s visualize this in 3D.

from mpl_toolkits.mplot3d import Axes3D

# Plot vectors
def plot_vector(vector, color, label, p = [0,0,0], linestyle="-"):
    ''' Plot a 3D vector
        Arguments:
            vector: np array
            color: 'r', 'b', 'g', etc.
            label: string for legend
            p: point at which to start the vector
            linestyle: matplotlib linestyle, e.g., "-" or "-."
    '''
    assert len(vector) == 3, "Must be a 3D vector"
    assert len(p) == 3, "p must have length 3"

    ax = plt.gca()
    ax.quiver(p[0], p[1], p[2], vector[0], vector[1], vector[2], color=color, arrow_length_ratio=0.1, label=label, linestyle=linestyle)
# Define a linear transformation (a symmetric, hence normal, matrix)
A = np.array([[1, 0, -0.5], [0, 1, 1], [-0.5, 1, 0.5]])
print("A\n=",A)

First, let’s verify \({\bf A}\) is normal, i.e., \({\bf{A}}^T{\bf A} = {\bf A}{\bf{A}}^T\). Recall, a normal matrix is also diagonalizable.

print("A.T@A=\n",A.T@A)
print("A@A.T=\n",A.T@A)
x = [1, 1, 1]

print("A =\n",A)
print("x =", x)
print("Ax =", A@x)

# Calculate eigendecomposition
l, w = la.eig(A)
print("eigenvalues = ",l)
print("eigenvectors =\n", w)

# Create an empty 3D plot
fig = plt.figure()
ax = fig.add_subplot(projection='3d')

colors = ['blue','red','green']

plot_vector(x, 'black', r'${\bf x}$')

# Plot eigenvectors
for i in range(3):
    plot_vector(w[:,i],colors[i],r'${\bf w}_'+str(i+1)+'$')

# Plot the projection of x onto each eigenvector
for i in range(3):
    proj = np.dot(x, w[:,i]) * w[:,i] / np.linalg.norm(w[:,i])**2
    plot_vector(proj, 'black', r'$\mathrm{proj.}~{\bf x}~\mathrm{onto}~{\bf w}_{'+str(i+1)+'}$', linestyle='-.')

plot_vector(A @ x, 'purple', r'${\bf A x}$')

# Set labels and limits
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.set_xlim([-2,2])
ax.set_ylim([-2,2])
ax.set_zlim([-2,2])

plt.legend(ncol=4)

3.6.5. SVD and Matrix Compression#

See Chapter 5: Linear Transformations for a quick introduction to SVD and its relationship to condition number.

How do we calculate the SVD of a matrix?

The left singular vectors are the eigenvectors of \({\bf A}{\bf A}^T\).

The right singular vectors are the eigenvectors of \({\bf A}^T{\bf A}\).

The nonzero eigenvalues \(\lambda_i\) of \({\bf A}{\bf A}^T\) and \({\bf A}^T{\bf A}\) are equal. Moreover, the singular values are \(\sigma_i = \sqrt{\lambda_i}\).
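Here is a small sketch of this relationship, using an arbitrary \(3 \times 2\) matrix C chosen for illustration:

# Hypothetical matrix for illustration
C = np.array([[3.0, 1.0], [1.0, 3.0], [0.0, 2.0]])

# Eigenvalues of C^T C (la.eigh returns them in ascending order)
evals_C, evecs_C = la.eigh(C.T @ C)
print("sqrt of eigenvalues of C.T @ C =", np.sqrt(evals_C[::-1]))

# Compare to the singular values from la.svd
_, S_C, _ = la.svd(C)
print("singular values from la.svd =", S_C)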

Thus, the linear transformation \(T_{A}({\bf x}) = {\bf A x}\) can be interpreted as a three-step process (see the sketch after this list):

  1. Convert \({\bf x}\) from the standard basis to the basis of the right singular vectors

  2. Scale the components by the singular values

  3. Convert the output from the basis of the left singular vectors to the standard basis
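Here is a small sketch of the three-step interpretation, using an arbitrary \(2 \times 2\) matrix D and vector chosen for illustration:

# Hypothetical matrix and vector for illustration
D = np.array([[2.0, 1.0], [0.0, 3.0]])
xd = np.array([1.0, -1.0])

Ud, Sd, Vhd = la.svd(D)

step1 = Vhd @ xd      # 1. coordinates of xd in the right singular basis
step2 = Sd * step1    # 2. scale each component by its singular value
step3 = Ud @ step2    # 3. map back to the standard basis with U
print("U Sigma V^T x =", step3)
print("D @ xd =", D @ xd)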

Finally, we can write a matrix using SVD as follows:

\[ {\bf A} = {\bf U \Sigma V}^T = \sum_{i} \sigma_i {\bf u}_i {\bf v}^{T}_i \]

Thus, we can compress \({\bf A}\) by truncating the summation. Let’s see this with an example.

# Disable the interactive plots
%matplotlib inline


# Import opencv
import cv2
  
# Import a picture of Prof. Dowling in grayscale
if "google.colab" in sys.modules:
    # Download
    !wget 'https://dowlinglab.nd.edu/assets/320333/dowling_alex_sq.jpg'
    image_location = 'dowling_alex_sq.jpg'
else:
    # Use local copy already in repo (no internet required)
    image_location = '../../media/dowling_alex_sq.jpg'

img = cv2.imread(image_location, 0)
print(img)
print(img.shape)
def show_image(img,title=None):
    ''' Plot grayscale image stored as a matrix
    '''

    plt.matshow(img, cmap='gray', vmin=0, vmax=255)
    plt.title(title)

show_image(img,'Original')
U, S, Vh = la.svd(img, full_matrices=True)

plt.semilogy(np.abs(S))
plt.xlabel("Index",fontsize=18)
plt.ylabel(r"$|~\sigma_i~|$",fontsize=18)
plt.show()
def compress_image(original_image, n=10):
    ''' Compress an image using a truncated SVD

    Arguments:
        original_image: matrix of grayscale pixel values
        n: number of singular values to keep

    Actions:
        Show compressed image

    '''

    # Singular value decomposition
    U, S, Vh = la.svd(original_image, full_matrices=True)


    # Compress matrix
    compressed = U[:,0:n] @ np.diag(S[0:n]) @ Vh[0:n,:]

    # Plot image
    show_image(compressed ,'n='+str(n))
# Using only 10 largest singular values (and corresponding vectors)
compress_image(img, 10)
# Using only 20 largest singular values (and corresponding vectors)
compress_image(img, 20)
# Using only 100 largest singular values (and corresponding vectors)
compress_image(img, 100)

3.6.5.1. SVD to Diagnose Linearly Dependent Equations#

A4 = np.array([[1, 2, 0], [0, 1, 1], [1, 0, -2]])
print("A4=\n",A4)

Notice that row 2 equals (row 0) minus 2 times (row 1). In other words, this system of equations is rank deficient.

np.linalg.matrix_rank(A4)

3.6.5.2. Calculate SVD#

U, S, Vh = la.svd(A4, full_matrices=True)

print("left singular vectors\nU=\n",U)
print("singular values\nS=\n",S)
print("right singular vectors (transposed)\nVh=\n",Vh)

3.6.5.3. Interpret left singular vector#

Let’s scale the left singular vector corresponding to the near-zero singular value.

# Original unit vector
print("unit vector",U[:,-1])
# Scale by the inverse of the first element of the last column of U
print("scaled vector",U[:,-1] / U[0,-1])

This tells us that 1 times (row 0) plus -2 times (row 1) plus -1 times (row 2) equals 0. Let’s verify.

1*A4[0,:] - 2*A4[1,:] - 1*A4[2,:]

Take Away: The left singular vector corresponding to a near-zero singular value tells us how to add the rows of a matrix together to achieve a rank deficiency. Thus, SVD helps us identify the linearly dependent equations.
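Equivalently (a quick sketch using the matrices already computed above), the unscaled left singular vector should send the rows of \({\bf A}_4\) to numerical zero, since \({\bf u}_i^T {\bf A} = \sigma_i {\bf v}_i^T\) and \(\sigma_i \approx 0\):

# u^T A4 should be numerically zero for the near-zero singular value
print(U[:, -1] @ A4)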

3.6.5.4. Interpret right singular vector#

Let’s scale the right singular vector corresponding to the near-zero singular value.

# Original unit vector
print("unit vector",Vh[-1,:])
# Scale by the inverse of the lower right corner of Vh
print("scaled vector",Vh[-1,:] / Vh[-1,-1])

This tells us that 2 times (column 0) plus -1 times (column 1) plus 1 times (column 2) equals 0. Let’s verify.

2*A4[:,0] - 1*A4[:,1] + 1*A4[:,2]

Take Away: The right singular vector corresponding to a near-zero singular value tells us how to add the columns of a matrix together to achieve a rank deficiency. SVD is a very helpful tool for finding modeling mistakes.
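Equivalently (a quick sketch), \({\bf A}_4\) times the unscaled right singular vector should be numerically zero, since \({\bf A} {\bf v}_i = \sigma_i {\bf u}_i\) and \(\sigma_i \approx 0\):

# A4 @ v should be numerically zero for the near-zero singular value
print(A4 @ Vh[-1, :])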

3.6.5.5. Homework#

Create your own example below to show how SVD helps diagnose a linearly dependent system of equations. Specifically:

  1. Propose a linearly dependent system of equations (as an \(\bf A\) matrix).

  2. Perform SVD

  3. Interpret the left and right singular vectors corresponding to the near-zero singular value.

Everyone should submit a unique \(\bf A\) matrix.

# Define and print A
# Calculate SVD and print components
# Rescale left singular vector

Your Interpretation Here: [replace with 1 sentence]

# Rescale right singular vector

Your Interpretation Here: [replace with 1 sentence]