4.4. LU Decomposition#
4.4.1. Learning Objectives#
After studying this notebook, completing the activities, and asking questions in class, you should be able to:
Explain to a classmate how Gaussian elimination and LU factorization are similar and different
In Python, use LU factorization to solve a linear system of equations
For LU factorization with partial pivoting, explain the following (same as Gaussian elimination last class):
Why is it important? What are the benefits?
How does the algorithm work? What are the basic steps?
What are the limitations? In what cases will partial pivoting not work?
import numpy as np
4.4.2. Motivation: Solving the “Same” System Multiple Times#
In practice, you often want to solve the same linear system multiple times with different right-hand side vectors \(\vec{b}\). For example, imagine re-solving the same mass balance problem with different inlet and outlet compositions.
From the previous class, we know how to use Gaussian elimination to solve a linear system. But with Gaussian elimination, we need to redo the entire algorithm separately for each new instance of \(\vec{b}\). Is there an algorithm that lets us reuse some of our work? Spoiler alert: yes, and it is called LU factorization.
4.4.3. LU Factorization#
LU factorization writes a matrix \(\mathbf{A}\) as the product of a lower triangular matrix (that is, a matrix whose nonzero elements lie on or below the diagonal) times an upper triangular matrix (that is, a matrix whose nonzero elements lie on or above the diagonal). In other words, \(\mathbf{A} = \mathbf{LU},\) where

\[
\mathbf{L} = \begin{pmatrix} * & 0 & 0 \\ * & * & 0 \\ * & * & * \end{pmatrix}
\]

and

\[
\mathbf{U} = \begin{pmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & * \end{pmatrix},
\]

where the \(*\) denote a potentially nonzero matrix element.
The question remains: how do we find the entries of the matrices in the LU factorization? To see this, let’s look at a matrix we used last class:

\[
\mathbf{A} = \begin{pmatrix} 3 & 2 & 1 \\ -1 & 4 & 5 \\ 2 & -8 & 10 \end{pmatrix}.
\]

We want to find matrices \(\mathbf{L}\) and \(\mathbf{U}\) such that

\[
\begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix}
\begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix}
=
\begin{pmatrix} 3 & 2 & 1 \\ -1 & 4 & 5 \\ 2 & -8 & 10 \end{pmatrix}.
\]
Home Activity
Please read through the explanation below. Think of at least one question to ask in class.
It’s not exactly clear right now, but we actually have a few choices at this point. The first choice is which matrix, \(\mathbf{L}\) or \(\mathbf{U}\), gets a diagonal of all ones. We will choose all \(l_{ii} = 1\), that is, the \(\mathbf{L}\) matrix has a diagonal of all 1’s. (This means we are choosing Doolittle’s method for LU factorization.) Performing the matrix multiplication, we get 9 equations and 9 unknowns for the factorization. Matching the first column of \(\mathbf{A}\), which involves only the first column of \(\mathbf{U}\), gives

\[
u_{11} = 3, \qquad l_{21}\,u_{11} = -1, \qquad l_{31}\,u_{11} = 2,
\]

which gives us

\[
u_{11} = 3, \qquad l_{21} = -\frac{1}{3}, \qquad l_{31} = \frac{2}{3}.
\]
Notice that the solutions below the diagonal, \(l_{21}\) and \(l_{31}\), are the factors we used in our Gaussian elimination algorithm to eliminate the first column from the second and third equations. And the \(u_{11}\) value is the one found in the same position in our Gaussian-eliminated matrix.
Continuing on, the equations from the second column are

\[
u_{12} = 2, \qquad l_{21}\,u_{12} + u_{22} = 4, \qquad l_{31}\,u_{12} + l_{32}\,u_{22} = -8.
\]

The solutions to these equations are

\[
u_{12} = 2, \qquad u_{22} = \frac{14}{3}, \qquad l_{32} = -2.
\]
You probably don’t remember, because I didn’t, but the value of \(l_{32}\) is the value of the factor used to eliminate the second column below the diagonal in the Gaussian elimination example from the last notebook. Finally, the last three equations are

\[
u_{13} = 1, \qquad l_{21}\,u_{13} + u_{23} = 5, \qquad l_{31}\,u_{13} + l_{32}\,u_{23} + u_{33} = 10.
\]

The solutions to these equations are

\[
u_{13} = 1, \qquad u_{23} = \frac{16}{3}, \qquad u_{33} = 20.
\]
These are precisely the values in the third column of our Gaussian elimination example. In other words, our factorization looks like:

\[
\mathbf{L} = \begin{pmatrix} 1 & 0 & 0 \\ -\frac{1}{3} & 1 & 0 \\ \frac{2}{3} & -2 & 1 \end{pmatrix},
\qquad
\mathbf{U} = \begin{pmatrix} 3 & 2 & 1 \\ 0 & \frac{14}{3} & \frac{16}{3} \\ 0 & 0 & 20 \end{pmatrix}.
\]
Home Activity
Let’s verify the above equation is correct. Create numpy arrays L and U for the example above. Then multiply matrices L and U and store the result in A.
# Add your solution here
# Removed autograder test. You may delete this cell.
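For reference, here is one possible solution sketch; the numeric entries come straight from the derivation above.

import numpy as np

L = np.array([[1.0, 0, 0],
              [-1/3, 1, 0],
              [2/3, -2, 1]])
U = np.array([[3.0, 2, 1],
              [0, 14/3, 16/3],
              [0, 0, 20]])
A = L @ U
print("L * U =\n", A)  # should reproduce the original 3x3 matrix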
Some observations:
The upper triangular matrix is the same as the matrix we obtained after doing Gaussian elimination
The below-diagonal elements of the lower triangular matrix are the factors we used to arrive at our Gaussian matrix.
This suggests that we can reformulate our Gaussian elimination example to give us an LU factorization.
Home Activity
Write down questions you can ask in class about the example above.
Record your question(s) here.
Class Activity
Walk through the LU factorization example together.
4.4.4. Our First Function for LU Factorization#
We will now generalize the LU factorization algorithm into a Python function.
import numpy as np
def LU_factor1(A,LOUD=True):
    '''Compute LU (Doolittle) decomposition of A
Args:
A: N by N array
Returns:
L: N by N array
U: N by N array
Side Effect:
Does not modify A
'''
# Check dimensions
[Nrow, Ncol] = A.shape
assert Nrow == Ncol
N = Nrow
    # Allocate L as identity matrix
L = np.eye(N)
# Copy A into U
U = A.copy()
# Loop over columns
for c in range(0,N):
# Loop over rows
for r in range(c+1,N):
# Extract row to modify
mod_row = U[r,:]
# Calculate factor (just like Gaussian Elimination)
factor = mod_row[c]/U[c,c]
# Save factor into L
L[r,c] = factor
# Perform elimination, save into U
mod_row -= factor*U[c]
if LOUD:
print("After working on column",c,"and row",r)
print("L = \n",L)
print("U = \n",U)
print("\n")
return L,U
#let's try it on a 3 x 3 to start
A = np.array([(3.0,2,1),(-1,4,5),(2,-8,10)])
print("The original matrix is\n",A)
L,U = LU_factor1(A)
print("L * U = \n")
print(L,"\n*\n",U,"\n=\n",L.dot(U))
The original matrix is
[[ 3. 2. 1.]
[-1. 4. 5.]
[ 2. -8. 10.]]
After working on column 0 and row 1
L =
[[ 1. 0. 0. ]
[-0.33333333 1. 0. ]
[ 0. 0. 1. ]]
U =
[[ 3. 2. 1. ]
[ 0. 4.66666667 5.33333333]
[ 2. -8. 10. ]]
After working on column 0 and row 2
L =
[[ 1. 0. 0. ]
[-0.33333333 1. 0. ]
[ 0.66666667 0. 1. ]]
U =
[[ 3. 2. 1. ]
[ 0. 4.66666667 5.33333333]
[ 0. -9.33333333 9.33333333]]
After working on column 1 and row 2
L =
[[ 1. 0. 0. ]
[-0.33333333 1. 0. ]
[ 0.66666667 -2. 1. ]]
U =
[[ 3. 2. 1. ]
[ 0. 4.66666667 5.33333333]
[ 0. 0. 20. ]]
L * U =
[[ 1. 0. 0. ]
[-0.33333333 1. 0. ]
[ 0.66666667 -2. 1. ]]
*
[[ 3. 2. 1. ]
[ 0. 4.66666667 5.33333333]
[ 0. 0. 20. ]]
=
[[ 3. 2. 1.]
[-1. 4. 5.]
[ 2. -8. 10.]]
We see that our function gives the same \(\mathbf{L}\) and \(\mathbf{U}\) as before. At the start of class, we will work through the LU decomposition example on the board together for this example. We will then revisit the function above.
Home Activity
Perform LU decomposition for matrix A1, which is defined below. Store the results in variables L1 and U1.
A1 = np.array([[1., 2, 3],[-1, 2, 1], [1, 2, 0]])
print("A1 = \n",A1,"\n")
# Add your solution here
# Removed autograder test. You may delete this cell.
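One possible solution sketch, reusing LU_factor1 from above with printing turned off:

L1, U1 = LU_factor1(A1, LOUD=False)
print("L1 =\n", L1)
print("U1 =\n", U1)
print("L1 * U1 =\n", L1.dot(U1))  # should reproduce A1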
Home Activity
Spend 5 honest minutes reading through the function LU_factor1. Write down at least two questions you have right now.
My questions while inspecting the code below: 1. 2.
Class Activity
Revisit your list of questions about LU_factor1 with your neighbor. We’ll then regroup and discuss as a class.
4.4.5. A More Memory Efficient Version#
Our function LU_factor1 makes a copy of A to initialize U. This is nice because it does not overwrite the original A. But imagine if A were a 10 million by 10 million matrix. Making a copy would be expensive and require a lot of memory.

Instead, the function below also performs LU factorization. But instead of returning L and U, it overwrites A with the factorization result. This avoids copying large matrices.
import numpy as np
def LU_factor(A):
"""Factor in place A into L*U=A.
The lower triangular parts of A
are the L matrix.
The L has implied ones on the diagonal.
Args:
A: N by N array
Side Effects:
A is factored in place.
"""
[Nrow, Ncol] = A.shape
assert Nrow == Ncol
N = Nrow
for column in range(0,N):
for row in range(column+1,N):
mod_row = A[row]
factor = mod_row[column]/A[column,column]
mod_row = mod_row - factor*A[column,:]
#put the factor in the correct place in the modified row
mod_row[column] = factor
#only take the part of the modified row we need
mod_row = mod_row[column:N]
A[row,column:N] = mod_row
return
#let's try it on a 3 x 3 to start
A = np.array([(3.0,2,1),(-1,4,5),(2,-8,10)])
print("The original matrix is\n",A)
LU_factor(A)
print("The LU factored A is\n",A)
The original matrix is
[[ 3. 2. 1.]
[-1. 4. 5.]
[ 2. -8. 10.]]
The LU factored A is
[[ 3. 2. 1. ]
[-0.33333333 4.66666667 5.33333333]
[ 0.66666667 -2. 20. ]]
Notice how our matrix A has changed:
print(A)
[[ 3. 2. 1. ]
[-0.33333333 4.66666667 5.33333333]
[ 0.66666667 -2. 20. ]]
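If you ever need the factors as separate matrices again, you can unpack them from the overwritten A. Here is a small sketch using numpy’s triangle helpers np.tril and np.triu:

N = A.shape[0]
L = np.tril(A, -1) + np.eye(N)  # strictly lower part plus the implied unit diagonal
U = np.triu(A)                  # upper triangular part, including the diagonal
print("L =\n", L)
print("U =\n", U)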
Home Activity
Use LU_factor for the example below. Your matrix A1 should change.
A1 = np.array([[1., 2, 3],[-1, 2, 1], [1, 2, 0]])
print("A1 (original) = \n",A1,"\n")
# Add your solution here
print("A1 (factored) = \n",A1,"\n")
# Removed autograder test. You may delete this cell.
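One possible solution: replace the “# Add your solution here” comment with a single in-place call. A sketch, with the expected packed result worked out by following the algorithm by hand:

LU_factor(A1)
# A1 should now hold the packed factors:
# [[ 1.  2.  3.]
#  [-1.  4.  4.]
#  [ 1.  0. -3.]]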
4.4.6. Solving Linear Systems#
This algorithm gives us the same thing our onerous, by-hand procedure did. The question we have not answered is how we can take an LU factored matrix and get the solution to the system \(\mathbf{A}\vec{x} = \vec{b}\).
Let’s start by substituting \(\mathbf{L U} = \mathbf{A}\) into our original equation:

\[
\mathbf{L}\mathbf{U}\vec{x} = \vec{b}
\]

Now let’s create a new intermediate variable \(\vec{y} = \mathbf{U} \vec{x}\). If we substitute \(\vec{y}\) into \(\mathbf{LU}\vec{x} = \vec{b}\), we get:

\[
\mathbf{L}\vec{y} = \vec{b}
\]
Home Activity
Compute \(\vec{y}\) for the example below by hand or with Python. Store the results in a numpy array y.

\[
\begin{pmatrix} 1 & 0 & 0 \\ -\frac{1}{3} & 1 & 0 \\ \frac{2}{3} & -2 & 1 \end{pmatrix} \vec{y} = \begin{pmatrix} 6 \\ 8 \\ 4 \end{pmatrix}
\]
# Add your solution here
# Removed autograder test. You may delete this cell.
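One possible solution sketch, working down the rows of \(\mathbf{L}\) (the entries come from the factorization and right-hand side above):

y = np.zeros(3)
y[0] = 6.0                           # row 1: 1*y0 = 6
y[1] = 8.0 - (-1/3)*y[0]             # row 2: l21*y0 + y1 = 8
y[2] = 4.0 - (2/3)*y[0] - (-2)*y[1]  # row 3: l31*y0 + l32*y1 + y2 = 4
print("y =", y)                      # expect [ 6. 10. 20.]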
You may have noticed in the example, we can easily compute \(\vec{y}\) because \(\mathbf{L}\) is a lower triangular matrix. We start at the top to calculate the first element of \(\vec{y}\). We then move to the next row of \(\mathbf{L}\), substitute and compute the next element of \(\vec{y}\). We call this procedure forward substitution because we start at the top.
We now have \(\vec{y}\). We are halfway there! We just need to solve the linear system \(\mathbf{U}\vec{x} = \vec{y}\).
Home Activity
Compute \(\vec{x}\) for the example below either by hand or with Python. Store your answer in the numpy array x.

\[
\begin{pmatrix} 3 & 2 & 1 \\ 0 & \frac{14}{3} & \frac{16}{3} \\ 0 & 0 & 20 \end{pmatrix} \vec{x} = \begin{pmatrix} 6 \\ 10 \\ 20 \end{pmatrix}
\]
# Add your solution here
# Removed autograder test. You may delete this cell.
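One possible solution sketch, working from the bottom row of \(\mathbf{U}\) up and reusing y from the previous activity:

x = np.zeros(3)
x[2] = y[2]/20.0                         # row 3: 20*x2 = 20
x[1] = (y[1] - (16/3)*x[2])/(14/3)       # row 2: (14/3)*x1 + (16/3)*x2 = 10
x[0] = (y[0] - 2.0*x[1] - 1.0*x[2])/3.0  # row 1: 3*x0 + 2*x1 + x2 = 6
print("x =", x)                          # expect [1. 1. 1.]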
Notice again this calculation is easy because \(\mathbf{U}\) is upper triangular. We start at the bottom and calculate the last element of \(\vec{x}\). We then move up a row, substitute, and calculate the second to last element of \(\vec{x}\). This is called backward substitution and is the same procedure we used with Gaussian elimination last class.
4.4.7. Our First Function for Forward and Backward Solves#
Let’s return to our first factorization function LU_factor1, which returned distinct L and U matrices. We will create Python code to generalize the forward and backward substitution steps to solve the linear system.
def LU_solve1(L,U,b,LOUD=True):
'''Solve L*U*x = b for x where L and U
are lower and upper triangular matrices and
b is RHS vector.
Args:
L: N by N lower triangular matrix with 1 on diagonal
U: N by N upper triangular matrix
b: N by 1 vector
Returns
x: N by 1 solution to linear system
'''
# Check dimensions
[Nrow_L, Ncol_L] = L.shape
assert Nrow_L == Ncol_L
[Nrow_U, Ncol_U] = U.shape
assert Nrow_U == Ncol_U
assert Nrow_U == Nrow_L
N = Nrow_U
assert N == b.size
# forward solve
y = np.zeros(N)
if LOUD:
print("\nForward solve...")
for r in range(0,N): # first for loop
RHS = b[r]
        # loop over some of the columns to evaluate the summation
for c in range(0,r):
RHS -= y[c]*L[r,c]
if LOUD:
print("y[",r,"]=",RHS)
y[r] = RHS
# backward solve
x = np.zeros(N)
if LOUD:
print("\nBackward solve...")
for r in range(N-1,-1,-1): # second for loop
RHS = y[r]
        # loop over some of the columns to evaluate the summation
for c in range(r+1,N):
RHS -= x[c]*U[r,c]
if LOUD:
print("x[",r,"]=",RHS/U[r,r])
x[r] = RHS/U[r,r]
return x
Let’s take a minute to look at the first and second for loops in LU_solve1.
N = 5
# loop 1
print("Loop 1...")
for i in range(0,N):
print(i)
# loop 2
print("\nLoop 2...")
for i in range(N-1,-1,-1):
print(i)
Loop 1...
0
1
2
3
4
Loop 2...
4
3
2
1
0
Home Activity
Why does the first for loop iterate over range(0,N) and the second for loop iterate over range(N-1,-1,-1) in LU_solve1? Write a sentence below.
Answer:
Class Activity
Discuss the code above with a partner. We’ll then regroup as a class.
And let’s test with the example above.
A = np.array([(3.0,2,1),(-1,4,5),(2,-8,10)])
b = np.array([6,8,4])
print("The original matrix is\n",A)
L,U = LU_factor1(A,False)
print("L = \n",L)
print("\nU= \n",U)
x = LU_solve1(L,U,b)
The original matrix is
[[ 3. 2. 1.]
[-1. 4. 5.]
[ 2. -8. 10.]]
L =
[[ 1. 0. 0. ]
[-0.33333333 1. 0. ]
[ 0.66666667 -2. 1. ]]
U=
[[ 3. 2. 1. ]
[ 0. 4.66666667 5.33333333]
[ 0. 0. 20. ]]
Forward solve...
y[ 0 ]= 6
y[ 1 ]= 10.0
y[ 2 ]= 20.0
Backward solve...
x[ 2 ]= 1.0
x[ 1 ]= 1.0
x[ 0 ]= 1.0
And now verify the solution.
print("A*x =",A.dot(x))
print("b =",b)
A*x = [6. 8. 4.]
b = [6 8 4]
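This is the payoff promised in the motivation: once we have L and U, we can solve for a new right-hand side without redoing the elimination. A quick sketch with an arbitrary (made-up) new vector b2:

b2 = np.array([1.0, 2.0, 3.0])
x2 = LU_solve1(L, U, b2, LOUD=False)
print("x2 =", x2)
print("A*x2 =", A.dot(x2))  # should match b2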
4.4.8. Another Implementation (Optional)#
Now let’s consider the case where L and U are stored in the same matrix. This complements our more memory-efficient function LU_factor.
def LU_solve(A,b):
"""Take a LU factorized matrix and
solve it for RHS b
Args:
A: N by N array that has been LU factored with
assumed 1's on the diagonal of the L matrix
b: N by 1 array of righthand side
Returns:
x: N by 1 array of solutions
"""
[Nrow, Ncol] = A.shape
assert Nrow == Ncol
N = Nrow
x = np.zeros(N)
#temporary vector for L^-1 b
y = np.zeros(N)
#forward solve
for row in range(N):
RHS = b[row]
for column in range(0,row):
RHS -= y[column]*A[row,column]
y[row] = RHS
#back solve
for row in range(N-1,-1,-1):
RHS = y[row]
for column in range(row+1,N):
RHS -= x[column]*A[row,column]
x[row] = RHS/A[row,row]
return x
Let’s test on the same system as above.
#let's try it on a 3 x 3 to start
A = np.array([(3.0,2,1),(-1,4,5),(2,-8,10)])
Aorig = A.copy()
print("Originally A =\n",Aorig)
LU_factor(A)
print("\nAfter factorization A =\n",A)
b = np.array([6,8,4])
x = LU_solve(A,b)
print("\nThe solution to the system is",x)
Originally A =
[[ 3. 2. 1.]
[-1. 4. 5.]
[ 2. -8. 10.]]
After factorization A =
[[ 3. 2. 1. ]
[-0.33333333 4.66666667 5.33333333]
[ 0.66666667 -2. 20. ]]
The solution to the system is [1. 1. 1.]
Let’s re-solve for \(\vec{x} = [2, 12, 2019]^T\):
solution2 = np.array([2,12,2019.])
b = np.dot(Aorig,solution2)
x = LU_solve(A,b)
print("The solution to the system is",x)
The solution to the system is [2.000e+00 1.200e+01 2.019e+03]
4.4.9. LU with Pivoting#
Pivoting with LU is needed in the same cases as it is in Gaussian elimination. However, when we switch equations around, we need to make sure that we keep track of where we’ve switched rows, so we can do the same to the right-hand side when we do our forward and back solves. This complicates things considerably, but only because we have to add that bookkeeping to our algorithm.
First, we’ll need to be able to swap rows in the matrix:
def swap_rows(A, a, b):
"""Rows to rows in a matrix, switch row a with row b
args:
A: matrix to perform row swaps on
a: row index of matrix
b: row index of matrix
returns:
nothing
side effects:
changes A to have rows a and b swapped
"""
assert (a>=0) and (b>=0)
N = A.shape[0] #number of rows
assert (a<N) and (b<N) #less than because 0-based indexing
temp = A[a,:].copy()
A[a,:] = A[b,:].copy()
A[b,:] = temp.copy()
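A quick sanity check of swap_rows on a small matrix (a sketch):

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
swap_rows(M, 0, 1)
print(M)  # expect [[3. 4.], [1. 2.]]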
Next, we’ll change our algorithm to handle pivoting and keep track of the pivots:
def LU_factor_pivot(A,LOUD=True):
"""Factor in place A in L*U=A.
The lower triangular parts of A
are the L matrix.
The L has implied ones on the diagonal.
Args:
A: N by N array
Returns:
a vector holding the order of the rows,
relative to the original order
Side Effects:
A is factored in place.
"""
# Check dimensions
[Nrow, Ncol] = A.shape
assert Nrow == Ncol
N = Nrow
# Create scale factors
s = np.zeros(N)
count = 0
row_order = np.arange(N)
for row in A:
# Save largest absolute element in each row
s[count] = np.max(np.fabs(row))
count += 1
if LOUD:
print("s =",s)
if LOUD:
print("Original Matrix is\n",A)
# loop over columns
for column in range(0,N):
#swap rows if needed
        # find the row with the largest magnitude entry at or below
        # the diagonal in this column (partial pivoting); note that
        # s[column] is a scalar here, so dividing by it does not
        # change which row argmax selects
        largest_pos = np.argmax(np.fabs(A[column:N,column]/s[column])) + column
if (largest_pos != column):
if (LOUD):
print("Swapping row",column,"with row",
largest_pos)
print("Pre swap\n",A)
swap_rows(A,column,largest_pos)
#keep track of changes to RHS
tmp = row_order[column]
row_order[column] = row_order[largest_pos]
row_order[largest_pos] = tmp
#re-order s
tmp = s[column]
s[column] = s[largest_pos]
s[largest_pos] = tmp
if (LOUD):
print("A =\n",A)
for row in range(column+1,N):
mod_row = A[row]
factor = mod_row[column]/A[column,column]
mod_row = mod_row - factor*A[column,:]
#put the factor in the correct place in the modified row
mod_row[column] = factor
#only take the part of the modified row we need
mod_row = mod_row[column:N]
A[row,column:N] = mod_row
return row_order
Now let’s verify with a 4 x 4 system.
#let's try it on a 4 x 4 to start
A = np.array([(3.0,2,1,-2),(-1,4,5,4),(2,-8,10,3),(-2,-8,10,0.1)])
print("A =\n",A,"\n")
answer = np.ones(4)
print("True solution. x =",answer,"\n")
b = np.dot(A,answer)
print("Thus b =",b,"\n")
# Apply LU with pivoting
row_order = LU_factor_pivot(A,False)
# Print A and the row order
print("After factorization A =\n",A,"\n")
print("The new row order is",row_order)
A =
[[ 3. 2. 1. -2. ]
[-1. 4. 5. 4. ]
[ 2. -8. 10. 3. ]
[-2. -8. 10. 0.1]]
True solution. x = [1. 1. 1. 1.]
Thus b = [ 4. 12. 7. 0.1]
After factorization A =
[[ 3. 2. 1. -2. ]
[ 0.66666667 -9.33333333 9.33333333 4.33333333]
[-0.33333333 -0.5 10. 5.5 ]
[-0.66666667 0.71428571 0.4 -6.52857143]]
The new row order is [0 2 1 3]
We need to change the LU_solve function to take advantage of the swapped rows.
def LU_solve_pivot(A,b,row_order):
"""Take a LU factorized matrix and solve it for RHS b
Args:
A: N by N array that has been LU factored with
assumed 1's on the diagonal of the L matrix
b: N by 1 array of righthand side
        row_order: list giving the re-ordered equations
            from the LU factorization with pivoting
Returns:
x: N by 1 array of solutions
Side Effect:
Reorders b using row_order
"""
[Nrow, Ncol] = A.shape
assert Nrow == Ncol
assert b.size == Ncol
assert row_order.max() == Ncol-1
N = Nrow
#reorder the equations
tmp = b.copy()
for row in range(N):
b[row_order[row]] = tmp[row]
#b now has the correct order
    # We can either reuse our LU_solve code from above now that b has
    # been reordered, or copy and paste (see the string below)
return LU_solve(A,b)
'''
x = np.zeros(N)
#temporary vector for L^-1 b
y = np.zeros(N)
#forward solve
for row in range(N):
RHS = b[row]
for column in range(0,row):
RHS -= y[column]*A[row,column]
y[row] = RHS
#back solve
for row in range(N-1,-1,-1):
RHS = y[row]
for column in range(row+1,N):
RHS -= x[column]*A[row,column]
x[row] = RHS/A[row,row]
return x
'''
print("A =\n",A,"\n\nRow order =",row_order,"\n")
x = LU_solve_pivot(A,b,row_order)
print("The solution to the system is",x)
A =
[[ 3. 2. 1. -2. ]
[ 0.66666667 -9.33333333 9.33333333 4.33333333]
[-0.33333333 -0.5 10. 5.5 ]
[-0.66666667 0.71428571 0.4 -6.52857143]]
Row order = [0 2 1 3]
The solution to the system is [1. 1. 1. 1.]
Class Activity
Verify that our LU decomposition with pivoting code works for other test cases.
epsilon = 1e-14
A_eps = np.array([(epsilon,-1,1),(-1,2,-1),(2,-1,0)])
print(A_eps)
b_eps = np.array([0,0,1.0])
print(b_eps)
Asave = A_eps.copy()
bsave = b_eps.copy()
[[ 1.e-14 -1.e+00 1.e+00]
[-1.e+00 2.e+00 -1.e+00]
[ 2.e+00 -1.e+00 0.e+00]]
[0. 0. 1.]
row_order = LU_factor_pivot(A_eps,False)
x_eps = LU_solve_pivot(A_eps,b_eps,row_order)
print("The solution is",x_eps)
print("The residual is",bsave - np.dot(Asave,x_eps))
The solution is [1. 1. 1.]
The residual is [0.00000000e+00 1.11022302e-16 0.00000000e+00]
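As an extra sanity check (a sketch, not part of the original activity), we can compare against numpy’s built-in solver, which should agree to within rounding error:

x_np = np.linalg.solve(Asave, bsave)
print("numpy's solution is", x_np)  # should be close to [1. 1. 1.]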
4.4.10. Take Home Messages#
LU factorization is very similar to Gaussian elimination:
\(\mathbf{L}\) matrix saves the scaling factors from Gaussian elimination
\(\mathbf{U}\) matrix is the same as the reduced matrix
\(\mathbf{L}\) and \(\mathbf{U}\) save all of the steps from matrix reduction, making LU factorization efficient for re-solving \(\mathbf{A} \vec{x} = \vec{b}\) with many different \(\vec{b}\)
Partial pivoting works the same way for both LU factorization and Gaussian elimination
Solving a linear system with LU factorization is easy:
First, perform forward substitution, which exploits that \(\mathbf{L}\) is lower triangular
Second, perform backward substitution, which exploits that \(\mathbf{U}\) is upper triangular