4.7. Linear Algebra Review and SciPy Basics#
Reference: Section 2.2 Vectors and Matrices in Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes by L. T. Bielger (2010).
Recommended Links
SciPy Lecture Notes (Especially 1.1 - 1.6 for everyone and 2.1 - 2.2 for advanced users)
Tip: We will mostly use SciPy for linear algebra. (It has more sophisticated capabilities than NumPy.) We will use NumPy if/when SciPy does not offer a specific command.
4.7.1. Learning Objectives#
After studying this notebook, completing the activities, and asking questions in class, you should be able to:
Understand matrix operations on paper and in python.
Explain what a determinant is and how to calculate it; know its properties.
Calculate the rank of a matrix by hand and in python.
Know the conditions for a matrix to have an inverse, and how to calculate its inverse.
Understand how to solve linear systems in various ways, including LU decompostion.
Be familiar with the invertible matrix theorem.
Calculate and interpret eigenvalues.
Perform singular value decomposition and interpret the results.
Calculate vector and matrix norms.
# Load required Python libraries.
import matplotlib.pyplot as plt
import numpy as np
from scipy import linalg
4.7.2. Notation for CBE 40499/60499 and Bielger (2010)#
The following is somewhat standard notation in scientific computing and data science literature:
Scalars (constants and variables): Lowercase Greek letters (more common), lowercase non-bold Roman letters (especially if dictated by science or engineering context/domain norms)
CBE 40499/60499: We will sometimes use
Vectors - Lower case bold Roman letter, lowercase bold Greek letters (if dictated by science or engineering domain norms)
Matrices - Upper case bold Roman letters, uppercase bold Greek letters (if dictated by science and engineering domain norms)
The notation for lectures will (hopefully) be obvious from context. I sometimes use Greek letters for constants and Roman letters for variables… unless the Greek letters have engineering or scientific meaning.
4.7.3. Matrix Operations#
In this section we will introduce properties of matrices with code examples following each property.
4.7.3.1. Matrix Multiplication#
Multiplying two matrices \(A\) \(\in\) \(\mathbb{R}^{n \times m}\) and \(B\) \(\in\) \(\mathbb{R}^{m \times p}\) leads to the product \(C \in\) \(\mathbb{R}^{n \times p}\) with elements \( \mathbf{\{C\}_{ij}}\) = \(\sum_{k=1}^m \mathbf{\{A\}_{ik}} \mathbf{\{B\}_{kj}}\). This operation is defined only if the number of columns in \(A\) and rows of \(B\) is the same.
Home Activity
The code below demonstrates matrix multiplication. Before running the code below, what is the dimension of A*B?
# Generate random matrices
n = 2
m = 3
Ao = np.random.rand(n,m)
Bo = np.random.rand(m,n)
print("Ao = ")
print(Ao)
print("\nBo = ")
print(Bo)
# Calculate Ao*Bo using Python
print("\nAo*Bo =")
print(Ao.dot(Bo))
Another way of multiplying matrices is by using the “@” symbol. See an example below with Ao and Bo.
# calculate Ao*Bo using @
print("Ao*Bo =")
print(Ao@Bo)
Ao*Bo =
[[0.526524 0.84002501]
[0.87286582 1.33610749]]
4.7.3.2. Matrix Transpose#
The transpose of \(A\) \(\in\) \(\mathbb{R}^{n \times m}\) is \(A^T\) \(\in\) \(\mathbb{R}^{m \times n}\) with the rows and columns of A interchanged, i.e.\( \mathbf{\{A^T\}_{ij}} = \mathbf{\{A\}_{ji}}\).
Home Activity
Calculate \(A_o^{T}\)
# Transpose
# Add your solution here
4.7.3.3. Symmetric Matrix#
A symmetric matrix \(A\) \(\in\) \(\mathbb{R}^{n \times n}\) satisfies \(A = A^T\).
4.7.3.4. Diagonal Matrix#
A diagonal matrix \(A\) \(\in\) \(\mathbb{R}^{n \times n}\) has nonzero elements only on the diagonal, i.e., \( \mathbf{\{A\}_{ij}} = 0, i \neq j\).
Home Activity
Create a random \(3 \times 3\) square matrix and check if it is symmetric by comparing it to its transpose.
# Create square matrix and compare with transpose
# Add your solution here
4.7.3.5. Identity Matrix#
The identity matrix \(I \in \mathbb{R}^{n \times n}\) is defined as
Home Activity
Create a 3x3 identity matrix.
# Identity matrix
# Add your solution here
4.7.4. Determinant#
A determinant is a scalar associated to a square matrix, i.e., det(\(A\)) for \(A\) \(\in\) \(\mathbb{R}^{n \times n}\), and can be defined recursively by det(\(A\)) = \(\sum_i (-1)^{i+j} \mathbf{\{A\}_{ij}} \bar{A}_{ij}\) for any column \(j\), or det(\(A\)) = \(\sum_j (-1)^{i+j} \mathbf{\{A\}_{ij}}\bar{A}_{ij}\) for any row \(i\), where \(\bar{A}_{ij}\) is the determinant of an (\(n\)-1)-order matrix with row \(i\) and column \(j\) removed.
We will now manipulate the following matrices to demonstrate properties for the determinant.
# Generate random matrices
nd = 2
A = np.random.rand(nd,nd)
B = np.random.rand(nd,nd)
print("A = ")
print(A)
print("\nB = ")
print(B)
A =
[[0.13785899 0.28113758]
[0.59708216 0.31166715]]
B =
[[0.5493066 0.04429576]
[0.2740967 0.54012404]]
4.7.4.1. Determinant of Matrix Product#
det(\(AB\)) = det(\(A\))det(\(B\))
Class Activity
Calculate the determinante of \(AB\) and prove it is equal to the product of the determinants of \(A\) and \(B\).
# Property 1
# Add your solution here
4.7.4.2. Determinant of Transpose#
det(\(A\)) = det(\(A^T\))
Class Activity
Calculate the determinant of \(A\) and of \(A^T\) and prove they are equal.
# Property 2
# Add your solution here
4.7.4.3. Determinant with Scalar Product#
For an \(n \times n\) matrix with scalar \(\alpha\), det(\(\alpha A\)) = \(\alpha^n\) det(\(A\))
Class Activity
Calculate the determinant of \(\alpha A\) and calculate \(\alpha\)det(\(A\)) and prove they are equal. Let \(\alpha\) = 4.
# Property 3
# Add your solution here
4.7.4.4. Determinant of Identify Matrix#
For an identity matrix \(I\), det(\(I\)) = 1.
Class Activity
Create a \(3 \times 3\) identity matrix and calculate the determinant.
# Property 4
# Add your solution here
4.7.5. Calculating \(3 \times 3\) Determinant by Hand#
Below is an example from Linear Algebra and Its Applications by David C. Lay, 3rd edition that shows how to calculate a determinant by hand.
Home Activity
Walk through the following example that calculates a \(3 \times 3\) determinant by hand.
For \( n \geq 2\), the determinant of an \(n \times n\) matrix \(A\) = [\(a_{ij}\)] is the sum of \(n\) terms of the form \(\pm a_{1j}\) det\(A_{1j}\), with plus and minus signs alternating, where the entries \(a_{11},a_{12},...,a_{1n}\) are from the first row of \(A\). In symbols,
det\(A\) = \(a_{11}\)det\(A_{11}-a_{12}\)det\(A_{12}+...+(-1)^{1+n}a_{1n}\)det\(A_{1n}\)
= \(\sum_{j=1}^n (-1)^{1+j}a_{1j}\text{det}A_{1j}\)
Example: Compute the determinate of the matrix \(A\) = \( \begin{bmatrix} 1 & 5 & 0\\ 2 & 4 & -1\\ 0 & -2 & 0 \end{bmatrix}\)
Solution: Compute det(\(A\)) = \(a_{11}\)det\(A_{11}-a_{12}\)det\(A_{12}+a_{13}\)det\(A_{13}\):
det\(A\) = \(1\cdot\)det\(\begin{bmatrix} 4 & -1\\ -2 & 0 \end{bmatrix}\) - \(5\cdot\)det\(\begin{bmatrix} 2 & -1\\ 0 & 0 \end{bmatrix}\) + \(0\cdot\)det\(\begin{bmatrix} 2 & 4\\ 0 & -2 \end{bmatrix}\)
\(= 1(0-2) - 5(0-0) + 0(-4-0)\)
\(= -2\)
4.7.6. Rank#
4.7.6.1. Matrix Rank Definition#
Rerference: Section 2.2 of Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes, L. Biegler (2010)
The rank of a matrix \(r(A)\) is the order of the largest submatrix in \(A\) that has a nonzero determinant.
A matrix \(A\) \(\in\) \(\mathbb{R}^{n \times m}\) has full row (column) rank if none of the row (column) vectors can be written as a linear combination of the other row (column) vectors. The rank of such a matrix is \(r(A)\) = min(\(n,m\)).
A matrix that does not have full rank is rank deficient.
4.7.6.2. Practice Activities#
Home Activity
Predict the rank of \(A_1^T\), \(B_1^T\), \(C_1\), and \(D_1\).
# Generate random matrices
n = 2
m = 3
A_1 = np.random.rand(n,m)
B_1 = np.random.rand(m,n)
C_1 = np.array([[0, 2, 0], [1, 2, 3], [0, 4, 0]])
D_1 = np.array([[3, 2, 4], [1, 2, 4], [0, 6, 5]])
print("A_1 = ")
print(A_1)
print("\nB_1 = ")
print(B_1)
print("\nC_1 = ")
print(C_1)
print("\nD_1 = ")
print(D_1)
A_1 =
[[0.70268875 0.55298945 0.56890907]
[0.46535378 0.01446308 0.23871531]]
B_1 =
[[0.04058466 0.17097646]
[0.81132754 0.88760492]
[0.75098347 0.33147803]]
C_1 =
[[0 2 0]
[1 2 3]
[0 4 0]]
D_1 =
[[3 2 4]
[1 2 4]
[0 6 5]]
My answers:
Rank of A_1T:
Rank of B_1T:
Rank of C_1:
Rank of D_1:
Home Activity
Check the rank of these four matrices by calculating each with np.linalg.matrix_rank( ). Were you correct?
# Calculate ranks
# Add your solution here
4.7.7. Inverse#
4.7.7.1. Inverse of a Matrix Defnition#
Rerference: Section 2.2 of Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes, L. Biegler (2010)
If the matrix \(A\) \(\in\) \(\mathbb{R}^{n \times n}\) is nonsingular, i.e., has full rank, then an inverse \(A^{-1}\) exists that satisfies \(AA^{-1} = A^{-1}A = I\).
The linear system \(Ax = b\) with nonsingular \(A\) \(\in\) \(\mathbb{R}^{n \times m}\), \(x \in \mathbb{R}^n\), and \(b \in \mathbb{R}^n\), has a unique solution \(x = A^{-1}b\).
det(\(A^{-1}\)) = \(\frac{1}{\text{det}(A)}\).
A matrix \(Q \in \mathbb{R}^{n \times n}\) with the property \(Q^{-1} = Q^T\) is called an orthogonal matrix.
4.7.7.2. Practice Activities#
Home Activity
Calculate the inverse of \(A_o\) and verity that \(A_o^{-1} A_o = I\).
# Generate A_0
n = 3
m = 3
A_0 = np.random.rand(n,m)
print("A_0 = ")
print(A_0)
# Calculate inverse
# Add your solution here
# Verification
# Add your solution here
What do you notice about the matrix produced from \(A_o^{-1} A_o\)? Why do you think this happens?
Home Activity
Verify that det(\(A_0^{-1}\)) = \(\frac{1}{\text{det}(A_0)}\).
# Verification
# Add your solution here
Home Activity
Is \(Q\) an orthogonal matrix? Yes, no, or need more information?
# Create Q
theta = 0.2
Q = np.array([(np.cos(theta), -np.sin(theta)),(np.sin(theta),np.cos(theta))])
print("Q = ")
print(Q)
# Verification
# Add your solution here
4.7.8. Solving Linear Systems#
Consider the linear system \(A_1 x = b_1\)
A1 = np.array([(4,3),(6,3)])
b1 = np.array([1,0])
print("A1 = \n",A1)
print("\nb1 = \n",b1)
A1 =
[[4 3]
[6 3]]
b1 =
[1 0]
4.7.8.1. Explicit Inverse#
Calculate \(x\) by explicitly using \(A_l^{-1}\). Hint: Use linalg.inv( ).
# Calculate inverse
# Add your solution here
# Now calculate and print x
# Add your solution here
4.7.8.2. LU Decomposition#
# Create test linear systems for the following examples
A = np.random.rand(4,4)
print("A = \n",A,"\n")
b = np.random.rand(4,1)
print("b = \n",b,"\n")
A =
[[0.19532048 0.80213917 0.0896026 0.7891849 ]
[0.51242877 0.60202691 0.04460208 0.99673738]
[0.66205227 0.74912751 0.64881305 0.5885614 ]
[0.65721088 0.25512709 0.60784324 0.00341296]]
b =
[[0.81894293]
[0.61659727]
[0.35223602]
[0.82622819]]
Home Activity
Follow along with the LU decomposition on \(A\). What structures do the \(P\), \(L\), and \(U\) matrices have?
# Perform LU decomposition
(P, L, U) = linalg.lu(A)
# Permutation matrix
print("P = \n",P)
# Lower diagonal matrix
print("L = \n",L)
# Upper diagonal matrix
print("U = \n",U)
# Verify result
print("P*L*U = \n",P.dot(L.dot(U)),"\n")
P =
[[0. 1. 0. 0.]
[0. 0. 1. 0.]
[1. 0. 0. 0.]
[0. 0. 0. 1.]]
L =
[[ 1. 0. 0. 0. ]
[ 0.29502275 1. 0. 0. ]
[ 0.77400047 0.03820467 1. 0. ]
[ 0.99268729 -0.84064268 0.26849349 1. ]]
U =
[[ 0.66205227 0.74912751 0.64881305 0.5885614 ]
[ 0. 0.58112951 -0.10181202 0.6155459 ]
[ 0. 0. -0.45368983 0.51767385]
[ 0. 0. 0. -0.20238236]]
P*L*U =
[[0.19532048 0.80213917 0.0896026 0.7891849 ]
[0.51242877 0.60202691 0.04460208 0.99673738]
[0.66205227 0.74912751 0.64881305 0.5885614 ]
[0.65721088 0.25512709 0.60784324 0.00341296]]
4.7.8.3. Is P orthogonal?#
print("inv(P) = \n",linalg.inv(P),"\n")
print("transpose(P) = \n",P.T,"\n")
print("P.T*P = \n",np.matmul(P.T,P),"\n")
inv(P) =
[[-0. 0. 1. -0.]
[ 1. 0. 0. -0.]
[ 0. 1. 0. -0.]
[ 0. 0. 0. 1.]]
transpose(P) =
[[0. 0. 1. 0.]
[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 0. 1.]]
P.T*P =
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
Yes! We see that \(P^T = P^{-1}\) and \(P^T \cdot P = I\).
4.7.8.4. MATLAB#
Defines LU decomposition as follows:
Consider
Using MATLAB LU definition
Substitute into linear system:
Let \(y = U \cdot x\) and substitute.
Step 1. Solve \(L \cdot y = P \cdot b\) for \(y\)
Step 2. Solve \(U \cdot x = y\) for \(x\)
# Calculate Pb
Pb = P.dot(b)
print("P*b = \n",Pb,"\n")
# Step 1
yLU = linalg.solve(L,Pb)
print("yLU = \n",yLU,"\n")
# Step 2
xLU = linalg.solve(U,yLU)
print("xLU = \n",xLU,"\n")
P*b =
[[0.61659727]
[0.35223602]
[0.81894293]
[0.82622819]]
yLU =
[[0.61659727]
[0.1703258 ]
[0.33518912]
[0.26732696]]
xLU =
[[ 2.83715187]
[ 1.29873196]
[-2.24599425]
[-1.32090047]]
4.7.8.5. SciPy#
Defines LU decomposition as:
Consider
SciPy LU definition
Substitute into linear system:
Let \(y = U \cdot x\) and substitute.
Step 1. Solve \(L \cdot y = P^T \cdot b\) for \(y\)
Step 2. Solve \(U \cdot x = y\) for \(x\)
## LU decomposition algorithm.
Pb = (P.T).dot(b)
print("P.T*b = \n",Pb,"\n")
# Step 1
yLU = linalg.solve(L,Pb)
print("yLU = \n",yLU,"\n")
# Step 2
xLU = linalg.solve(U,yLU)
print("xLU = \n",xLU,"\n")
P.T*b =
[[0.35223602]
[0.81894293]
[0.61659727]
[0.82622819]]
yLU =
[[0.35223602]
[0.71502529]
[0.31664912]
[0.99263052]]
xLU =
[[ 5.03790151]
[ 5.32285245]
[-6.29438561]
[-4.90472838]]
What is the difference? A transpose.
4.7.8.6. Verify our answer with linalg.solve
#
Solve the linear system using linalg.solve
x = linalg.solve(A,b)
print("x = \n",x)
x =
[[ 5.03790151]
[ 5.32285245]
[-6.29438561]
[-4.90472838]]
4.7.9. Invertable Matrix Theorem#
More Information:
Refernce: Linear Algebra and Its Applications by David C. Lay, 3rd edition
Hold on to your hats folks!
Let \(A\) be a square \(n \times n\) identity matrix. Then the following statements are equivalent. That is, for a given \(A\), the statements are either all true or all false.
a. \(A\) is an invertible matrix.
b. \(A\) is row equivalent to the \(n \times n\) identity matrix.
c. \(A\) has \(n\) pivot positions.
d. The equation \(Ax = 0\) has only the trivial solution.
e. The columns of \(A\) form a linearly independent set.
f. The linear transformation \(x\longmapsto Ax\) is one-to-one.
g. The equation \(Ax = b\) has at least one solution for each \(b\) in \(\mathbb{R}^n\).
h. The columns of \(A\) span \(\mathbb{R}^n\).
i. The linear transformation \(x\longmapsto Ax\) maps \(\mathbb{R}^n\) onto \(\mathbb{R}^n\).
j. There is an \(n \times n\) matrix \(C\) such that \(CA = I\).
k. There is an \(n \times n\) matrix \(D\) such that \(AD = I\).
l. \(A^T\) is an invertible matrix.
Let \(A\) be an \(n \times n\) matrix. Then the following statements are each equivalent to the statement that \(A\) is an invertible matrix.
m. The columns of \(A\) form a basis of \(\mathbb{R}^n\).
n. Col\(A\) = \(\mathbb{R}^n\)
o. dim Col\(A\) = \(N\)
p. rank \(A\) = \(N\)
q. Nul\(A\) = \(\{0\}\)
r. dim Nul\(A\) = \(0\)
Let \(A\) be an \(n \times n\) matrix. Then \(A\) is invertible if and only if:
s. The number 0 is not an eigenvalue of \(A\).
t. The determinant of \(A\) is not zero.
Let \(A\) be an \(n \times n\) matrix. Then the following statements are each equivalent to the statement that \(A\) is an invertible matrix.
u. (Col\(A\))\(^\perp\) = \(\{0\}\).
v. (Nul\(A\))\(^\perp\) = \(\mathbb{R}^n\).
w. Row\(A\) = \(\mathbb{R}^n\).
x. \(A\) has \(n\) nonzero singular values.
4.7.10. Eigenvectors and Eigenvalues#
4.7.10.1. Eigenvalues/Eigenvectors Definition#
For a square matrix \(A \in \mathbb{R}^{n\times n}\), the eigenvalue \(\lambda\) and the corresponding eigenvector \(v \neq 0\) satisfy \(Av = \lambda v\).
Because \((A-\lambda I)v = 0\), \((A-\lambda I)\) is singular, and we can define a characteristic \(n\)th degree polynomial given by det\((A-\lambda I)\) = \(0\) to find the \(n\) eigenvalues.
det\((A)\) = \(\prod_{i=1}^n\lambda_i\).
Eigenvalues of symmetric matrices are always real. Moreover, by collecting eigenvectors and eigenvalues, a symmetric matrix \(A \in \mathbb{R}^{n \times n}\) can be represented by \(A = V\wedge V^T\), where \(V\) is the orthogonal matrix of eigenvectors, \(\mathbf{V = [v_1,v_2,...,v_n]}\), and \(\wedge\) is a diagonal matrix of eigenvalues with \( \mathbf{\{\wedge\}_{ii} = \lambda_i}\).
Nonsymmetric matrices can have complex eigenvalues. Also, eigenvalues are not defined for nonsquare matrices. Instead, singular values \(\sigma_i \geq 0\), \(i = 1,...,m,\) are often useful to characterize \(A \in \mathbb{R}^{n \times m}\) with \(m \leq n\). These are derived from the eigenvalues of \((A^TA)\) with \(\sigma_i = [\lambda_i(A^TA)]^{\frac{1}{2}}\).
Consider the matrix \(A \in \mathbb{R}^{n \times m}\) with all real eigenvalues. Such matrices can be further classified as follows:
Matrices with \(\lambda_i > 0, i = 1,n,\) are said to be positive definite; i.e., \(y^TAy > 0\) for all \(y \neq 0\).
Matrices with \(\lambda_i < 0, i = 1,n,\) are said to be negative definite; i.e., \(y^TAy < 0\) for all \(y \neq 0\).
Matrices with \(\lambda_i \geq 0, i = 1,n,\) are said to be positive semidefinite; i.e., \(y^TAy \geq 0\) for all \(y \neq 0\).
Matrices with \(\lambda_i \leq 0, i = 1,n,\) are said to be negative semidefinite; i.e., \(y^TAy \leq 0\) for all \(y \neq 0\).
Matrices with both positive and negative eigenvalues are indefinite.
A matrix with \(\lambda_i = 0\), for some \(i\), is singular.
4.7.10.2. Practice Activities#
# Create matrix for activities
Ae = np.array([(0, -4, -6), (-1, 0, -3), (1, 2, 5)])
print("Ae = \n",Ae)
Ae =
[[ 0 -4 -6]
[-1 0 -3]
[ 1 2 5]]
Home Activity
Do you expect the above matrix to have negative eigenvalues?
Choose: Yes/No/Cannot tell from inspection.
Justification:
Class Activity
Calculate the eigenvalues and corresponding eigenvectors.
# Calculations
# Add your solution here
Class Activity
Based on your calculations above, is this matrix i) positive definite, ii) positive semi definite, iii) negative definite, iv) negative semi definite, v) indefinite, or vi) cannot say without additional calculations? Write a sentence to justify your answer.
Answer:
Class Activity
Calculate the determinate using only the eigenvalues.
# Determinate calculation
# Add your solution here
4.7.11. Singular Value Decomposition#
Refer to this notebook for notes on singular value decomposition.
Remember, SVD can be thought of as another type of matrix factorization. We can represent an \(m \times n\) matrix as \(A = U\cdot \sum \cdot V^T\).
Class Activity
Calculate the singular value decomposition of \(A_d\) and \(B_d\).
# Generate matrices
n = 2
m = 3
A_d = np.random.rand(n,m)
B_d = np.random.rand(m,n)
print("A_d = ")
print(A_d)
print("\nB_d = ")
print(B_d)
# Add your solution here
A_d =
[[0.4485645 0.94396271 0.90630531]
[0.65550619 0.17113477 0.79723747]]
B_d =
[[0.70939505 0.75872279]
[0.68693568 0.29736702]
[0.9430575 0.19822265]]
SVD for A_d...
U =
[[-0.81497106 -0.57950166]
[-0.57950166 0.81497106]]
S =
[1.65873224 0.50678934]
V =
[[-0.44939984 -0.52357768 -0.72381365]
[ 0.54120061 -0.80419622 0.24570374]
[-0.7107332 -0.28130917 0.64476622]]
SVD for B_d...
U =
[[-0.64065863 0.73692217 0.21564379]
[-0.48291242 -0.1683662 -0.85933021]
[-0.59695236 -0.65467438 0.46373413]]
S =
[1.54265526 0.43367475]
V =
[[-0.87457662 -0.48488734]
[-0.48488734 0.87457662]]
Class Activity
What is the rank of each matrix? (from inspecting singular values?)
Answer:
4.7.12. Vector and Matrix Norms#
Rerference: Section 2.2 of Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes, L. Biegler (2010)
4.7.12.1. Vector Norm Properties#
A vector norm \(\parallel \cdot \parallel\) is a measure of the length of a vector and it maps from \(\mathbb{R}^n\) to a nonegative real that has the following properties:
\(\parallel x\parallel = 0 \Rightarrow x=0\) for \(x \in \mathbb{R}^n\),
\(\parallel \alpha x\parallel = |\alpha | \parallel x\parallel\) for all \(x \in \mathbb{R}^n\), \(\alpha \in \mathbb{R}^n\),
\(\parallel x+y\parallel \; \leq \; \parallel x\parallel + \parallel y\parallel\) for all \(x,y \in \mathbb{R}^n\).
4.7.12.2. P-norm Properties#
The p-norm is defined by \(\parallel x\parallel _p = (\sum_{i=1}^n |x_i|^p)^{\frac{1}{p}}\), where \(p \geq 1\). Examples and properties include:
\(p=1, \parallel x\parallel_1 = \sum_{i=1}^n |x_i|\), the 1-norm,
\(p=2, \parallel x\parallel_2 = (\sum_{i=1}^n x_i^2)^{\frac{1}{2}}\), the Euclidean norm,
\(p=\infty , \parallel x\parallel_{\infty}\) = max\(_i |x_i|\), the max-norm,
\(|x^Ty| \leq \; \parallel x\parallel_p \parallel y\parallel_q\) for 1/\(p\) + 1/\(q\) = 1 and all \(x, y \in \mathbb{R}^n\).
Also, for \(x \in \mathbb{R}^n\) each norm can be bounded above and below by a multiple of any of the other norms.
Class Activity
Verify the vector norm properties hold for the p-2 norm using \(x\) and \(y\). Hint: linalg.norm( )
# Create vectors for following activites
x = np.random.rand(3,1)
print("x = \n",x)
y = np.random.rand(3,1)
print("y = \n",y)
x =
[[0.22147533]
[0.95260672]
[0.21153669]]
y =
[[0.75695275]
[0.33364788]
[0.28968871]]
# Verification of p-norm property #2
# Add your solution here
# Verification of p-norm property #4
# Add your solution here
4.7.12.3. Additional Matrix Norms#
For \(A \in \mathbb{R}^{n \times m}\), the induced matrix norms are defined by vector norms: \(\parallel A\parallel\) = max\(_{X \neq 0} \frac{\parallel Ax \parallel}{\parallel x \parallel}\) for any p-norm. It is therefore consistent with vector norms and satisfies \(\parallel A\parallel \; \parallel x\parallel \; \geq \; \parallel Ax\parallel\). Examples include:
\(\parallel A\parallel_1\) = max\(_j(\sum_{i=1}^n |\{A\}_{ij}|)\) (max column sum of \(A\))
\(\parallel A\parallel_{\infty}\) = max\(_i(\sum_{j=1}^m |\{A\}_{ij}|)\) (max row sum of \(A\))
\(\parallel A\parallel_2\) = largest singular value of \(A\).
Class Activity
Calculate \(||A_l||_1\), \(||A_l||_{\infty}\) and \(||A_l||_2\) using both the rules given above and SciPy
# Create A_1 matrix for following activities
n = 2
m = 3
A_1 = np.random.rand(n,m)
print(A_1)
[[0.5969 0.26938739 0.14392162]
[0.90718853 0.3676809 0.00275836]]
# Calculate ||A_l||_1
# Add your solution here
# Calculate ||A_l||_2
# Add your solution here
4.7.12.4. Frobenius Norm#
The Frobenius norm \(\parallel A\parallel_F = (\sum_i \sum_j ( \mathbf{\{A\}_{ij})^2)^{\frac{1}{2}}}\) is not consistent with a vector norm but has some useful properties. This norm as well as the induced norms satisfy \(\parallel AB\parallel \; \leq \parallel A\parallel \; \parallel B\parallel\) for any two compatible matrices \(A\) and \(B\). In addition, the induced and Frobenius matrix norms can be bounded above and below by a multiple of any of the other matrix norms.
4.7.13. Condition Number#
For more information on condition number, reference this notebook.
The condition number is defined by \(\kappa (A) = \parallel A\parallel \; \parallel A^{-1} \parallel\). Using the induced 2-norm the condition number is given by \(\frac{\sigma_{\text{max}}}{\sigma_{\text{min}}}\). If \(A\) is a symmetric matrix, then this is equivalent to using the eigenvalues of \(A\) with \(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\).
Refernce the SVD cell from above to answer the following two activities.
Class Activity
Estimate the condition number of \(A_d\) using the above rule.
# Condition number using rule
# Add your solution here
Class Activity
Calculate the condition number of \(A_d\) using np.linalg.cond( ).
# Condition number using numpy
# Add your solution here