4.4 LU Factorization

4.4 LU Factorization#

We learned that we can transform the linear equation with the square matrix \(A\) into a upper triangular matrix \(U\). In the lecture we learned that we can also transform the linear equation into an lower triangular matrix \(L\) and an upper triangular matrix \(U\) such that

\[ A = LU \]

This is called the LU factorization of \(A\). The LU factorization is useful for solving linear equations especially when the right hand side vector \(b\) is changed. In this case, we can solve the linear equations by solving \(Ly = b\) and \(Ux = y\). In the example above we would have to perform the whole calculation again. However, if we have the LU factorization of \(A\) we can solve the linear equations much faster.

In the lecture we learned that we can use outer products using the lower triangular matrix \(L\) divided into a part for each column. For a 3x3 matrix

\[\begin{split} A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{pmatrix}\end{split}\]

we divide the lower triangular matrix \(L\) into two parts:

\[\begin{split} L = L_1 * L_2 = \begin{pmatrix} 1 & 0 & 0 \\ a_{21}/a_{11} & 1 & 0 \\ a_{31}/a_{11} & 0 & 1 \\ \end{pmatrix} * \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & a_{32}/a_{22} & 1 \\ \end{pmatrix} \end{split}\]

Using this we can write:

\[ L_1 * L_2 * A = U \]

For a general square matrix \(A\) we can write:

\[ L_1 * L_2 * l_k * ... * L_{K} * A = U \]

Where \(L_k\) is the lower triangular matrix for the \(k\)-th column as is has only non-zero values on the diagonal and the k-th column. The \(k\)-th column of \(L_k\) is defined as:

\[ (L_k)_{ik} = \frac{a^{k-1}_{ik}}{a^{k-1}_{kk}} \]

Where \(a^{k-1}_{ik}\) is the \(i\)-th element of the \(k\)-th column of \(A^{k-1} = L_1 * L_2 * l_k * ... * L_{k-1} * A\).

We can now write a function that calculates the LU factorization of a square matrix \(A\). The function should take the matrix \(A\) as input and return the lower triangular matrix \(L\) and the upper triangular matrix \(U\).

import Pkg
Pkg.instantiate()

A = [1 2 3; 4 5 6; 7 8 10];
b = [1, 2, 3];

using LinearAlgebra
function lufact(A)
    n = size(A,1)
    L = diagm(ones(n))   # ones on main diagonal, zeros elsewhere
    U = zeros(n,n)
    Aₖ = float(copy(A))  # copy of A, converted to float

    # Reduction by outer products
    for k in 1:n-1
        U[k,:] = Aₖ[k,:]
        L[:,k] = Aₖ[:,k]/U[k,k]
        Aₖ -= L[:,k]*U[k,:]'
    end

    U[n,n] = Aₖ[n,n]
    return LowerTriangular(L),UpperTriangular(U)
end

lufact (generic function with 1 method)

Let’s test the function with the matrix \(A\) defined above. We should get the lower triangular matrix \(L\) and the upper triangular matrix \(U\).

L, U = lufact(A)

([1.0 0.0 0.0; 4.0 1.0 0.0; 7.0 2.0 1.0], [1.0 2.0 3.0; 0.0 -3.0 -6.0; 0.0 0.0 1.0])

Now to solve the linear equation we can use the LU factorization of \(A\) and solve the linear equations by solving \(Ly = b\) and \(Ux = y\). First we define a function for the forward substitution to solve \(Ly = b\). The function should take the lower triangular matrix \(L\) and the vector \(b\) as input and return the solution vector \(y\).

function forwardsub(L,b)
    n = size(L,1)
    y = zeros(n)
    y[1] = b[1]/L[1,1]
    for i in 2:n
        s = sum( L[i,j]*y[j] for j in 1:i-1 )
        y[i] = ( b[i] - s ) / L[i,i]
    end
    return y
end

forwardsub (generic function with 1 method)

Now let’s define a function for the back substitution to solve \(Ux = y\). The function should take the upper triangular matrix \(U\) and the vector \(y\) as input and return the solution vector \(x\).

function backsub(U,y)
    n = size(U,1)
    x = zeros(n)
    x[n] = y[n]/U[n,n]
    for i in n-1:-1:1
        s = sum( U[i,j]*x[j] for j in i+1:n )
        x[i] = ( y[i] - s ) / U[i,i]
    end
    return x
end

backsub (generic function with 1 method)

Finally we want to test our implementation using the following matrix \(A\) and vector \(b\):

A = [2 0 4 3; -4 5 -7 -10; 1 15 2 -4.5; -2 0 2 -13];
b = [4,9,9,4];

First we calculate the LU factorization of \(A\).

L,U = lufact(A)

([1.0 0.0 0.0 0.0; -2.0 1.0 0.0 0.0; 0.5 3.0 1.0 0.0; -1.0 0.0 -2.0 1.0], [2.0 0.0 4.0 3.0; 0.0 5.0 1.0 -4.0; 0.0 0.0 -3.0 6.0; 0.0 0.0 0.0 2.0])

Now we can solve the linear equation by solving \(Ly = b\) and \(Ux = y\).

z = forwardsub(L,b)
x = backsub(U,z)

4-element Vector{Float64}:
 192.66666666666666
 -15.533333333333335
 -65.33333333333333
 -40.0

And let’s check the error of the solution:

b - A*x

4-element Vector{Float64}:
  0.0
 -5.684341886080802e-14
  2.842170943040401e-14
  0.0

Nice! We actually found a way to solve linear equations in an efficient way. There is one caveat though. This method is not always stable and can fail for some matrices. For example if we swap the second and fourth row of \(A\) the resluting matrix \(A\) is not invertible and the LU factorization fails:

A[[2,4],:] = A[[4,2],:]  
L,U = lufact(A)
L

4×4 LowerTriangular{Float64, Matrix{Float64}}:
  1.0     ⋅      ⋅    ⋅ 
 -1.0  NaN       ⋅    ⋅ 
  0.5   Inf   NaN     ⋅ 
 -2.0   Inf   NaN    1.0

We can actually see why this fails if we take a look at the matrix \(A\). The problem is that element \(a_{22}\) is zero. In the second step we are supposed to divide by \(a_{22}\) which is zero.

4×4 Matrix{Float64}:
  2.0   0.0   4.0    3.0
 -2.0   0.0   2.0  -13.0
  1.0  15.0   2.0   -4.5
 -4.0   5.0  -7.0  -10.0

However, we know that this linear equation is actually solvable. Can we fix this?

Pivoting#

To avoid dividing by zero, we can change the order of the coulmns before we perform the elimination in each column. We will use the largest available element in the column we are working on as the pivot element - the element we divide by. This technique is known as column pivoting.

A = [2 0 4 3 ; -2 0 2 -13; 1 15 2 -4.5 ; -4 5 -7 -10]

4×4 Matrix{Float64}:
  2.0   0.0   4.0    3.0
 -2.0   0.0   2.0  -13.0
  1.0  15.0   2.0   -4.5
 -4.0   5.0  -7.0  -10.0

Using the same matirx as before the first step looks like this:

A_1 = float(copy(A))
p = fill(0,4)
i = argmax(abs.(A_1[:,1])) # find largest element 
p[1] = i # store index of largest element in p[1]
L,U = zeros(4,4),zeros(4,4)
U[1,:] = A_1[i,:] # copy row i of A to row 1 of U 
# perform elimination as before
L[:,1] = A_1[:,1]/U[1,1] 
A_2 = A_1 - L[:,1]*U[1,:]'

4×4 Matrix{Float64}:
0   2.5   0.5   -2.0
0  -2.5   5.5   -8.0
0  16.25  0.25  -7.0
0   0.0   0.0    0.0

The second step looks like this:

i = argmax(abs.(A_2[:,2])) 
p[2] = i 
U[2,:] = A_2[i,:]
L[:,2] = A_2[:,2]/U[2,2]
A_3 = A_2 - L[:,2]*U[2,:]'

4×4 Matrix{Float64}:
0  0.0  0.461538  -0.923077
0  0.0  5.53846   -9.07692
0  0.0  0.0        0.0
0  0.0  0.0        0.0

The third step looks like this:

i = argmax(abs.(A_3[:,3])) 
p[3] = i
U[3,:] = A_3[i,:]
L[:,3] = A_3[:,3]/U[3,3]
A_4 = A_3 - L[:,3]*U[3,:]'

4×4 Matrix{Float64}:
0  0.0  0.0  -0.166667
0  0.0  0.0   0.0
0  0.0  0.0   0.0
0  0.0  0.0   0.0

And the last step looks like this:

i = argmax(abs.(A_4[:,4])) 
p[4] = i
U[4,:] = A_4[i,:]
L[:,4] = A_4[:,4]/U[4,4];
L

4×4 Matrix{Float64}:
 -0.5    0.153846  0.0833333   1.0
  0.5   -0.153846  1.0        -0.0
 -0.25   1.0       0.0        -0.0
  1.0    0.0       0.0        -0.0

We can see that we can solve the linear equation by swapping the columns. However we can see that we also changed the order of the elements in the lower triangular matrix \(L\). We can fix this by keeping track of the order of the columns and swapping the elements in the lower triangular matrix \(L\) back accordingly:

L[p,:]

4×4 Matrix{Float64}:
  1.0    0.0       0.0        -0.0
 -0.25   1.0       0.0        -0.0
  0.5   -0.153846  1.0        -0.0
 -0.5    0.153846  0.0833333   1.0

We can now write a function that calculates the LU factorization of a square matrix \(A\) using column pivoting. The function should take the matrix \(A\) as input and return the lower triangular matrix \(L\), the upper triangular matrix \(U\) and a vector \(p\) containing the pivot elements.

function plufact(A)
    n = size(A,1)
    L = zeros(n,n)
    U = zeros(n,n)
    p = fill(0,n)
    Aₖ = float(copy(A))

    # Reduction by outer products
    for k in 1:n-1
        p[k] = argmax(abs.(Aₖ[:,k])) # find largest element in column k and store in p[k]
        U[k,:] = Aₖ[p[k],:]
        L[:,k] = Aₖ[:,k]/U[k,k]
        Aₖ -= L[:,k]*U[k,:]'
    end
    p[n] = argmax(abs.(Aₖ[:,n]))
    U[n,n] = Aₖ[p[n],n]
    L[:,n] = Aₖ[:,n]/U[n,n]
    return LowerTriangular(L[p,:]),U,p
end

plufact (generic function with 1 method)

L, U, p = plufact(A)

([1.0 0.0 0.0 0.0; -0.25 1.0 0.0 0.0; 0.5 -0.15384615384615385 1.0 0.0; -0.5 0.15384615384615385 0.08333333333333334 1.0], [-4.0 5.0 -7.0 -10.0; 0.0 16.25 0.25 -7.0; 0.0 0.0 5.538461538461538 -9.076923076923077; 0.0 0.0 0.0 -0.1666666666666664], [4, 3, 2, 1])

To solve for a given vector \(b\) we can now use the pivot LU factorization of \(A\) by also changing the order of the elements in the vector \(b\) accordingly:

b = rand(4)
z = forwardsub(L,b[p])
x = backsub(U,z)

4-element Vector{Float64}:
 18.063489021884052
 -1.4579225290873112
 -6.093923798139574
 -3.7338758276315955

b - A*x

4-element Vector{Float64}:
  2.1094237467877974e-15
 -1.2212453270876722e-15
  8.881784197001252e-16
 -3.1086244689504383e-15

4.4 LU Factorization

Contents

4.4 LU Factorization#

Pivoting#