
Worksheet 4.7: Singular Value Decomposition

For this worksheet, the reader is directed to Section 4.6. Further details may be found in Section 8.6 of Linear Algebra with Applications, by Keith Nicholson. (See also the notebook by Dr. Juan H Klopper.)
In Section 4.6 we saw that SymPy's singular_value_decomposition algorithm does things a little bit differently than the procedure described in that section. If we start with a square matrix A, the results are the same, but if A is not square, the decomposition A = PΣ_AQ^T looks a little different. In particular, if A is m×n, the matrix Σ_A defined in Section 4.6 will also be m×n, but it will contain some rows or columns of zeros, added to reach the required size.
The matrix Q is an orthogonal n×n matrix whose columns are an orthonormal basis of eigenvectors for A^TA. The matrix P is an orthogonal m×m matrix whose columns are an orthonormal basis of R^m. (The first r columns of P are obtained by normalizing the vectors Aq_i, where q_i is the eigenvector of A^TA corresponding to the positive singular value σ_i; since ‖Aq_i‖ = σ_i, this gives p_i = (1/σ_i)Aq_i.)
The algorithm provided by SymPy replaces Σ_A by the r×r diagonal matrix of nonzero singular values. (This is common in most algorithms, since we don't want to bother storing data we don't need.) The matrix Q is replaced by the n×r matrix whose columns are the first r eigenvectors of A^TA, and the matrix P is replaced by the m×r matrix whose columns are an orthonormal basis for the column space of A. (Note that the rank of A^TA is equal to the rank of A, which is equal to the number r of nonzero eigenvalues of A^TA, counted with multiplicity.)
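For comparison, here is a sketch of the condensed form. The order of the returned factors (P, then Σ_A, then Q) matches the decomposition A = PΣ_AQ^T described above, but it is worth confirming the shapes in your own session:

from sympy import Matrix

A = Matrix([[1, 1, 1], [1, 0, -1]])     # rank 2, so r = 2
P, S, Q = A.singular_value_decomposition()
# In the condensed form, P is m x r, S is r x r, and Q is n x r,
# so P*S*Q.T should reproduce the 2 x 3 matrix A.
P.shape, S.shape, Q.shape, P * S * Q.T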
The product PΣ_AQ^T is the same in both cases, and the condensed P consists of the first r columns of the full-size P.
This time, rather than using the SymPy algorithm, we will work through the process outlined in Section 4.6 step by step. Let's revisit Example 4.6.4. Let

A = [1  1   1]
    [1  0  -1].

First, we get the singular values:
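(The code cells below follow the complete listing given at the end of this worksheet.)

from sympy import Matrix, init_printing
init_printing()

A = Matrix([[1, 1, 1], [1, 0, -1]])
L0 = A.singular_values()   # the singular values, in decreasing order
L0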
Next, we get the eigenvalues and eigenvectors of A^TA:
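B = A.T * A            # the 3 x 3 symmetric matrix A^T A
L1 = B.eigenvects()    # a list of (eigenvalue, multiplicity, [eigenvectors]) tuples
L1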
Now we need to normalize the eigenvectors, in the correct order. Note that the eigenvectors were listed in increasing order of eigenvalue, so we need to reverse the order. L1 is a list of tuples of the form (eigenvalue, multiplicity, [eigenvectors]); each eigenvector is the third entry (index 2) of its tuple, and it also needs to be converted from a list into a Matrix. So, for example, the second eigenvector is Matrix(L1[1][2]).
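# eigenvects() lists eigenvalues in increasing order, so L1[2] holds the
# largest one; the eigenvector list sits at index 2 of each tuple.
R1 = Matrix(L1[2][2])
R2 = Matrix(L1[1][2])
R3 = Matrix(L1[0][2])
Q1 = (1/R1.norm())*R1   # normalize each eigenvector
Q2 = (1/R2.norm())*R2
Q3 = (1/R3.norm())*R3
Q1, Q2, Q3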
Next, we can assemble these vectors into a matrix, and confirm that it's orthogonal.
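from sympy import BlockMatrix

Q = Matrix(BlockMatrix([Q1, Q2, Q3]))   # columns q_1, q_2, q_3
Q, Q.T*Q    # Q.T*Q should be the 3 x 3 identity, confirming Q is orthogonal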
We've made the matrix Q! Next, we construct Σ_A. This we will do by hand. (Can you think of a way to do it automatically?)
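One way to do it, using the singular values already stored in L0:

# Sigma_A must have the same size as A (2 x 3): the nonzero singular
# values go on the diagonal, padded with a column of zeros.
SigA = Matrix([[L0[0], 0, 0], [0, L0[1], 0]])
SigA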
Alternatively, you could do SigA = diag(L0[0],L0[1]).row_join(Matrix([0,0])). Finally, we need to make the matrix P. First, we find the vectors Aq_1, Aq_2 and normalize. (Note that Aq_3 = 0, so this vector is unneeded, as expected.)
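S1 = A*Q1
S2 = A*Q2
P1 = (1/S1.norm())*S1   # note that S1.norm() is exactly sigma_1
P2 = (1/S2.norm())*S2
P = Matrix(BlockMatrix([P1, P2]))
P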
Note that the matrix P is already the correct size, because rank(A) = 2 = dim(R^2). In general, for an m×n matrix A, if rank(A) = r < m, we would have to extend the set {p_1, ..., p_r} to a basis for R^m. Finally, we check that our matrices work as advertised.
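P*SigA*Q.T   # should reproduce the original matrix A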
For convenience, here is all of the above code, with all print commands (except the last one) removed.
from sympy import Matrix, BlockMatrix, diag, init_printing
init_printing()

A = Matrix([[1, 1, 1], [1, 0, -1]])
B = A.T*A
L0 = A.singular_values()     # singular values, in decreasing order
L1 = B.eigenvects()          # eigenvalues listed in increasing order
R1 = Matrix(L1[2][2])        # eigenvector for the largest eigenvalue
R2 = Matrix(L1[1][2])
R3 = Matrix(L1[0][2])
Q1 = (1/R1.norm())*R1        # normalize each eigenvector
Q2 = (1/R2.norm())*R2
Q3 = (1/R3.norm())*R3
Q = Matrix(BlockMatrix([Q1, Q2, Q3]))
SigA = diag(L0[0], L0[1]).row_join(Matrix([0, 0]))
S1 = A*Q1
S2 = A*Q2
P1 = (1/S1.norm())*S1
P2 = (1/S2.norm())*S2
P = Matrix(BlockMatrix([P1, P2]))
P, SigA, Q, P*SigA*Q.T

1.

Compute the SVD for the matrices
[2  -1]   [ 2  -1]   [1   1   0]
[1   1]   [-1   3]   [0   1   2]
[0  -2]   [ 1  -1]   [1   0  -2]
Note that for these matrices, you may need to do some additional work to extend the p_i vectors to an orthonormal basis. You can adapt the code above, but you will have to think about how to implement additional code to construct any extra vectors you need.
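One possible approach is sketched below; it is only an illustration (the vectors p1 and p2 are placeholders for whatever your own computation produces), using SymPy's GramSchmidt to orthonormalize an extended spanning set:

from sympy import Matrix, GramSchmidt, eye

# Placeholders standing in for the p_i computed from one of the matrices above.
p1 = Matrix([1, 2, 0])
p2 = Matrix([0, 1, 1])

vecs = [p1, p2]
for j in range(3):
    e = eye(3).col(j)                            # candidate standard basis vector
    if Matrix.hstack(*vecs, e).rank() > len(vecs):
        vecs.append(e)                           # keep e only if it adds a new direction

# Gram-Schmidt turns the extended set into an orthonormal basis of R^3.
basis = GramSchmidt(vecs, orthonormal=True)
basis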

2.

By making some very minor changes in the matrices in Worksheet Exercise 4.7.1, convince yourself that (a) those matrices were chosen very carefully, and (b) there's a reason why most people do SVD numerically.

3.

Recall from Worksheet 3.5 that for an inconsistent system Ax=b, we wish to find a vector y so that Ax=y is consistent, with y as close to b as possible.
In other words, we want to minimize ‖Ax - b‖, or equivalently, ‖Ax - b‖^2.

(a)

Let A = PΣ_AQ^T be the singular value decomposition of A. Show that

‖Ax - b‖ = ‖Σ_A y - z‖,

where y = Q^T x and z = P^T b.

(b)

Show that setting

    y_i = z_i/σ_i  if σ_i ≠ 0
    y_i = 0        if σ_i = 0

minimizes the value of ‖Σ_A y - z‖.

(c)

Recall that we set

    Σ_A = [D_A  0]
          [ 0   0],

where D_A is the diagonal matrix of nonzero singular values. Let us define the pseudo-inverse of Σ_A to be the matrix

    Σ_A^+ = [D_A^(-1)  0]
            [   0      0].
Show that the solution to the least squares problem is given by x = A^+ b, where A^+ = QΣ_A^+ P^T.
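As a quick sanity check (not required for the exercise), SymPy's built-in pinv() can be compared against this formula, using the matrix from the worked example above; the right-hand side b below is an arbitrary illustration:

from sympy import Matrix, sqrt

A = Matrix([[1, 1, 1], [1, 0, -1]])   # the matrix from the worked example
b = Matrix([1, 2])                    # an arbitrary right-hand side, for illustration

# Sigma_A is 2 x 3 with singular values sqrt(3) and sqrt(2), so its
# pseudo-inverse is the 3 x 2 matrix with 1/sigma_i on the diagonal.
SigAplus = Matrix([[1/sqrt(3), 0], [0, 1/sqrt(2)], [0, 0]])

# With P and Q as computed earlier, Q*SigAplus*P.T should agree with pinv():
Aplus = A.pinv()
x = Aplus * b    # least-squares solution; here exact, since rank(A) = 2 = dim(R^2)
A * x            # reproduces b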