
Applied Discrete Structures

Section 12.4 The Diagonalization Process

Subsection 12.4.1 Eigenvalues and Eigenvectors

We now have the background to understand the main ideas behind the diagonalization process.

Definition 12.4.1. Eigenvalue, Eigenvector.

Let $A$ be an $n\times n$ matrix over $\mathbb{R}$. The number $\lambda$ is an eigenvalue of $A$ if for some nonzero column vector $x\in\mathbb{R}^n$ we have $Ax=\lambda x$. The vector $x$ is called an eigenvector corresponding to the eigenvalue $\lambda$.

Example 12.4.2. Examples of eigenvalues and eigenvectors.

Find the eigenvalues and corresponding eigenvectors of the matrix $A=\begin{pmatrix}2&1\\2&3\end{pmatrix}$.
We want to find nonzero vectors $X=\begin{pmatrix}x_1\\x_2\end{pmatrix}$ and real numbers $\lambda$ such that
$$AX=\lambda X \tag{12.4.1}$$
This condition is equivalent to the chain of equations
$$\begin{gathered}
\begin{pmatrix}2&1\\2&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\lambda\begin{pmatrix}x_1\\x_2\end{pmatrix}\\
\begin{pmatrix}2&1\\2&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}-\lambda\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\\
\begin{pmatrix}2&1\\2&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}-\lambda\begin{pmatrix}1&0\\0&1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\\
\left(\begin{pmatrix}2&1\\2&3\end{pmatrix}-\lambda\begin{pmatrix}1&0\\0&1\end{pmatrix}\right)\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\\
\begin{pmatrix}2-\lambda&1\\2&3-\lambda\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}
\end{gathered}$$
The last matrix equation will have nonzero solutions if and only if
$$\det\begin{pmatrix}2-\lambda&1\\2&3-\lambda\end{pmatrix}=0$$
or $(2-\lambda)(3-\lambda)-2=0$, which simplifies to $\lambda^2-5\lambda+4=0$. Therefore, the solutions to this quadratic equation, $\lambda_1=1$ and $\lambda_2=4$, are the eigenvalues of $A$. We now have to find eigenvectors associated with each eigenvalue.
Case 1. For $\lambda_1=1$, (12.4.1) becomes:
$$\begin{pmatrix}2-1&1\\2&3-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}
\qquad\text{or}\qquad
\begin{pmatrix}1&1\\2&2\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$$
which reduces to the single equation $x_1+x_2=0$. From this, $x_2=-x_1$. This means the solution set of this equation is (in column notation)
$$E_1=\left\{\begin{pmatrix}c\\-c\end{pmatrix}\;\middle|\;c\in\mathbb{R}\right\}$$
So any column vector of the form $\begin{pmatrix}c\\-c\end{pmatrix}$, where $c$ is any nonzero real number, is an eigenvector associated with $\lambda_1=1$. The reader should verify that, for example,
$$\begin{pmatrix}2&1\\2&3\end{pmatrix}\begin{pmatrix}1\\-1\end{pmatrix}=1\begin{pmatrix}1\\-1\end{pmatrix}$$
so that $\begin{pmatrix}1\\-1\end{pmatrix}$ is an eigenvector associated with eigenvalue $1$.
Case 2. For $\lambda_2=4$, (12.4.1) becomes:
$$\begin{pmatrix}2-4&1\\2&3-4\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}
\qquad\text{or}\qquad
\begin{pmatrix}-2&1\\2&-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$$
which reduces to the single equation $-2x_1+x_2=0$, so that $x_2=2x_1$. The solution set of the equation is
$$E_2=\left\{\begin{pmatrix}c\\2c\end{pmatrix}\;\middle|\;c\in\mathbb{R}\right\}$$
Therefore, all eigenvectors of $A$ associated with the eigenvalue $\lambda_2=4$ are of the form $\begin{pmatrix}c\\2c\end{pmatrix}$, where $c$ can be any nonzero number.
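The arithmetic of this example is easy to check by machine. The following pure-Python sketch (not part of the text; the helper names are our own) finds the eigenvalues of $A$ as the roots of the characteristic polynomial $\lambda^2-(\operatorname{trace}A)\lambda+\det A$ and verifies $Ax=\lambda x$ for one eigenvector from each of $E_1$ and $E_2$:

```python
import math

# The matrix A of Example 12.4.2.
A = [[2, 1],
     [2, 3]]

def eigenvalues_2x2(M):
    """Roots of the characteristic polynomial
    lambda^2 - (trace M) lambda + det M, in increasing order."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(tr * tr - 4 * det)   # assumes real eigenvalues
    return [(tr - disc) / 2, (tr + disc) / 2]

def mat_vec(M, v):
    """Product of a 2x2 matrix and a column vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

lams = eigenvalues_2x2(A)     # the eigenvalues 1 and 4
v1 = [1, -1]                  # an eigenvector from E1
v2 = [1, 2]                   # an eigenvector from E2

print(lams)                   # [1.0, 4.0]
print(mat_vec(A, v1))         # equals 1 * v1
print(mat_vec(A, v2))         # equals 4 * v2
```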
The following theorems summarize the most important aspects of the previous example.
The equation $\det(A-\lambda I)=0$ is called the characteristic equation, and the left side of this equation is called the characteristic polynomial of $A$.
The solution space of $(A-\lambda I)x=0$ is called the eigenspace of $A$ corresponding to $\lambda$. This terminology is justified by Exercise 2 of this section.

Subsection 12.4.2 Diagonalization

We now consider the main aim of this section. Given an $n\times n$ (square) matrix $A$, we would like to transform $A$ into a diagonal matrix $D$, perform our tasks with the simpler matrix $D$, and then describe the results in terms of the given matrix $A$.

Definition 12.4.5. Diagonalizable Matrix.

An $n\times n$ matrix $A$ is called diagonalizable if there exists an invertible $n\times n$ matrix $P$ such that $P^{-1}AP$ is a diagonal matrix $D$. The matrix $P$ is said to diagonalize the matrix $A$.

Example 12.4.6. Diagonalization of a Matrix.

We will now diagonalize the matrix $A$ of Example 12.4.2. We form the matrix $P$ as follows: Let $P^{(1)}$ be the first column of $P$. Choose for $P^{(1)}$ any eigenvector from $E_1$. We may as well choose a simple vector in $E_1$, so $P^{(1)}=\begin{pmatrix}1\\-1\end{pmatrix}$ is our candidate.
Similarly, let $P^{(2)}$ be the second column of $P$, and choose for $P^{(2)}$ any eigenvector from $E_2$. The vector $P^{(2)}=\begin{pmatrix}1\\2\end{pmatrix}$ is a reasonable choice, thus
$$P=\begin{pmatrix}1&1\\-1&2\end{pmatrix}\quad\text{and}\quad P^{-1}=\frac{1}{3}\begin{pmatrix}2&-1\\1&1\end{pmatrix}=\begin{pmatrix}\frac{2}{3}&-\frac{1}{3}\\\frac{1}{3}&\frac{1}{3}\end{pmatrix}$$
so that
$$P^{-1}AP=\frac{1}{3}\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}2&1\\2&3\end{pmatrix}\begin{pmatrix}1&1\\-1&2\end{pmatrix}=\begin{pmatrix}1&0\\0&4\end{pmatrix}$$
Notice that the elements on the main diagonal of $D$ are the eigenvalues of $A$, where $D_{ii}$ is the eigenvalue corresponding to the eigenvector $P^{(i)}$.
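We can likewise confirm the computation $P^{-1}AP=D$ of this example by machine. This sketch (our own illustration, not from the text) uses Python's `Fraction` class so the arithmetic is exact rather than floating-point:

```python
from fractions import Fraction as F

# A and P as in Examples 12.4.2 and 12.4.6.
A = [[2, 1], [2, 3]]
P = [[1, 1], [-1, 2]]   # columns are eigenvectors for lambda = 1 and lambda = 4

def inv_2x2(M):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    det = F(M[0][0]) * M[1][1] - F(M[0][1]) * M[1][0]
    return [[ M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det,  M[0][0] / det]]

def mat_mul(X, Y):
    """Matrix product with exact rational entries."""
    return [[sum(F(X[i][k]) * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

D = mat_mul(mat_mul(inv_2x2(P), A), P)
print(D)    # the diagonal matrix with entries 1 and 4
```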

Note 12.4.7.

  1. The first step in the diagonalization process is the determination of the eigenvalues. The ordering of the eigenvalues is purely arbitrary. If we designate $\lambda_1=4$ and $\lambda_2=1$, the columns of $P$ would be interchanged and $D$ would be $\begin{pmatrix}4&0\\0&1\end{pmatrix}$ (see Exercise 3b of this section). Nonetheless, the final outcome of the application to which we are applying the diagonalization process would be the same.
  2. If $A$ is an $n\times n$ matrix with distinct eigenvalues, then $P$ is also an $n\times n$ matrix whose columns $P^{(1)},P^{(2)},\ldots,P^{(n)}$ are $n$ linearly independent vectors.

Example 12.4.8. Diagonalization of a 3 by 3 matrix.

Diagonalize the matrix
$$A=\begin{pmatrix}1&-12&18\\0&-11&18\\0&-6&10\end{pmatrix}.$$
First, we find the eigenvalues of $A$.
$$\begin{aligned}
\det(A-\lambda I)&=\det\begin{pmatrix}1-\lambda&-12&18\\0&-11-\lambda&18\\0&-6&10-\lambda\end{pmatrix}\\
&=(1-\lambda)\det\begin{pmatrix}-11-\lambda&18\\-6&10-\lambda\end{pmatrix}\\
&=(1-\lambda)\left((-11-\lambda)(10-\lambda)+108\right)=(1-\lambda)\left(\lambda^2+\lambda-2\right)
\end{aligned}$$
Hence, the characteristic equation $\det(A-\lambda I)=0$ becomes
$$(1-\lambda)\left(\lambda^2+\lambda-2\right)=-(\lambda-1)^2(\lambda+2)=0$$
Therefore, our eigenvalues for $A$ are $\lambda_1=-2$ and $\lambda_2=1$. We note that we do not have three distinct eigenvalues, but we proceed as in the previous example.
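For a $3\times 3$ matrix the characteristic polynomial can also be assembled directly from the trace, the sum of the principal $2\times 2$ minors, and the determinant, since $\det(A-\lambda I)=-\left(\lambda^3-t_1\lambda^2+t_2\lambda-t_3\right)$ with those three quantities as coefficients. A short check of the eigenvalues found above (a sketch of ours; the helper names are not from the text):

```python
# The matrix A of Example 12.4.8.
A = [[1, -12, 18],
     [0, -11, 18],
     [0,  -6, 10]]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion on the first row."""
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_coeffs(M):
    """(t1, t2, t3) with det(M - x I) = -(x^3 - t1 x^2 + t2 x - t3)."""
    t1 = M[0][0] + M[1][1] + M[2][2]                 # trace
    t2 = (M[1][1] * M[2][2] - M[1][2] * M[2][1]      # sum of the
          + M[0][0] * M[2][2] - M[0][2] * M[2][0]    # principal
          + M[0][0] * M[1][1] - M[0][1] * M[1][0])   # 2x2 minors
    t3 = det3(M)                                     # determinant
    return t1, t2, t3

t1, t2, t3 = char_poly_coeffs(A)
p = lambda x: x**3 - t1 * x**2 + t2 * x - t3
print((t1, t2, t3))     # (0, -3, -2), so p(x) = x^3 - 3x + 2
print(p(-2), p(1))      # both 0: the eigenvalues are -2 and 1
```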
Case 1. For $\lambda_1=-2$, the equation $(A-\lambda I)x=0$ becomes
$$\begin{pmatrix}3&-12&18\\0&-9&18\\0&-6&12\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}$$
We can row reduce the matrix of coefficients to $\begin{pmatrix}1&0&-2\\0&1&-2\\0&0&0\end{pmatrix}$.
The matrix equation is then equivalent to the equations $x_1=2x_3$ and $x_2=2x_3$. Therefore, the solution set, or eigenspace, corresponding to $\lambda_1=-2$ consists of vectors of the form
$$\begin{pmatrix}2x_3\\2x_3\\x_3\end{pmatrix}=x_3\begin{pmatrix}2\\2\\1\end{pmatrix}$$
Therefore $\begin{pmatrix}2\\2\\1\end{pmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda_1=-2$, and can be used for our first column of $P$:
$$P=\begin{pmatrix}2&?&?\\2&?&?\\1&?&?\end{pmatrix}$$
Before we continue, we make the observation: $E_1$ is a subspace of $\mathbb{R}^3$ with basis $\{P^{(1)}\}$ and $\dim E_1=1$.
Case 2. If $\lambda_2=1$, then the equation $(A-\lambda I)x=0$ becomes
$$\begin{pmatrix}0&-12&18\\0&-12&18\\0&-6&9\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}$$
Without the aid of any computer technology, it should be clear that all three equations that correspond to this matrix equation are equivalent to $-2x_2+3x_3=0$, or $x_2=\frac{3}{2}x_3$. Notice that $x_1$ can take on any value, so any vector of the form
$$\begin{pmatrix}x_1\\\frac{3}{2}x_3\\x_3\end{pmatrix}=x_1\begin{pmatrix}1\\0\\0\end{pmatrix}+x_3\begin{pmatrix}0\\\frac{3}{2}\\1\end{pmatrix}$$
will solve the matrix equation.
We note that the solution set contains two independent variables, $x_1$ and $x_3$. Further, note that we cannot express the eigenspace $E_2$ as the set of multiples of a single vector as in Case 1. However, it can be written as
$$E_2=\left\{x_1\begin{pmatrix}1\\0\\0\end{pmatrix}+x_3\begin{pmatrix}0\\\frac{3}{2}\\1\end{pmatrix}\;\middle|\;x_1,x_3\in\mathbb{R}\right\}.$$
We can replace any vector in a basis with a nonzero multiple of that vector. Simply for aesthetic reasons, we will multiply the second vector that generates $E_2$ by $2$. Therefore, the eigenspace $E_2$ is a subspace of $\mathbb{R}^3$ with basis $\left\{\begin{pmatrix}1\\0\\0\end{pmatrix},\begin{pmatrix}0\\3\\2\end{pmatrix}\right\}$ and so $\dim E_2=2$.
What this means with respect to the diagonalization process is that $\lambda_2=1$ gives us both Column 2 and Column 3 of the diagonalizing matrix. The order is not important, so we have
$$P=\begin{pmatrix}2&1&0\\2&0&3\\1&0&2\end{pmatrix}$$
The reader can verify (see Exercise 5 of this section) that $P^{-1}=\begin{pmatrix}0&2&-3\\1&-4&6\\0&-1&2\end{pmatrix}$ and $P^{-1}AP=\begin{pmatrix}-2&0&0\\0&1&0\\0&0&1\end{pmatrix}$
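Since Example 12.4.8 states both $P$ and $P^{-1}$ explicitly, the whole computation can be verified with integer matrix multiplication alone. A brief sketch of ours (matrix entries as in the example):

```python
# A, P, and the stated inverse of P, all from Example 12.4.8.
A = [[1, -12, 18],
     [0, -11, 18],
     [0,  -6, 10]]
P = [[2, 1, 0],
     [2, 0, 3],
     [1, 0, 2]]
P_inv = [[0,  2, -3],
         [1, -4,  6],
         [0, -1,  2]]

def mat_mul(X, Y):
    """Ordinary matrix product for integer matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

identity = mat_mul(P_inv, P)           # confirms P_inv really is P^(-1)
D = mat_mul(mat_mul(P_inv, A), P)      # should be diag(-2, 1, 1)
print(identity)
print(D)
```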
In doing Example 12.4.8, the given $3\times 3$ matrix $A$ produced only two, not three, distinct eigenvalues, yet we were still able to diagonalize $A$. The reason we were able to do so was that we could find three linearly independent eigenvectors. Again, the main idea is to produce a matrix $P$ that does the diagonalizing. If $A$ is an $n\times n$ matrix, $P$ will be an $n\times n$ matrix, and its $n$ columns must be linearly independent eigenvectors. The main question in the study of diagonalizability is “When can it be done?” This is summarized in the following theorem.

Theorem 12.4.9.

An $n\times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.

Proof.

Outline of a proof: ($\Longrightarrow$) Assume that $A$ has $n$ linearly independent eigenvectors, $P^{(1)},P^{(2)},\ldots,P^{(n)}$, with corresponding eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_n$. We want to prove that $A$ is diagonalizable. Column $i$ of the $n\times n$ matrix $AP$ is $AP^{(i)}$ (see Exercise 7 of this section). Then, since $P^{(i)}$ is an eigenvector of $A$ associated with the eigenvalue $\lambda_i$, we have $AP^{(i)}=\lambda_iP^{(i)}$ for $i=1,2,\ldots,n$. But this means that $AP=PD$, where $D$ is the diagonal matrix with diagonal entries $\lambda_1,\lambda_2,\ldots,\lambda_n$. If we multiply both sides of the equation on the left by $P^{-1}$, we get the desired $P^{-1}AP=D$.
($\Longleftarrow$) The proof in this direction involves a concept that is not covered in this text (the rank of a matrix), so we refer the interested reader to virtually any linear algebra text for a proof.
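The identity $AP=PD$ at the heart of this proof can be observed concretely with the matrices of Example 12.4.6: column $i$ of $AP$ equals $\lambda_i$ times column $i$ of $P$, which is exactly column $i$ of $PD$. A small sketch of ours:

```python
# A and P as in Example 12.4.6; the columns of P are eigenvectors of A.
A = [[2, 1], [2, 3]]
P = [[1, 1], [-1, 2]]
lams = [1, 4]                # eigenvalue of the i-th column of P

def mat_mul(X, Y):
    """Ordinary matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def column(M, j):
    """The j-th column of M as a list."""
    return [row[j] for row in M]

AP = mat_mul(A, P)
# Column i of AP is A P^(i) = lambda_i P^(i), i.e. column i of P D
# when D = diag(lambda_1, lambda_2); hence AP = PD.
for i, lam in enumerate(lams):
    print(column(AP, i), [lam * x for x in column(P, i)])
```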
We now give an example of a matrix that is not diagonalizable.

Example 12.4.10. A Matrix that is Not Diagonalizable.

Let us attempt to diagonalize the matrix $A=\begin{pmatrix}1&0&0\\0&2&-1\\1&1&4\end{pmatrix}$
First, we determine the eigenvalues.
$$\begin{aligned}
\det(A-\lambda I)&=\det\begin{pmatrix}1-\lambda&0&0\\0&2-\lambda&-1\\1&1&4-\lambda\end{pmatrix}\\
&=(1-\lambda)\det\begin{pmatrix}2-\lambda&-1\\1&4-\lambda\end{pmatrix}\\
&=(1-\lambda)\left((2-\lambda)(4-\lambda)+1\right)\\
&=(1-\lambda)\left(\lambda^2-6\lambda+9\right)=(1-\lambda)(\lambda-3)^2
\end{aligned}$$
Therefore there are two eigenvalues, $\lambda_1=1$ and $\lambda_2=3$. Since $\lambda_1$ is a root of multiplicity one of the characteristic equation, its eigenspace has dimension $1$. Since $\lambda_2$ is a double root of the characteristic equation, the dimension of its eigenspace must be $2$ in order for $A$ to be diagonalizable.
Case 1. For $\lambda_1=1$, the equation $(A-\lambda I)x=0$ becomes
$$\begin{pmatrix}0&0&0\\0&1&-1\\1&1&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}$$
Row reduction of this system reveals one free variable and eigenspace
$$\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}-4x_3\\x_3\\x_3\end{pmatrix}=x_3\begin{pmatrix}-4\\1\\1\end{pmatrix}$$
Hence, $\left\{\begin{pmatrix}-4\\1\\1\end{pmatrix}\right\}$ is a basis for the eigenspace of $\lambda_1=1$.
Case 2. For $\lambda_2=3$, the equation $(A-\lambda I)x=0$ becomes
$$\begin{pmatrix}-2&0&0\\0&-1&-1\\1&1&1\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}$$
Once again there is only one free variable in the row reduction, and so the dimension of the eigenspace will be one:
$$\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}0\\-x_3\\x_3\end{pmatrix}=x_3\begin{pmatrix}0\\-1\\1\end{pmatrix}$$
Hence, $\left\{\begin{pmatrix}0\\-1\\1\end{pmatrix}\right\}$ is a basis for the eigenspace of $\lambda_2=3$. This means that $\lambda_2=3$ produces only one column for $P$. Since we began with only two eigenvalues, we had hoped that $\lambda_2=3$ would produce a vector space of dimension two, or, in matrix terms, two linearly independent columns for $P$. Since $A$ does not have three linearly independent eigenvectors, $A$ cannot be diagonalized.

Subsection 12.4.3 SageMath Note - Diagonalization

We demonstrate how diagonalization can be done in Sage. We start by defining the matrix to be diagonalized, and also declare D and P to be variables.
We have been working with “right eigenvectors,” since the $x$ in $Ax=\lambda x$ is a column vector. It’s not so common, but still desirable in some situations, to consider “left eigenvectors,” so SageMath allows either one. The right_eigenmatrix method returns a pair of matrices: the diagonal matrix, $D$, of eigenvalues, and the diagonalizing matrix, $P$, whose columns are eigenvectors corresponding to the eigenvalues of $D$.
We should note here that P is not unique because even if an eigenspace has dimension one, any nonzero vector in that space will serve as an eigenvector. For that reason, the P generated by Sage isn’t necessarily the same as the one computed by any other computer algebra system such as Mathematica. Here we verify the result for our Sage calculation. Recall that an asterisk is used for matrix multiplication in Sage.
Here is a second matrix to diagonalize.
Here we’ve already specified that the underlying system is the rational numbers. Since the eigenvalues are not rational, Sage will revert to approximate numbers by default. We’ll just pull out the matrix of eigenvectors this time and display rounded entries.
Finally, we examine how Sage reacts to the matrix from Example 12.4.10 that couldn’t be diagonalized. Notice that the last column is a zero column, indicating the absence of one needed eigenvector.

Exercises 12.4.4 Exercises

1.

  1. List three different eigenvectors of $A=\begin{pmatrix}2&1\\2&3\end{pmatrix}$, the matrix of Example 12.4.2, associated with each of the two eigenvalues $1$ and $4$. Verify your results.
  2. Choose one of the three eigenvectors corresponding to $1$ and one of the three eigenvectors corresponding to $4$, and show that the two chosen vectors are linearly independent.
Answer.
  1. Any nonzero multiple of $\begin{pmatrix}1\\-1\end{pmatrix}$ is an eigenvector associated with $\lambda=1$.
  2. Any nonzero multiple of $\begin{pmatrix}1\\2\end{pmatrix}$ is an eigenvector associated with $\lambda=4$.
  3. Let $x_1=\begin{pmatrix}a\\-a\end{pmatrix}$ and $x_2=\begin{pmatrix}b\\2b\end{pmatrix}$. You can verify that $c_1x_1+c_2x_2=\begin{pmatrix}0\\0\end{pmatrix}$ if and only if $c_1=c_2=0$. Therefore, $\{x_1,x_2\}$ is linearly independent.

2.

  1. Verify that $E_1$ and $E_2$ in Example 12.4.2 are vector spaces over $\mathbb{R}$. Since they are also subsets of $\mathbb{R}^2$, they are called subvector-spaces, or subspaces for short, of $\mathbb{R}^2$. Since these are subspaces consisting of eigenvectors, they are called eigenspaces.
  2. Use the definition of dimension in the previous section to find $\dim E_1$ and $\dim E_2$. Note that $\dim E_1+\dim E_2=\dim\mathbb{R}^2$. This is not a coincidence.

3.

  1. Verify that $P^{-1}AP$ is indeed equal to $\begin{pmatrix}1&0\\0&4\end{pmatrix}$, as indicated in Example 12.4.6.
  2. Choose $P^{(1)}=\begin{pmatrix}1\\2\end{pmatrix}$ and $P^{(2)}=\begin{pmatrix}1\\-1\end{pmatrix}$ and verify that the new value of $P$ satisfies $P^{-1}AP=\begin{pmatrix}4&0\\0&1\end{pmatrix}$.
  3. Take two different (from the previous part) linearly independent eigenvectors of the matrix $A$ of Example 12.4.6 and verify that $P^{-1}AP$ is a diagonal matrix.
Answer.
Part c: You should obtain $\begin{pmatrix}4&0\\0&1\end{pmatrix}$ or $\begin{pmatrix}1&0\\0&4\end{pmatrix}$, depending on how you order the eigenvalues.

4.

  1. Let $A$ be the matrix in Example 12.4.8 and $P=\begin{pmatrix}0&1&0\\1&0&1\\1&0&2\end{pmatrix}$. Without doing any actual matrix multiplications, determine the value of $P^{-1}AP$.
  2. If you choose the columns of $P$ in the reverse order, what is $P^{-1}AP$?

5.

Diagonalize the following, if possible:
  1. $\begin{pmatrix}1&2\\3&2\end{pmatrix}$
  2. $\begin{pmatrix}-2&1\\-7&6\end{pmatrix}$
  3. $\begin{pmatrix}3&0\\0&4\end{pmatrix}$
  4. $\begin{pmatrix}1&-1&4\\3&2&-1\\2&1&-1\end{pmatrix}$
  5. $\begin{pmatrix}6&0&0\\0&7&-4\\9&1&3\end{pmatrix}$
  6. $\begin{pmatrix}1&-1&0\\-1&2&-1\\0&-1&1\end{pmatrix}$
Answer.
  1. If $P=\begin{pmatrix}2&1\\3&-1\end{pmatrix}$, then $P^{-1}AP=\begin{pmatrix}4&0\\0&-1\end{pmatrix}$.
  2. If $P=\begin{pmatrix}1&1\\7&1\end{pmatrix}$, then $P^{-1}AP=\begin{pmatrix}5&0\\0&-1\end{pmatrix}$.
  3. If $P=\begin{pmatrix}1&0\\0&1\end{pmatrix}$, then $P^{-1}AP=\begin{pmatrix}3&0\\0&4\end{pmatrix}$.
  4. If $P=\begin{pmatrix}-1&-1&1\\1&4&2\\1&1&1\end{pmatrix}$, then $P^{-1}AP=\begin{pmatrix}-2&0&0\\0&1&0\\0&0&3\end{pmatrix}$.
  5. $A$ is not diagonalizable. Five is a double root of the characteristic equation, but it has an eigenspace of dimension only $1$.
  6. If $P=\begin{pmatrix}1&-1&1\\-2&0&1\\1&1&1\end{pmatrix}$, then $P^{-1}AP=\begin{pmatrix}3&0&0\\0&1&0\\0&0&0\end{pmatrix}$.

6.

Diagonalize the following, if possible:
  1. $\begin{pmatrix}0&1\\1&1\end{pmatrix}$
  2. $\begin{pmatrix}2&1\\4&2\end{pmatrix}$
  3. $\begin{pmatrix}2&1\\1&0\end{pmatrix}$
  4. $\begin{pmatrix}1&3&6\\3&5&6\\3&3&6\end{pmatrix}$
  5. $\begin{pmatrix}1&1&0\\1&0&1\\0&1&1\end{pmatrix}$
  6. $\begin{pmatrix}2&1&0\\1&2&1\\0&1&2\end{pmatrix}$

7.

Let $A$ and $P$ be as in Example 12.4.8. Show that the columns of the matrix $AP$ can be found by computing $AP^{(1)},AP^{(2)},\ldots,AP^{(n)}$.
Answer.
This is a direct application of the definition of matrix multiplication. Let $A_{(i)}$ be the $i$th row of $A$, and let $P^{(j)}$ be the $j$th column of $P$. Then the $j$th column of the product $AP$ is
$$\begin{pmatrix}A_{(1)}P^{(j)}\\A_{(2)}P^{(j)}\\\vdots\\A_{(n)}P^{(j)}\end{pmatrix}$$
Hence, $(AP)^{(j)}=A\,P^{(j)}$ for $j=1,2,\ldots,n$. Thus, the $j$th column of $AP$ depends only on $A$ and the $j$th column of $P$.

8.

Prove that if $P$ is an $n\times n$ matrix and $D$ is a diagonal matrix with diagonal entries $d_1,d_2,\ldots,d_n$, then $PD$ is the matrix obtained from $P$ by multiplying column $i$ of $P$ by $d_i$, for $i=1,2,\ldots,n$.
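A numerical check of this claim, with one arbitrarily chosen $P$ and $D$, is easy to set up (this illustrates the statement; it is not the requested proof):

```python
# An arbitrary 2x2 matrix P and diagonal matrix D, chosen for the demo.
P = [[1, 1], [-1, 2]]
D = [[3, 0], [0, 5]]        # d1 = 3, d2 = 5

def mat_mul(X, Y):
    """Ordinary matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

PD = mat_mul(P, D)
# Scaling column j of P by d_j gives the same matrix as the product PD.
scaled = [[D[j][j] * P[i][j] for j in range(2)] for i in range(2)]
print(PD)
print(scaled)
print(PD == scaled)    # True
```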