Section 4.4 Diagonalization of complex matrices
Recall that when we first defined vector spaces, we mentioned that a vector space can be defined over any field $\mathbb{F}$. To keep things simple, we've mostly assumed $\mathbb{F} = \mathbb{R}$. But most of the theorems and proofs we've encountered go through unchanged if we work over a general field. (This is not quite true: over a finite field, things can get more complicated. For example, if $\mathbb{F} = \mathbb{Z}_2 = \{0, 1\}$, then we get weird results like $\vec{v} + \vec{v} = \vec{0}$, since $1 + 1 = 0$.)
In fact, if we replace $\mathbb{R}$ by $\mathbb{C}$, about the only thing we'd have to go back and change is the definition of the dot product. The reason for this is that although the complex numbers seem computationally more complicated (which might mostly be because you don't use them often enough), they follow the exact same algebraic rules as the real numbers. In other words, the arithmetic might be different, but the algebra is the same. There is one key difference between the two fields: over the complex numbers, every polynomial can be factored completely into linear factors. This is important if you're interested in finding eigenvalues.
This section is written based on the assumption that complex numbers were covered in a previous course. If this was not the case, or to review this material, see Appendix A before proceeding.
Subsection 4.4.1 Complex vectors
A complex vector space is simply a vector space where the scalars are elements of $\mathbb{C}$ rather than $\mathbb{R}$. Examples include polynomials with complex coefficients, complex-valued functions, and $\mathbb{C}^n$, which is defined exactly how you think it should be. In fact, one way to obtain $\mathbb{C}^n$ is to start with the exact same standard basis we use for $\mathbb{R}^n$, and then take linear combinations using complex scalars.
The standard inner product on $\mathbb{C}^n$ looks a lot like the dot product on $\mathbb{R}^n$, with one important difference: we apply a complex conjugate to the second vector. For vectors $\vec{z} = (z_1, \ldots, z_n)$ and $\vec{w} = (w_1, \ldots, w_n)$, we define
$\langle \vec{z}, \vec{w}\rangle = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n.$
If $\vec{z}, \vec{w}$ are real, this is just the usual dot product. The reason for using the complex conjugate is to ensure that we still have a positive-definite inner product on $\mathbb{C}^n$: notice that $\langle \vec{z}, \vec{z}\rangle = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 \geq 0$, with equality only when $\vec{z} = \vec{0}$.
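For example, in $\mathbb{C}^2$,
$\langle (1+i, 2), (i, 1-i)\rangle = (1+i)\overline{i} + 2\overline{(1-i)} = (1+i)(-i) + 2(1+i) = (1-i) + (2+2i) = 3+i.$
Note that the result need not be real in general, but $\langle \vec{z}, \vec{z}\rangle$ always is.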
Exercise 4.4.2.
This isn't hard to do by hand, but it's useful to know how to ask the computer to do it, too. Unfortunately, the dot product in SymPy does not include the complex conjugate. One likely reason for this is that while most mathematicians take the complex conjugate of the second vector, some mathematicians, and most physicists, put the conjugate on the first vector. So the developers may have decided to remain agnostic about this choice. We can manually apply the conjugate, using Z.dot(W.H). (The .H operation is the hermitian conjugate; see Definition 4.4.6 below.)
from sympy import Matrix, init_printing, I  # I is the complex unit
init_printing()
Z = Matrix(3,1,[2-I,3*I,4+2*I])
W = Matrix(3,1,[3*I,4-5*I,-2+2*I])
Z, W, Z.dot(W.H)
Again, you might want to wrap that last term in simplify() (in which case you'll get $-22 - 6i$ for the dot product). Above, we saw that the complex inner product is designed to be positive definite, like the real inner product. The remaining properties of the complex inner product are given as follows.

Theorem 4.4.3.

For any vectors $\vec{u}, \vec{z}, \vec{w}$ in $\mathbb{C}^n$ and any scalar $c \in \mathbb{C}$:
- $\langle \vec{z}+\vec{w}, \vec{u}\rangle = \langle \vec{z}, \vec{u}\rangle + \langle \vec{w}, \vec{u}\rangle$, and $\langle \vec{u}, \vec{z}+\vec{w}\rangle = \langle \vec{u}, \vec{z}\rangle + \langle \vec{u}, \vec{w}\rangle$
- $\langle c\vec{z}, \vec{w}\rangle = c\langle \vec{z}, \vec{w}\rangle$, and $\langle \vec{z}, c\vec{w}\rangle = \bar{c}\langle \vec{z}, \vec{w}\rangle$
- $\langle \vec{w}, \vec{z}\rangle = \overline{\langle \vec{z}, \vec{w}\rangle}$
- $\langle \vec{z}, \vec{z}\rangle \geq 0$, and $\langle \vec{z}, \vec{z}\rangle = 0$ if and only if $\vec{z} = \vec{0}$
Proof.
- Using the distributive properties of matrix multiplication and the transpose,
$\langle \vec{z}+\vec{w}, \vec{u}\rangle = (\vec{z}+\vec{w})^T\bar{\vec{u}} = (\vec{z}^T + \vec{w}^T)\bar{\vec{u}} = \vec{z}^T\bar{\vec{u}} + \vec{w}^T\bar{\vec{u}} = \langle \vec{z}, \vec{u}\rangle + \langle \vec{w}, \vec{u}\rangle.$
The proof is similar when addition is in the second component. (But not identical -- you'll need the fact that the complex conjugate is distributive, rather than the transpose.)
- These again follow from writing the inner product as a matrix product:
$\langle c\vec{z}, \vec{w}\rangle = (c\vec{z})^T\bar{\vec{w}} = c(\vec{z}^T\bar{\vec{w}}) = c\langle \vec{z}, \vec{w}\rangle$ and $\langle \vec{z}, c\vec{w}\rangle = \vec{z}^T\overline{(c\vec{w})} = \bar{c}(\vec{z}^T\bar{\vec{w}}) = \bar{c}\langle \vec{z}, \vec{w}\rangle.$
- Note that for any vectors $\vec{z}, \vec{w}$, the inner product $\langle \vec{z}, \vec{w}\rangle = \vec{z}^T\bar{\vec{w}}$ is a number, and therefore equal to its own transpose. Thus, we have
$\overline{\langle \vec{z}, \vec{w}\rangle} = \overline{\vec{z}^T\bar{\vec{w}}} = \bar{\vec{z}}^T\vec{w} = (\bar{\vec{z}}^T\vec{w})^T = \vec{w}^T\bar{\vec{z}} = \langle \vec{w}, \vec{z}\rangle.$
- This was already demonstrated above.
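We can also ask SymPy to confirm the conjugate symmetry property for the vectors defined above. (This is just a sanity check of one instance, of course, not a proof.)

from sympy import Matrix, I, conjugate, simplify
Z = Matrix(3,1,[2-I,3*I,4+2*I])
W = Matrix(3,1,[3*I,4-5*I,-2+2*I])
# <Z,W> minus the conjugate of <W,Z>: this should simplify to zero
simplify(Z.dot(W.H) - conjugate(W.dot(Z.H)))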
Definition 4.4.4.

The norm of a vector $\vec{z} = (z_1, z_2, \ldots, z_n)$ in $\mathbb{C}^n$ is given by
$\lVert \vec{z}\rVert = \sqrt{\langle \vec{z}, \vec{z}\rangle} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}.$
Exercise 4.4.5.

True or false: the norm of a complex vector is always a real number.

True. Since the norm is computed using the modulus, which is always real and non-negative, the norm will be a real number as well. If you ever get a complex number for your norm, you've probably forgotten the complex conjugate somewhere.
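In SymPy, norms are computed with the norm() method. For instance, for the vector $\vec{z} = (2-i, 3i, 4+2i)$ from the earlier cell:

from sympy import Matrix, I
Z = Matrix(3,1,[2-I,3*I,4+2*I])
# norm() uses the modulus of each entry, so the result is real:
# sqrt(|2-i|^2 + |3i|^2 + |4+2i|^2) = sqrt(5 + 9 + 20) = sqrt(34)
Z.norm()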
Subsection 4.4.2 Complex matrices
Linear transformations are defined in exactly the same way, and a complex matrix is simply a matrix whose entries are complex numbers. There are two important operations defined on complex matrices: the conjugate, and the conjugate transpose (also known as the hermitian transpose).
Definition 4.4.6.

Let $A$ be a complex matrix. The conjugate of $A$, denoted $\bar{A}$, is the matrix whose entries are the complex conjugates of the entries of $A$. The conjugate transpose (or hermitian transpose) of $A$ is the matrix
$A^H = \overline{A^T} = \bar{A}^T.$

Note that many textbooks use the notation $A^*$ for the conjugate transpose.

Definition 4.4.7.

An $n\times n$ complex matrix $A$ is hermitian if $A^H = A$, and unitary if $A^HA = I_n$ (that is, if $A^H = A^{-1}$).
Hermitian and unitary matrices (or more accurately, linear operators) are very important in quantum mechanics. Indeed, hermitian matrices represent “observable” quantities, in part because their eigenvalues are real, as we’ll soon see. For us, hermitian and unitary matrices can simply be viewed as the complex counterparts of symmetric and orthogonal matrices, respectively. In fact, a real symmetric matrix is hermitian, since the conjugate has no effect on it, and similarly, a real orthogonal matrix is technically unitary. As with orthogonal matrices, a unitary matrix can also be characterized by the property that its rows and columns both form orthonormal bases.
Exercise 4.4.8.

Show that the matrix $A = \begin{bmatrix} 4 & 1-i & -2+3i \\ 1+i & 5 & 7i \\ -2-3i & -7i & -4 \end{bmatrix}$ is hermitian, and that the matrix $B = \begin{bmatrix} \frac{1+i}{2} & \frac{\sqrt{2}}{2} \\ \frac{1-i}{2} & \frac{\sqrt{2}}{2}i \end{bmatrix}$ is unitary.
When using SymPy, the hermitian conjugate of a matrix A is computed using A.H. (There appears to also be an equivalent operation named Dagger coming from sympy.physics.quantum, but I've had more success with .H.) The complex unit is entered as I. So for the exercise above, we can do the following.
from sympy import Matrix, I
A = Matrix(3,3,[4,1-I,-2+3*I,1+I,5,7*I,-2-3*I,-7*I,-4])
A == A.H
The last line verifies that $A = A^H$, so $A$ is hermitian. We could also replace it with A, A.H to explicitly see the two matrices side by side. Now, let's confirm that $B$ is unitary.
from sympy import Matrix, sqrt, I
B = Matrix(2,2,[1/2+1/2*I, sqrt(2)/2,1/2-1/2*I,(sqrt(2)/2)*I])
B, B*B.H
Hmm... That doesn't look like the identity on the right. Maybe try replacing B*B.H with simplify(B*B.H). (You will want to add from sympy import simplify at the top of the cell.) Or you could try B.H, B**-1 to compare results. Actually, what's interesting is that in a Sage cell, B.H == B**-1 yields False, but B.H == simplify(B**-1) yields True!
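One likely culprit is that entries like 1/2+1/2*I are evaluated by Python as floating-point numbers before SymPy ever sees them. Here is a sketch of one possible fix, using SymPy's S to build exact rational entries instead:

from sympy import Matrix, sqrt, I, S, simplify
# S(1)/2 is the exact rational 1/2, not the float 0.5,
# so products like B*B.H now simplify cleanly to the identity
B = Matrix(2,2,[S(1)/2 + S(1)/2*I, sqrt(2)/2,
                S(1)/2 - S(1)/2*I, (sqrt(2)/2)*I])
simplify(B*B.H)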
As mentioned above, hermitian matrices are the complex analogue of symmetric matrices. Recall that a key property of a symmetric matrix is its symmetry with respect to the dot product: for a symmetric matrix $A$, we had $(A\vec{x})\cdot\vec{y} = \vec{x}\cdot(A\vec{y})$. Hermitian matrices exhibit the same behaviour with respect to the complex inner product.
Theorem 4.4.9.

An $n\times n$ complex matrix $A$ is hermitian if and only if $\langle A\vec{z}, \vec{w}\rangle = \langle \vec{z}, A\vec{w}\rangle$ for all $\vec{z}, \vec{w} \in \mathbb{C}^n$.
Proof.
Note that the property $A = A^H$ is equivalent to $A^T = \bar{A}$. This gives us
$\langle A\vec{z}, \vec{w}\rangle = (A\vec{z})^T\bar{\vec{w}} = \vec{z}^TA^T\bar{\vec{w}} = \vec{z}^T(\bar{A}\bar{\vec{w}}) = \vec{z}^T\overline{(A\vec{w})} = \langle \vec{z}, A\vec{w}\rangle.$
Conversely, suppose $\langle A\vec{z}, \vec{w}\rangle = \langle \vec{z}, A\vec{w}\rangle$ for all $\vec{z}, \vec{w} \in \mathbb{C}^n$, and let $\vec{e}_1, \ldots, \vec{e}_n$ denote the standard basis for $\mathbb{C}^n$. Then
$a_{ji} = \langle A\vec{e}_i, \vec{e}_j\rangle = \langle \vec{e}_i, A\vec{e}_j\rangle = \overline{a_{ij}},$
which shows that $A = A^H$.
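As a quick sanity check of one instance (not a proof), we can verify this property in SymPy for the hermitian matrix from Exercise 4.4.8 and the vectors defined earlier:

from sympy import Matrix, I, simplify
A = Matrix(3,3,[4,1-I,-2+3*I,1+I,5,7*I,-2-3*I,-7*I,-4])
Z = Matrix(3,1,[2-I,3*I,4+2*I])
W = Matrix(3,1,[3*I,4-5*I,-2+2*I])
# <Az, w> - <z, Aw> should simplify to zero, since A is hermitian
simplify((A*Z).dot(W.H) - Z.dot((A*W).H))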
Next, we've noted that one advantage of doing linear algebra over $\mathbb{C}$ is that every polynomial can be completely factored, including the characteristic polynomial. This means that we can always find eigenvalues for a matrix. When that matrix is hermitian, we get a surprising result.
Theorem 4.4.10.

Let $A$ be a hermitian matrix. Then:
- The eigenvalues of $A$ are real.
- Eigenvectors of $A$ corresponding to distinct eigenvalues are orthogonal.
Proof.
- Suppose $A\vec{z} = \lambda\vec{z}$ for some $\lambda \in \mathbb{C}$ and $\vec{z} \neq \vec{0}$. Then
$\lambda\langle \vec{z}, \vec{z}\rangle = \langle \lambda\vec{z}, \vec{z}\rangle = \langle A\vec{z}, \vec{z}\rangle = \langle \vec{z}, A\vec{z}\rangle = \langle \vec{z}, \lambda\vec{z}\rangle = \bar{\lambda}\langle \vec{z}, \vec{z}\rangle.$
Thus, $(\lambda - \bar{\lambda})\langle \vec{z}, \vec{z}\rangle = 0$, and since $\vec{z} \neq \vec{0}$, we must have $\lambda = \bar{\lambda}$, which means $\lambda \in \mathbb{R}$.
- Similarly, suppose $\lambda_1 \neq \lambda_2$ are eigenvalues of $A$, with corresponding eigenvectors $\vec{z}, \vec{w}$. Then
$\lambda_1\langle \vec{z}, \vec{w}\rangle = \langle A\vec{z}, \vec{w}\rangle = \langle \vec{z}, A\vec{w}\rangle = \bar{\lambda}_2\langle \vec{z}, \vec{w}\rangle.$
This gives us $(\lambda_1 - \bar{\lambda}_2)\langle \vec{z}, \vec{w}\rangle = 0$. And since we already know $\lambda_2$ must be real, and $\lambda_1 \neq \lambda_2$, we must have $\langle \vec{z}, \vec{w}\rangle = 0$.
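We can see the first part of this theorem in action in SymPy, using the hermitian matrix that appears in Exercise 4.4.12 below:

from sympy import Matrix, I
A = Matrix(2,2,[4,3-I,3+I,1])
# both eigenvalues come out real, as Theorem 4.4.10 predicts
A.eigenvals()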
In light of Theorem 4.4.10, we realize that diagonalization of hermitian matrices will follow the same script as for symmetric matrices. Indeed, the Gram-Schmidt Orthonormalization Algorithm applies equally well in $\mathbb{C}^n$, as long as we replace the dot product with the complex inner product. This suggests the following.
Theorem 4.4.11. Spectral Theorem.
If $A$ is an $n\times n$ hermitian matrix, then there exists an orthonormal basis of $\mathbb{C}^n$ consisting of eigenvectors of $A$. Moreover, the matrix $U$ whose columns consist of those eigenvectors is unitary, and the matrix $U^HAU$ is diagonal.
Exercise 4.4.12.
Confirm that the matrix $A = \begin{bmatrix} 4 & 3-i \\ 3+i & 1 \end{bmatrix}$ is hermitian. Then, find the eigenvalues of $A$, and a unitary matrix $U$ such that $U^HAU$ is diagonal.
To do the above exercise using SymPy, we first define $A$ and ask for the eigenvectors.
from sympy import Matrix, I
A = Matrix(2,2,[4,3-I,3+I,1])
A.eigenvects()
We can now manually determine the matrix $U$, as we did above, and input it:
from sympy import Matrix, sqrt, I
U = Matrix([[(3-I)/sqrt(35),(3-I)/sqrt(14)],
            [-5/sqrt(35),2/sqrt(14)]])
To confirm it's unitary, add the line U*U.H to the above, and confirm that you get the identity matrix as output. You might need to use simplify(U*U.H) if the result is not clear. Now, to confirm that $U^HAU$ really is diagonal, go back to the cell above, and enter (U.H)*A*U. Try it with and without simplify, just to remind yourself that adding the simplify command is often a good idea.

If you want to cut down on the manual labour involved, we can make use of some of the other tools SymPy provides. In the next cell, we're going to assign the output of A.eigenvects() to a list. The only trouble is that the output of the eigenvector command is a list of lists. Each list item is a list of the form (eigenvalue, multiplicity, [eigenvectors]).
L = A.eigenvects()
L
Try the above modifications, in sequence. First, replacing the second line by L[0] will give the first list item, which is another list. We want the third item in the list, so try (L[0])[2]. But note the extra set of brackets! There could (in theory) be more than one eigenvector, so this is a list with one item. To finally get the vector out, try ((L[0])[2])[0]. (There is probably a better way to do this; one tidier option is sketched below, after the full routine. Someone who is more fluent in Python is welcome to advise.)

Now that we know how to extract the eigenvectors, we can normalize them, and join them to make a matrix. The norm of a vector is simply v.norm(), and to join column vectors u1 and u2 to make a matrix, we can use the command u1.row_join(u2). We already defined the matrix A and list L above, but here is the whole routine in one cell, in case you didn't run all the cells above.
from sympy import Matrix, init_printing, simplify, I
init_printing()
A = Matrix(2,2,[4,3-I,3+I,1])
L = A.eigenvects()
v = ((L[0])[2])[0]   # eigenvector for the first eigenvalue
w = ((L[1])[2])[0]   # eigenvector for the second eigenvalue
u1 = (1/v.norm())*v  # normalize each eigenvector
u2 = (1/w.norm())*w
U = u1.row_join(u2)
u1, u2, U, simplify(U.H*A*U)
Believe me, you want the simplify command on that last matrix.
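As for that promised tidier option: here is a sketch of one possible approach, using a list comprehension to unpack the eigenvector data and Matrix.hstack to join the columns. (This is just one way to do it, not the only way.)

from sympy import Matrix, I, simplify
A = Matrix(2,2,[4,3-I,3+I,1])
# unpack each (eigenvalue, multiplicity, [eigenvectors]) triple,
# collecting every eigenvector into one flat list
vects = [v for (lam, mult, basis) in A.eigenvects() for v in basis]
# normalize each eigenvector and join the results into a single matrix
U = Matrix.hstack(*[(1/v.norm())*v for v in vects])
simplify(U.H*A*U)

SymPy also provides A.diagonalize(), which returns a diagonalizing matrix P and a diagonal matrix D; note, however, that the columns of P are not normalized, so P is generally not unitary.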
While Theorem 4.4.11 guarantees that any hermitian matrix can be “unitarily diagonalized”, there are also non-hermitian matrices for which this can be done as well. A classic example of this is the rotation matrix $R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$ This is a real matrix with complex eigenvalues $\cos\theta \pm i\sin\theta$, and while it is neither symmetric nor hermitian, it can be unitarily diagonalized. This should be contrasted with the real spectral theorem, where any matrix that can be orthogonally diagonalized is necessarily symmetric.
This suggests that perhaps hermitian matrices are not quite the correct class of matrix for which the spectral theorem should be stated. Indeed, it turns out there is a somewhat more general class of matrix: the normal matrices.
Definition 4.4.13.

An $n\times n$ complex matrix $A$ is normal if $A^HA = AA^H$.
Exercise 4.4.14.
- Select all matrices below that are normal.
- This matrix is hermitian, and we know that every hermitian matrix is normal.
- This matrix is not normal; this can be confirmed by direct computation, or by noting that it cannot be diagonalized.
- This matrix is unitary, and every unitary matrix is normal.
- This matrix is neither hermitian nor unitary, but it is normal, which can be verified by direct computation.
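By way of example, we can check in SymPy that the rotation matrix from the discussion above is normal, by direct computation:

from sympy import Matrix, symbols, cos, sin, simplify
t = symbols('theta', real=True)
R = Matrix(2,2,[cos(t), -sin(t), sin(t), cos(t)])
# R is normal precisely when R*R.H - R.H*R is the zero matrix
simplify(R*R.H - R.H*R)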
It turns out that a matrix $A$ is normal if and only if $A = UDU^H$ for some unitary matrix $U$ and diagonal matrix $D$. A further generalization is known as Schur's Theorem.
Theorem 4.4.15. Schur's Theorem.
For any $n\times n$ complex matrix $A$, there exists a unitary matrix $U$ such that $T = U^HAU$ is upper-triangular, and such that the diagonal entries of $T$ are the eigenvalues of $A$.
Using Schur’s Theorem, we can obtain a famous result, known as the Cayley-Hamilton Theorem, for the case of complex matrices. (It is true for real matrices as well, but we don’t yet have the tools to prove it.) The Cayley-Hamilton Theorem states that substituting any matrix into its characteristic polynomial results in the zero matrix. To understand this result, we should first explain how to define a polynomial of a matrix.
Given a polynomial $p(x) = a_0 + a_1x + \cdots + a_kx^k$ and a square matrix $A$, we define
$p(A) = a_0I + a_1A + \cdots + a_kA^k.$
(Note the presence of the identity matrix in the first term, since it does not make sense to add a scalar to a matrix.) Note further that since $(P^{-1}AP)^j = P^{-1}A^jP$ for any invertible matrix $P$ and positive integer $j$, we have $p(U^HAU) = U^Hp(A)U$ for any polynomial $p$ and unitary matrix $U$.
Theorem 4.4.16. Cayley-Hamilton Theorem.

Let $A$ be an $n\times n$ complex matrix, and let $c_A(x)$ denote the characteristic polynomial of $A$. Then $c_A(A) = 0$.
Proof.
By Theorem 4.4.15, there exists a unitary matrix $U$ such that $A = UTU^H$, where $T$ is upper-triangular, and has the eigenvalues of $A$ as diagonal entries. Since $c_A(A) = c_A(UTU^H) = Uc_A(T)U^H$, and $c_A(x) = c_T(x)$ (since $A$ and $T$ are similar), it suffices to show that $c_T(T) = 0$ when $T$ is upper-triangular. (If you like, we are showing that $c_T(T) = 0$, and deducing that $c_A(A) = Uc_T(T)U^H = 0$.) But if $T$ is upper-triangular, so is $xI - T$, and therefore, $c_T(x) = \det(xI - T)$ is just the product of the diagonal entries. That is,
$c_T(x) = (x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n),$
so
$c_T(T) = (T - \lambda_1I)(T - \lambda_2I)\cdots(T - \lambda_nI).$
Since the first column of $T$ is $\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \end{bmatrix}^T$, the first column of $T - \lambda_1I$ is identically zero. The second column of $T - \lambda_2I$ similarly has the form $\begin{bmatrix} t & 0 & \cdots & 0 \end{bmatrix}^T$ for some number $t$.
It follows that the first two columns of $(T - \lambda_1I)(T - \lambda_2I)$ are identically zero. Since only the first two entries in the third column of $T - \lambda_3I$ can be nonzero, we find that the first three columns of $(T - \lambda_1I)(T - \lambda_2I)(T - \lambda_3I)$ are zero, and so on.
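We can verify the Cayley-Hamilton Theorem in SymPy for the matrix from Exercise 4.4.12. (A small sketch: the charpoly method returns the characteristic polynomial, and we substitute the matrix by hand using its coefficients.)

from sympy import Matrix, symbols, zeros, simplify, I
x = symbols('x')
A = Matrix(2,2,[4,3-I,3+I,1])
p = A.charpoly(x)        # characteristic polynomial of A
coeffs = p.all_coeffs()  # coefficients, leading term first
# evaluate p(A), with the constant term a_0 becoming a_0*I
# (A**0 is the identity matrix)
n = len(coeffs) - 1
result = zeros(2,2)
for k, c in enumerate(coeffs):
    result += c * A**(n - k)
simplify(result)         # the zero matrix, as predicted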
Exercises 4.4.3
1.
Suppose $A$ is a matrix with real entries that has a complex eigenvalue $\lambda$, with corresponding eigenvector $\vec{z}$. Find another eigenvalue and eigenvector for $A$.
2.
Give an example of a $2\times 2$ matrix with no real eigenvalues.
3.
4.
5.
Let $A$ be the given matrix. Find formulas for the entries of $A^n$, where $n$ is a positive integer. (Your formulas should not contain complex numbers.)
6.
7.