FCLA Matrix Multiplication

Section MM Matrix Multiplication

We know how to add vectors and how to multiply them by scalars. Together, these operations give us the possibility of making linear combinations. Similarly, we know how to add matrices and how to multiply matrices by scalars. In this section we mix all these ideas together and produce an operation known as matrix multiplication. This will lead to some results that are both surprising and central. We begin with a definition of how to multiply a vector by a matrix.

Subsection MVP Matrix-Vector Product

We have repeatedly seen the importance of forming linear combinations of the columns of a matrix. As one example of this, the oft-used Theorem SLSLC said that every solution to a system of linear equations gives rise to a linear combination of the column vectors of the coefficient matrix that equals the vector of constants. This theorem, and others, motivate the following central definition.

Definition MVP. Matrix-Vector Product.

Suppose $A$ is an $m \times n$ matrix with columns $A_{1}, A_{2}, A_{3}, \dots, A_{n}$ and $u$ is a vector of size $n$. Then the matrix-vector product of $A$ with $u$ is the linear combination

A u = {[u]}_{1} A_{1} + {[u]}_{2} A_{2} + {[u]}_{3} A_{3} + \dots + {[u]}_{n} A_{n} .

So, the matrix-vector product is yet another version of “multiplication,” at least in the sense that we have yet again overloaded juxtaposition of two symbols as our notation. Remember your objects: an $m \times n$ matrix times a vector of size $n$ will create a vector of size $m$. So if $A$ is rectangular, then the size of the “output” vector is different from the size of the “input” vector. With all the linear combinations we have performed so far, this computation should now seem second nature.

Example MTV. A matrix times a vector.

Consider

\begin{aligned} A & = [\begin{array}{rrrrr} 1 & 4 & 2 & 3 & 4 \\ - 3 & 2 & 0 & 1 & - 2 \\ 1 & 6 & - 3 & - 1 & 5 \end{array}] & u & = [\begin{array}{r} 2 \\ 1 \\ - 2 \\ 3 \\ - 1 \end{array}] . \end{aligned}

Then

A u = 2 [\begin{matrix} 1 \\ - 3 \\ 1 \end{matrix}] + 1 [\begin{matrix} 4 \\ 2 \\ 6 \end{matrix}] + (- 2) [\begin{matrix} 2 \\ 0 \\ - 3 \end{matrix}] + 3 [\begin{matrix} 3 \\ 1 \\ - 1 \end{matrix}] + (- 1) [\begin{matrix} 4 \\ - 2 \\ 5 \end{matrix}] = [\begin{matrix} 7 \\ 1 \\ 6 \end{matrix}] .
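
The computation in Example MTV can be sketched in a few lines of plain Python. This is only an illustration of Definition MVP, assuming a matrix is stored as a list of rows; the helper name `mat_vec` is ours and is not part of Sage or of the text.

```python
def mat_vec(A, u):
    """Return A u as a linear combination of the columns of A (Definition MVP)."""
    m, n = len(A), len(A[0])
    if len(u) != n:
        raise ValueError("size of u must equal the number of columns of A")
    result = [0] * m
    for j in range(n):
        column_j = [A[i][j] for i in range(m)]
        # add the scalar multiple u[j] * (column j) to the running sum
        result = [result[i] + u[j] * column_j[i] for i in range(m)]
    return result

# matrix and vector from Example MTV
A = [[ 1, 4,  2,  3,  4],
     [-3, 2,  0,  1, -2],
     [ 1, 6, -3, -1,  5]]
u = [2, 1, -2, 3, -1]

Au = mat_vec(A, u)
print(Au)  # [7, 1, 6]
```

The input has size 5 and the output has size 3, matching the remark above about rectangular matrices.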

We can now represent systems of linear equations compactly with a matrix-vector product (Definition MVP) and column vector equality (Definition CVE). This finally yields a very popular alternative to our unconventional $LS (A, b)$ notation.

Theorem SLEMM. Systems of Linear Equations as Matrix Multiplication.

The set of solutions to the linear system $LS (A, b)$ equals the set of solutions for $x$ in the vector equation $A x = b$.

Proof.

This theorem says that two sets (of solutions) are equal. So we need to show that one set of solutions is a subset of the other, and vice versa (Definition SE). Let $A_{1}, A_{2}, A_{3}, \dots, A_{n}$ be the columns of $A$. Both of these set inclusions then follow from the following chain of equivalences (Proof Technique E).

\begin{aligned}
& x \text{ is a solution to } LS (A, b) \\
\iff {} & {[x]}_{1} A_{1} + {[x]}_{2} A_{2} + {[x]}_{3} A_{3} + \dots + {[x]}_{n} A_{n} = b & & \text{Theorem SLSLC} \\
\iff {} & x \text{ is a solution to } A x = b & & \text{Definition MVP}
\end{aligned}

Example MNSLE. Matrix notation for systems of linear equations.

Consider the system of linear equations from Example NSLE.

\begin{aligned} 2 x_{1} + 4 x_{2} - 3 x_{3} + 5 x_{4} + x_{5} & = 9 \\ 3 x_{1} + x_{2} + x_{4} - 3 x_{5} & = 0 \\ - 2 x_{1} + 7 x_{2} - 5 x_{3} + 2 x_{4} + 2 x_{5} & = - 3 \end{aligned}

This system has coefficient matrix and vector of constants

\begin{aligned} A & = [\begin{array}{rrrrr} 2 & 4 & - 3 & 5 & 1 \\ 3 & 1 & 0 & 1 & - 3 \\ - 2 & 7 & - 5 & 2 & 2 \end{array}] & b & = [\begin{array}{r} 9 \\ 0 \\ - 3 \end{array}] \end{aligned}

and so will be described compactly by the vector equation $A x = b$.
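
Theorem SLEMM turns "is this vector a solution?" into a single matrix-vector product followed by a comparison. Here is a minimal Python sketch using the system of Example MNSLE. The helper names are ours, and the candidate vector $(2, -4, -5, 1, 1)$ was found by hand for this illustration; it does not come from the text.

```python
def mat_vec(A, x):
    """Component i of A x is component i of the linear combination of columns."""
    m, n = len(A), len(A[0])
    return [sum(x[j] * A[i][j] for j in range(n)) for i in range(m)]

def is_solution(A, b, x):
    """x solves LS(A, b) exactly when A x = b (Theorem SLEMM)."""
    return len(x) == len(A[0]) and mat_vec(A, x) == b

# coefficient matrix and vector of constants from Example MNSLE
A = [[ 2, 4, -3, 5,  1],
     [ 3, 1,  0, 1, -3],
     [-2, 7, -5, 2,  2]]
b = [9, 0, -3]

candidate = [2, -4, -5, 1, 1]   # hand-picked for this sketch
zero = [0, 0, 0, 0, 0]          # cannot be a solution, since b is not the zero vector

print(is_solution(A, b, candidate))  # True
print(is_solution(A, b, zero))       # False
```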

The matrix-vector product is a very natural computation. We have motivated it by its connections with systems of equations, but here is another example.

Example MBC. Money’s best cities.

Every year Money magazine selects several cities in the United States as the “best” cities to live in, based on a wide array of statistics about each city. This is a small example of how the editors of Money might arrive at a single number that consolidates the statistics about a city. We will analyze Los Angeles, Chicago and New York City, based on four criteria: average high temperature in July (Fahrenheit), number of colleges and universities in a 30-mile radius, number of toxic waste sites in the Superfund environmental clean-up program and a personal crime index based on FBI statistics (average = 100, smaller is safer). It should be apparent how to generalize the example to a greater number of cities and a greater number of statistics.

We begin by building a table of statistics. The rows will be labeled with the cities, and the columns with statistical categories. These values are from Money’s website in early 2005.

City	Temp	Colleges	Superfund	Crime
Los Angeles	77	28	93	254
Chicago	84	38	85	363
New York	84	99	1	193

Conceivably these data might reside in a spreadsheet. Now we must combine the statistics for each city. We could accomplish this by weighting each category, scaling the values and summing them. The sizes of the weights would depend upon the numerical size of each statistic generally, but more importantly, they would reflect the editors’ opinions or beliefs about which statistics were most important to their readers. Is the crime index more important than the number of colleges and universities? Of course, there is no right answer to this question.

Suppose the editors finally decide on the following weights to employ: temperature, $0.23$; colleges, $0.46$; Superfund, $- 0.05$; crime, $- 0.20$. Notice how negative weights are used for undesirable statistics. Then, for example, the editors would compute for Los Angeles

(0.23) (77) + (0.46) (28) + (- 0.05) (93) + (- 0.20) (254) = - 24.86

This computation might remind you of an inner product, but we will produce the computations for all of the cities as a matrix-vector product. Write the table of raw statistics as a matrix

T = [\begin{matrix} 77 & 28 & 93 & 254 \\ 84 & 38 & 85 & 363 \\ 84 & 99 & 1 & 193 \end{matrix}]

and the weights as a vector

w = [\begin{matrix} 0.23 \\ 0.46 \\ - 0.05 \\ - 0.20 \end{matrix}]

then the matrix-vector product (Definition MVP) yields

T w = (0.23) [\begin{matrix} 77 \\ 84 \\ 84 \end{matrix}] + (0.46) [\begin{matrix} 28 \\ 38 \\ 99 \end{matrix}] + (- 0.05) [\begin{matrix} 93 \\ 85 \\ 1 \end{matrix}] + (- 0.20) [\begin{matrix} 254 \\ 363 \\ 193 \end{matrix}] = [\begin{matrix} - 24.86 \\ - 40.05 \\ 26.21 \end{matrix}] .

This vector contains a single number for each of the cities being studied, so the editors would rank New York best ($26.21$), Los Angeles next ($- 24.86$), and Chicago third ($- 40.05$). Of course, the mayor’s offices in Chicago and Los Angeles are free to counter with a different set of weights that cause their city to be ranked best. These alternative weights would be chosen to play to each city’s strengths and to minimize its problem areas.

Notice how the vector of weights, $w$, is naturally indexed by the four statistics used, while the matrix-vector product, $T w$, is naturally indexed by the three cities. If a spreadsheet were used to make these computations, a row of weights would be entered somewhere near the table of data and the formulas in the spreadsheet would effect a matrix-vector product. This example is meant to illustrate how “linear” computations (addition, multiplication) can be organized as a matrix-vector product.
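
The scoring in Example MBC is easy to reproduce outside a spreadsheet. The following Python sketch assumes the table is stored as a list of rows and uses exact fractions for the decimal weights so that no rounding creeps in; the variable and function names are ours.

```python
from fractions import Fraction

cities = ["Los Angeles", "Chicago", "New York"]

# rows indexed by city; columns: Temp, Colleges, Superfund, Crime
T = [[77, 28, 93, 254],
     [84, 38, 85, 363],
     [84, 99,  1, 193]]

# weights indexed by statistic, held exactly as fractions
w = [Fraction("0.23"), Fraction("0.46"), Fraction("-0.05"), Fraction("-0.20")]

def mat_vec(A, u):
    """Linear combination of the columns of A with scalars from u."""
    m, n = len(A), len(A[0])
    result = [Fraction(0)] * m
    for j in range(n):
        result = [result[i] + u[j] * A[i][j] for i in range(m)]
    return result

scores = mat_vec(T, w)   # one score per city
ranking = sorted(zip(cities, scores), key=lambda pair: pair[1], reverse=True)

for city, score in ranking:
    print(city, float(score))
```

Changing the entries of `w` and re-running is the programmatic version of a mayor's office proposing a different set of weights.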

Another example would be the matrix of numerical scores on examinations and exercises for students in a class. The rows would be indexed by students and the columns would be indexed by exams and assignments. The instructor could then assign weights to the different exams and assignments, and via a matrix-vector product, compute a single score for each student.

Later (much later) we will need the following theorem, which is really a technical lemma (see Proof Technique LC). Since we are in a position to prove it now, we will. But you can safely skip it for the moment, if you promise to come back later to study the proof when the theorem is employed. At that point you will also be able to understand the comments in the paragraph following the proof.

Theorem EMMVP. Equal Matrices and Matrix-Vector Products.

Suppose that $A$ and $B$ are $m \times n$ matrices such that $A x = B x$ for every $x \in \mathbb{C}^{n}$. Then $A = B$.

Proof.

We are assuming $A x = B x$ for all $x \in \mathbb{C}^{n}$, so we can employ this equality for any choice of the vector $x$. However, we will limit our use of this equality to the standard unit vectors, $e_{j}$, $1 \leq j \leq n$ (Definition SUV). For all $1 \leq j \leq n$, $1 \leq i \leq m$,

\begin{aligned}
{[A]}_{i j} & = 0 {[A]}_{i 1} + \dots + 0 {[A]}_{i, j - 1} + 1 {[A]}_{i j} + 0 {[A]}_{i, j + 1} + \dots + 0 {[A]}_{i n} & & \text{Theorem PCNA} \\
& = {[e_{j}]}_{1} {[A]}_{i 1} + {[e_{j}]}_{2} {[A]}_{i 2} + {[e_{j}]}_{3} {[A]}_{i 3} + \dots + {[e_{j}]}_{n} {[A]}_{i n} & & \text{Definition SUV} \\
& = {[A]}_{i 1} {[e_{j}]}_{1} + {[A]}_{i 2} {[e_{j}]}_{2} + {[A]}_{i 3} {[e_{j}]}_{3} + \dots + {[A]}_{i n} {[e_{j}]}_{n} & & \text{Property CMCN} \\
& = {[A e_{j}]}_{i} & & \text{Definition MVP} \\
& = {[B e_{j}]}_{i} & & \text{Hypothesis} \\
& = {[B]}_{i 1} {[e_{j}]}_{1} + {[B]}_{i 2} {[e_{j}]}_{2} + {[B]}_{i 3} {[e_{j}]}_{3} + \dots + {[B]}_{i n} {[e_{j}]}_{n} & & \text{Definition MVP} \\
& = {[e_{j}]}_{1} {[B]}_{i 1} + {[e_{j}]}_{2} {[B]}_{i 2} + {[e_{j}]}_{3} {[B]}_{i 3} + \dots + {[e_{j}]}_{n} {[B]}_{i n} & & \text{Property CMCN} \\
& = 0 {[B]}_{i 1} + \dots + 0 {[B]}_{i, j - 1} + 1 {[B]}_{i j} + 0 {[B]}_{i, j + 1} + \dots + 0 {[B]}_{i n} & & \text{Definition SUV} \\
& = {[B]}_{i j} & & \text{Theorem PCNA} .
\end{aligned}

So by Definition ME the matrices $A$ and $B$ are equal, as desired.

You might notice from studying the proof that the hypotheses of this theorem could be “weakened” (i.e. made less restrictive). We need only suppose the equality of the matrix-vector products for just the standard unit vectors (Definition SUV) or any other spanning set (Definition SSVS) of $\mathbb{C}^{n}$ (Exercise LISS.T40). However, in practice, when we apply this theorem the stronger hypothesis will be in effect so this version of the theorem will suffice for our purposes. (If we changed the statement of the theorem to have the less restrictive hypothesis, then we would call the theorem “stronger.”)
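
The key step of the proof is that $A e_{j}$ picks out column $j$ of $A$, so the products with the standard unit vectors already determine every entry. A short Python sketch of that idea follows, using the $3 \times 4$ matrix from the Sage cell below; the function names `standard_unit_vector` and `recover_matrix` are hypothetical helpers written for this illustration.

```python
def mat_vec(A, x):
    """Matrix-vector product as a linear combination of the columns of A."""
    m, n = len(A), len(A[0])
    return [sum(x[j] * A[i][j] for j in range(n)) for i in range(m)]

def standard_unit_vector(n, j):
    """e_j of size n, with j counted from 1 as in Definition SUV."""
    return [1 if k == j else 0 for k in range(1, n + 1)]

def recover_matrix(product, m, n):
    """Rebuild an m x n matrix knowing only the map x -> A x on e_1, ..., e_n."""
    columns = [product(standard_unit_vector(n, j)) for j in range(1, n + 1)]
    return [[columns[j][i] for j in range(n)] for i in range(m)]

A = [[1, -3,  4,  5],
     [2,  3, -2,  0],
     [5,  6,  8, -2]]

def black_box(x):
    # stands in for "some matrix whose products we can observe"
    return mat_vec(A, x)

print(mat_vec(A, standard_unit_vector(4, 2)))  # [-3, 3, 6], the second column
print(recover_matrix(black_box, 3, 4) == A)    # True
```

Two matrices that agree on every $e_{j}$ would be rebuilt identically by `recover_matrix`, which is the content of the weakened hypothesis discussed above.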

Sage MVP. Matrix-Vector Product.

A matrix-vector product is very natural in Sage, and we can check the result against a linear combination of the columns.

A = matrix(QQ, [[1, -3,  4,  5],
                [2,  3, -2,  0],
                [5,  6,  8, -2]])
v = vector(QQ, [2, -2, 1, 3])
A*v

sum([v[i]*A.column(i) for i in range(len(v))])

Notice that when a matrix is square, a vector of the correct size can be used in Sage in a product with a matrix by placing the vector on either side of the matrix. However, the two results are not the same, and we will not have occasion to place the vector on the left any time soon. So, despite the possibility, be certain to keep your vectors on the right side of a matrix in a product.

B = matrix(QQ, [[ 1, -3,  4,  5],
                [ 2,  3, -2,  0],
                [ 5,  6,  8, -2],
                [-4,  1,  1,  2]])
w = vector(QQ, [1, 2, -3, 2])
B*w

w*B

B*w == w*B

Since a matrix-vector product forms a linear combination of columns of a matrix, it is now very easy to check if a vector is a solution to a system of equations. This is basically the substance of Theorem SLEMM. Here we construct a system of equations and construct two solutions and one non-solution by applying Theorem PSPHS. Then we use a matrix-vector product to verify that the vectors are, or are not, solutions.

coeff = matrix(QQ, [[-1,  3, -1, -1,  0,  2],
                    [ 2, -6,  1, -2, -5, -8],
                    [ 1, -3,  2,  5,  4,  1],
                    [ 2, -6,  2,  2,  1, -3]])
const = vector(QQ, [13, -25, -17, -23])
solution1 = coeff.solve_right(const)
coeff*solution1

nsp = coeff.right_kernel(basis='pivot')
nsp

nspb = nsp.basis()
solution2 = solution1 + 5*nspb[0]+(-4)*nspb[1]+2*nspb[2]
coeff*solution2

nonnullspace = vector(QQ, [5, 0, 0, 0, 0, 0])
nonnullspace in nsp

nonsolution = solution1 + nonnullspace
coeff*nonsolution

We can now explain the difference between “left” and “right” variants of various Sage commands for linear algebra. Generally, the direction refers to where the vector is placed in a matrix-vector product. We place a vector on the right and understand this to mean a linear combination of the columns of the matrix. Placing a vector to the left of a matrix can be understood, in a manner totally consistent with our upcoming definition of matrix multiplication, as a linear combination of the rows of the matrix.

So the difference between A.solve_right(v) and A.solve_left(v) is that the former asks for a vector x such that A*x == v, while the latter asks for a vector x such that x*A == v. Given Sage’s preference for rows, a direction-neutral version of a command, if it exists, will be the “left” version. For example, there is a .right_kernel() matrix method, while the .left_kernel() and .kernel() methods are identical — the names are synonyms for the exact same routine.

So when you see a Sage command that comes in “left” and “right” variants, figure out just what part of the defined object involves a matrix-vector product and form an interpretation from that.

Subsection MM Matrix Multiplication

We now define how to multiply two matrices together. Stop for a minute and think about how you might define this new operation.

Many books would present this definition much earlier in the course. However, we have taken great care to delay it as long as possible and to present as many ideas as practical based mostly on the notion of linear combinations. Towards the conclusion of the course, or when you perhaps take a second course in linear algebra, you may be in a position to appreciate the reasons for this. For now, understand that matrix multiplication is a central definition and perhaps you will appreciate its importance more by having saved it for later.

Definition MM. Matrix Multiplication.

Suppose $A$ is an $m \times n$ matrix and $B_{1}, B_{2}, B_{3}, \dots, B_{p}$ are the columns of an $n \times p$ matrix $B$. Then the matrix product of $A$ with $B$ is the $m \times p$ matrix where column $i$ is the matrix-vector product $A B_{i}$. Symbolically,

A B = A [B_{1} | B_{2} | B_{3} | \dots | B_{p}] = [A B_{1} | A B_{2} | A B_{3} | \dots | A B_{p}] .

Example PTM. Product of two matrices.

Set

\begin{aligned} A & = [\begin{array}{rrrrr} 1 & 2 & - 1 & 4 & 6 \\ 0 & - 4 & 1 & 2 & 3 \\ - 5 & 1 & 2 & - 3 & 4 \end{array}] & B & = [\begin{array}{rrrr} 1 & 6 & 2 & 1 \\ - 1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & - 1 & 2 \\ 1 & - 2 & 3 & 0 \end{array}] . \end{aligned}

Then

A B = [A [\begin{matrix} 1 \\ - 1 \\ 1 \\ 6 \\ 1 \end{matrix}] | A [\begin{matrix} 6 \\ 4 \\ 1 \\ 4 \\ - 2 \end{matrix}] | A [\begin{matrix} 2 \\ 3 \\ 2 \\ - 1 \\ 3 \end{matrix}] | A [\begin{matrix} 1 \\ 2 \\ 3 \\ 2 \\ 0 \end{matrix}]] = [\begin{matrix} 28 & 17 & 20 & 10 \\ 20 & - 13 & - 3 & - 1 \\ - 18 & - 44 & 12 & - 3 \end{matrix}] .
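
Definition MM translates almost word for word into code: take each column of $B$, form the matrix-vector product with $A$, and use the results as the columns of $A B$. The Python sketch below does this for the matrices of Example PTM. It assumes matrices are lists of rows, and the helper names are ours.

```python
def mat_vec(A, u):
    """A u as a linear combination of the columns of A (Definition MVP)."""
    m, n = len(A), len(A[0])
    return [sum(u[j] * A[i][j] for j in range(n)) for i in range(m)]

def mat_mul(A, B):
    """Column j of A B is the matrix-vector product A B_j (Definition MM)."""
    m, n, p = len(A), len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("columns of A must match rows of B")
    product_columns = [mat_vec(A, [B[k][j] for k in range(n)]) for j in range(p)]
    # reassemble the list of columns into a list of rows
    return [[product_columns[j][i] for j in range(p)] for i in range(m)]

A = [[ 1,  2, -1,  4, 6],
     [ 0, -4,  1,  2, 3],
     [-5,  1,  2, -3, 4]]
B = [[ 1,  6,  2, 1],
     [-1,  4,  3, 2],
     [ 1,  1,  2, 3],
     [ 6,  4, -1, 2],
     [ 1, -2,  3, 0]]

AB = mat_mul(A, B)
for row in AB:
    print(row)
```

Calling `mat_mul(B, A)` raises the `ValueError`, since a $5 \times 4$ matrix cannot be multiplied on the right by a $3 \times 5$ matrix.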

Is this the definition of matrix multiplication you expected? Perhaps our previous operations for matrices caused you to think that we might multiply two matrices of the same size, entry-by-entry? Notice that our current definition uses matrices of different sizes (though the number of columns in the first must equal the number of rows in the second), and the result is of a third size. Notice too in the previous example that we cannot even consider the product $B A$, since the sizes of the two matrices in this order are not right.

But it gets weirder than that. Many of your old ideas about “multiplication” will not apply to matrix multiplication, but some still will. So make no assumptions, and do not do anything until you have a theorem that says you can. Even if the sizes are right, matrix multiplication is not commutative — order matters.

Example MMNC. Matrix multiplication is not commutative.

Set

\begin{aligned} A & = [\begin{array}{rr} 1 & 3 \\ - 1 & 2 \end{array}] & B & = [\begin{array}{rr} 4 & 0 \\ 5 & 1 \end{array}] . \end{aligned}

Then we have two square, $2 \times 2$ matrices, so Definition MM allows us to multiply them in either order. We find

\begin{aligned} A B & = [\begin{array}{rr} 19 & 3 \\ 6 & 2 \end{array}] & B A & = [\begin{array}{rr} 4 & 12 \\ 4 & 17 \end{array}] \end{aligned}

and $A B \neq B A$. Not even close. It should not be hard for you to construct other pairs of matrices that do not commute (try a couple of $3 \times 3$’s). Can you find a pair of non-identical matrices that do commute?
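
For readers who want to experiment with other pairs, here is a small Python sketch that recomputes both products of Example MMNC directly from Definition MM. The function `mat_mul` is our own helper, written for this illustration.

```python
def mat_mul(A, B):
    """Build A B one column at a time, each a linear combination of columns of A."""
    m, n, p = len(A), len(B), len(B[0])
    columns = []
    for j in range(p):
        col = [0] * m
        for k in range(n):
            # add B[k][j] times column k of A
            col = [col[i] + B[k][j] * A[i][k] for i in range(m)]
        columns.append(col)
    return [[columns[j][i] for j in range(p)] for i in range(m)]

A = [[ 1, 3],
     [-1, 2]]
B = [[4, 0],
     [5, 1]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)        # [[19, 3], [6, 2]]
print(BA)        # [[4, 12], [4, 17]]
print(AB == BA)  # False
```

Editing `A` and `B` is a quick way to hunt for the commuting pair the example asks about.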

Subsection MMEE Matrix Multiplication, Entry-by-Entry

While certain “natural” properties of multiplication do not hold, many more do. In the next subsection, we will state and prove the relevant theorems. But first, we need a theorem that provides an alternate means of multiplying two matrices. In many texts, this would be given as the definition of matrix multiplication. We prefer to turn it around and have the following formula as a consequence of our definition. It will prove useful for proofs of matrix equality, where we need to examine products of matrices, entry-by-entry.

Theorem EMP. Entries of Matrix Products.

Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. Then for $1 \leq i \leq m$, $1 \leq j \leq p$, the individual entries of $A B$ are given by

\begin{aligned}
{[A B]}_{i j} & = {[A]}_{i 1} {[B]}_{1 j} + {[A]}_{i 2} {[B]}_{2 j} + {[A]}_{i 3} {[B]}_{3 j} + \dots + {[A]}_{i n} {[B]}_{n j} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} .
\end{aligned}

Proof.

Let the vectors $A_{1}, A_{2}, A_{3}, \dots, A_{n}$ denote the columns of $A$ and let the vectors $B_{1}, B_{2}, B_{3}, \dots, B_{p}$ denote the columns of $B$. Then for $1 \leq i \leq m$, $1 \leq j \leq p$,

\begin{aligned}
{[A B]}_{i j} & = {[A B_{j}]}_{i} & & \text{Definition MM} \\
& = {[{[B_{j}]}_{1} A_{1} + {[B_{j}]}_{2} A_{2} + \dots + {[B_{j}]}_{n} A_{n}]}_{i} & & \text{Definition MVP} \\
& = {[{[B_{j}]}_{1} A_{1}]}_{i} + {[{[B_{j}]}_{2} A_{2}]}_{i} + \dots + {[{[B_{j}]}_{n} A_{n}]}_{i} & & \text{Definition CVA} \\
& = {[B_{j}]}_{1} {[A_{1}]}_{i} + {[B_{j}]}_{2} {[A_{2}]}_{i} + \dots + {[B_{j}]}_{n} {[A_{n}]}_{i} & & \text{Definition CVSM} \\
& = {[B]}_{1 j} {[A]}_{i 1} + {[B]}_{2 j} {[A]}_{i 2} + \dots + {[B]}_{n j} {[A]}_{i n} & & \text{Definition M} \\
& = {[A]}_{i 1} {[B]}_{1 j} + {[A]}_{i 2} {[B]}_{2 j} + \dots + {[A]}_{i n} {[B]}_{n j} & & \text{Property CMCN} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} .
\end{aligned}

Example PTMEE. Product of two matrices, entry-by-entry.

Consider again the two matrices from Example PTM.

\begin{aligned} A & = [\begin{array}{rrrrr} 1 & 2 & - 1 & 4 & 6 \\ 0 & - 4 & 1 & 2 & 3 \\ - 5 & 1 & 2 & - 3 & 4 \end{array}] & B & = [\begin{array}{rrrr} 1 & 6 & 2 & 1 \\ - 1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & - 1 & 2 \\ 1 & - 2 & 3 & 0 \end{array}] \end{aligned}

Then suppose we just wanted the entry of $A B$ in the second row, third column.

\begin{aligned} {[A B]}_{23} = & {[A]}_{21} {[B]}_{13} + {[A]}_{22} {[B]}_{23} + {[A]}_{23} {[B]}_{33} + {[A]}_{24} {[B]}_{43} + {[A]}_{25} {[B]}_{53} \\ = & (0) (2) + (- 4) (3) + (1) (2) + (2) (- 1) + (3) (3) = - 3 \end{aligned}

Notice how there are 5 terms in the sum, since 5 is the common dimension of the two matrices (column count for $A$, row count for $B$). In the conclusion of Theorem EMP, it would be the index $k$ that would run from 1 to 5 in this computation. Here is a bit more practice.

Now for the entry in the third row, first column.

\begin{aligned} {[A B]}_{31} = & {[A]}_{31} {[B]}_{11} + {[A]}_{32} {[B]}_{21} + {[A]}_{33} {[B]}_{31} + {[A]}_{34} {[B]}_{41} + {[A]}_{35} {[B]}_{51} \\ = & (- 5) (1) + (1) (- 1) + (2) (1) + (- 3) (6) + (4) (1) = - 18 \end{aligned}

To get some more practice on your own, complete the computation of the other 10 entries of this product. Construct some other pairs of matrices (of compatible sizes) and compute their product two ways. First use Definition MM. Since linear combinations are straightforward for you now, this should be easy to do and to do correctly. Then do it again, using Theorem EMP. Since this process may take some practice, use your first computation to check your work.

Theorem EMP is the way many people compute matrix products by hand. It will also be very useful for the theorems we are going to prove shortly. However, the definition (Definition MM) is frequently the most useful for its connections with deeper ideas like the null space and the upcoming column space.
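
The suggestion to compute a product two ways and compare can also be carried out in code. The Python sketch below implements the entry formula of Theorem EMP with indices counted from 1, as in the text, and checks the two entries worked out in Example PTMEE; the function names are ours.

```python
def entry(A, B, i, j):
    """[A B]_{ij} by Theorem EMP, with i and j counted from 1 as in the text."""
    n = len(B)  # common dimension: columns of A, rows of B
    return sum(A[i - 1][k] * B[k][j - 1] for k in range(n))

def mat_mul_entries(A, B):
    """Assemble the whole product one entry at a time."""
    m, p = len(A), len(B[0])
    return [[entry(A, B, i, j) for j in range(1, p + 1)] for i in range(1, m + 1)]

A = [[ 1,  2, -1,  4, 6],
     [ 0, -4,  1,  2, 3],
     [-5,  1,  2, -3, 4]]
B = [[ 1,  6,  2, 1],
     [-1,  4,  3, 2],
     [ 1,  1,  2, 3],
     [ 6,  4, -1, 2],
     [ 1, -2,  3, 0]]

print(entry(A, B, 2, 3))  # -3
print(entry(A, B, 3, 1))  # -18
print(mat_mul_entries(A, B))
```

The full matrix printed on the last line agrees with the column-by-column result of Example PTM, which is exactly what Theorem EMP promises.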

Sage MM. Matrix Multiplication.

Matrix multiplication is very natural in Sage, and is just as easy as multiplying two numbers. We illustrate Theorem EMP by using it to compute the entry in the first row and third column.

A = matrix(QQ, [[3, -1, 2,  5],
                [9,  1, 2, -4]])
B = matrix(QQ, [[1,  6, 1],
                [0, -1, 2],
                [5,  2, 3],
                [1,  1, 1]])
A*B

sum([A[0,k]*B[k,2] for k in range(A.ncols())])

Note in the final statement, we could replace A.ncols() by B.nrows() since these two quantities must be identical. You can experiment with the last statement by editing it to compute any of the five other entries of the matrix product.

Square matrices can be multiplied in either order, but the result will almost always be different. Execute repeatedly the following products of two random $4 \times 4$ matrices, with a check on the equality of the two products in either order. It is possible, but highly unlikely, that the two products will be equal. So if this compute cell ever produces True it will be a minor miracle.

A = random_matrix(QQ,4,4)
B = random_matrix(QQ,4,4)
A*B == B*A       # random, sort of

Subsection PMM Properties of Matrix Multiplication

In this subsection, we collect properties of matrix multiplication and its interaction with the zero matrix (Definition ZM), the identity matrix (Definition IM), matrix addition (Definition MA), scalar matrix multiplication (Definition MSM), the inner product (Definition IP), conjugation (Theorem MMCC), and the transpose (Definition TM). Whew! Here we go. These are great proofs to practice with, so try to concoct the proofs before reading them. They will get progressively more complicated as we go.

Theorem MMZM. Matrix Multiplication and the Zero Matrix.

Suppose $A$ is an $m \times n$ matrix. Then

1. $A O_{n \times p} = O_{m \times p}$.
2. $O_{p \times m} A = O_{p \times n}$.

Proof.

We will prove (1) and leave (2) to you. Entry-by-entry, for $1 \leq i \leq m$, $1 \leq j \leq p$,

\begin{aligned}
{[A O_{n \times p}]}_{i j} & = \sum_{k = 1}^{n} {[A]}_{i k} {[O_{n \times p}]}_{k j} & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} 0 & & \text{Definition ZM} \\
& = \sum_{k = 1}^{n} 0 \\
& = 0 & & \text{Property ZCN} \\
& = {[O_{m \times p}]}_{i j} & & \text{Definition ZM} .
\end{aligned}

So by the definition of matrix equality (Definition ME), the matrices $A O_{n \times p}$ and $O_{m \times p}$ are equal.

Theorem MMIM. Matrix Multiplication and Identity Matrix.

Suppose $A$ is an $m \times n$ matrix. Then

1. $A I_{n} = A$.
2. $I_{m} A = A$.

Proof.

Again, we will prove (1) and leave (2) to you. Entry-by-entry, for $1 \leq i \leq m$, $1 \leq j \leq n$,

\begin{aligned}
{[A I_{n}]}_{i j} & = \sum_{k = 1}^{n} {[A]}_{i k} {[I_{n}]}_{k j} & & \text{Theorem EMP} \\
& = {[A]}_{i j} {[I_{n}]}_{j j} + \sum_{k = 1, k \neq j}^{n} {[A]}_{i k} {[I_{n}]}_{k j} & & \text{Property CACN} \\
& = {[A]}_{i j} (1) + \sum_{k = 1, k \neq j}^{n} {[A]}_{i k} (0) & & \text{Definition IM} \\
& = {[A]}_{i j} + \sum_{k = 1, k \neq j}^{n} 0 \\
& = {[A]}_{i j} .
\end{aligned}

So the matrices $A$ and $A I_{n}$ are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

It is this theorem that gives the identity matrix its name. It is a matrix that behaves with matrix multiplication like the scalar $1$ does with scalar multiplication. To multiply by the identity matrix is to have no effect on the other matrix.
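
Theorem MMZM and Theorem MMIM are easy to spot-check numerically, and doing so is a good way to keep the sizes straight. The Python sketch below uses the $3 \times 5$ matrix $A$ of Example PTM with $p = 2$; the helpers `zero_matrix` and `identity` are ours. A check on one matrix is of course not a proof.

```python
def mat_mul(A, B):
    """Entry (i, j) is the sum over k of A[i][k] * B[k][j] (Theorem EMP)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def zero_matrix(m, n):
    return [[0] * n for _ in range(m)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# the 3 x 5 matrix A from Example PTM, so m = 3 and n = 5; we take p = 2
A = [[ 1,  2, -1,  4, 6],
     [ 0, -4,  1,  2, 3],
     [-5,  1,  2, -3, 4]]

checks = {
    "A O_{5x2} = O_{3x2}": mat_mul(A, zero_matrix(5, 2)) == zero_matrix(3, 2),
    "O_{2x3} A = O_{2x5}": mat_mul(zero_matrix(2, 3), A) == zero_matrix(2, 5),
    "A I_5 = A": mat_mul(A, identity(5)) == A,
    "I_3 A = A": mat_mul(identity(3), A) == A,
}
for statement, holds in checks.items():
    print(statement, holds)
```

Note that the identity matrix on the right has size 5 while the one on the left has size 3, as the theorem statement requires for a rectangular $A$.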

Theorem MMDAA. Matrix Multiplication Distributes Across Addition.

Suppose $A$ is an $m \times n$ matrix and $B$ and $C$ are $n \times p$ matrices and $D$ is a $p \times s$ matrix. Then

1. $A (B + C) = A B + A C$.
2. $(B + C) D = B D + C D$.

Proof.

We will do (1), you do (2). Entry-by-entry, for $1 \leq i \leq m$, $1 \leq j \leq p$,

\begin{aligned}
{[A (B + C)]}_{i j} & = \sum_{k = 1}^{n} {[A]}_{i k} {[B + C]}_{k j} & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} ({[B]}_{k j} + {[C]}_{k j}) & & \text{Definition MA} \\
& = \sum_{k = 1}^{n} ({[A]}_{i k} {[B]}_{k j} + {[A]}_{i k} {[C]}_{k j}) & & \text{Property DCN} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} + \sum_{k = 1}^{n} {[A]}_{i k} {[C]}_{k j} & & \text{Property CACN} \\
& = {[A B]}_{i j} + {[A C]}_{i j} & & \text{Theorem EMP} \\
& = {[A B + A C]}_{i j} & & \text{Definition MA}
\end{aligned}

So the matrices $A (B + C)$ and $A B + A C$ are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

Theorem MMSMM. Matrix Multiplication and Scalar Matrix Multiplication.

Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. Let $\alpha$ be a scalar. Then

\alpha (A B) = (\alpha A) B = A (\alpha B) .

Proof.

These are equalities of matrices. We will do the first one, the second is similar and will be good practice for you. For $1 \leq i \leq m$, $1 \leq j \leq p$,

\begin{aligned}
{[\alpha (A B)]}_{i j} & = \alpha {[A B]}_{i j} & & \text{Definition MSM} \\
& = \alpha \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{n} \alpha {[A]}_{i k} {[B]}_{k j} & & \text{Property DCN} \\
& = \sum_{k = 1}^{n} {[\alpha A]}_{i k} {[B]}_{k j} & & \text{Definition MSM} \\
& = {[(\alpha A) B]}_{i j} & & \text{Theorem EMP} .
\end{aligned}

So the matrices $\alpha (A B)$ and $(\alpha A) B$ are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

If you want to test your facility creating proofs about matrix multiplication using Theorem EMP, the next proof is a good one to attempt yourself. So give it a try before reading the proof.

Theorem MMA. Matrix Multiplication is Associative.

Suppose $A$ is an $m \times n$ matrix, $B$ is an $n \times p$ matrix and $D$ is a $p \times s$ matrix. Then

A (B D) = (A B) D .

Proof.

A matrix equality, so we will go entry-by-entry, no surprise there. For $1 \leq i \leq m$, $1 \leq j \leq s$,

\begin{aligned}
{[A (B D)]}_{i j} & = \sum_{k = 1}^{n} {[A]}_{i k} {[B D]}_{k j} & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{n} {[A]}_{i k} (\sum_{\ell = 1}^{p} {[B]}_{k \ell} {[D]}_{\ell j}) & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{n} \sum_{\ell = 1}^{p} {[A]}_{i k} {[B]}_{k \ell} {[D]}_{\ell j} & & \text{Property DCN}
\end{aligned}

We can switch the order of the summation since these are finite sums,

\begin{aligned}
& = \sum_{\ell = 1}^{p} \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k \ell} {[D]}_{\ell j} & & \text{Property CACN}
\end{aligned}

Since ${[D]}_{\ell j}$ does not depend on the index $k$, we can use distributivity to move it outside of the inner sum,

\begin{aligned}
& = \sum_{\ell = 1}^{p} {[D]}_{\ell j} (\sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k \ell}) & & \text{Property DCN} \\
& = \sum_{\ell = 1}^{p} {[D]}_{\ell j} {[A B]}_{i \ell} & & \text{Theorem EMP} \\
& = \sum_{\ell = 1}^{p} {[A B]}_{i \ell} {[D]}_{\ell j} & & \text{Property CMCN} \\
& = {[(A B) D]}_{i j} & & \text{Theorem EMP} .
\end{aligned}

So the matrices $(A B) D$ and $A (B D)$ are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

Since Theorem MMA says matrix multiplication is associative, it means we do not have to be careful about the order in which we perform matrix multiplication, nor how we parenthesize an expression with just several matrices multiplied together. So this is where we draw the line on explaining every last detail in a proof. We will frequently add, remove, or rearrange parentheses with no comment. Indeed, I only see about a dozen places where Theorem MMA is cited in a proof. You could try to count how many times we avoid making a reference to this theorem.
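
Before relying on Theorem MMDAA, Theorem MMSMM and Theorem MMA without comment, it can be reassuring to watch them hold on concrete matrices. The Python sketch below reuses $A$ and $B$ from Example PTM; the matrices `C` and `D` and the scalar `alpha` are arbitrary choices made for this illustration, and exact fractions keep the comparisons free of rounding. Passing these checks is a sanity test, not a substitute for the proofs.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Entries of the product via Theorem EMP."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(alpha, A):
    return [[alpha * a for a in row] for row in A]

A = [[1, 2, -1, 4, 6], [0, -4, 1, 2, 3], [-5, 1, 2, -3, 4]]                   # 3 x 5
B = [[1, 6, 2, 1], [-1, 4, 3, 2], [1, 1, 2, 3], [6, 4, -1, 2], [1, -2, 3, 0]]  # 5 x 4
C = [[0, 1, -2, 3], [4, 0, 1, -1], [2, 2, 0, 5], [-3, 1, 1, 0], [1, -4, 2, 2]] # 5 x 4, arbitrary
D = [[1, -1], [0, 2], [3, 1], [-2, 4]]                                         # 4 x 2, arbitrary
alpha = Fraction(-3, 2)                                                        # arbitrary scalar

left_distributive = mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C))
right_distributive = mat_mul(mat_add(B, C), D) == mat_add(mat_mul(B, D), mat_mul(C, D))
scalars_move = (scalar_mul(alpha, mat_mul(A, B))
                == mat_mul(scalar_mul(alpha, A), B)
                == mat_mul(A, scalar_mul(alpha, B)))
associative = mat_mul(A, mat_mul(B, D)) == mat_mul(mat_mul(A, B), D)

print(left_distributive, right_distributive, scalars_move, associative)
```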

It could be obvious already that properties of upper triangular matrices will have analogues for lower triangular matrices. Rather than stating two very similar theorems each time, we will economize and say that matrices are “triangular of the same type” as a convenient shorthand to cover both possibilities and then typically give a proof for just one type. Like the proof of Theorem MMA, the proof of the next theorem is another good one to test your facility with Theorem EMP as a tool for constructing proofs about matrix multiplication. Just by considering sizes we can see that the product of two square matrices of the same size is again a square matrix of that size. For triangular matrices it is even better.

Theorem PTMT. Product of Triangular Matrices is Triangular.

Suppose that $A$ and $B$ are matrices that are triangular of the same type. Then $A B$ is also triangular of that type.

Proof.

We prove this for lower triangular matrices and leave the proof for upper triangular matrices to you. Suppose that $A$ and $B$ are both lower triangular of size $n$. We need only establish that certain entries of the product $A B$ are zero. Suppose that $i < j$, then

\begin{aligned}
{[A B]}_{i j} & = \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} & & \text{Theorem EMP} \\
& = \sum_{k = 1}^{j - 1} {[A]}_{i k} {[B]}_{k j} + \sum_{k = j}^{n} {[A]}_{i k} {[B]}_{k j} & & \text{Property AACN} \\
& = \sum_{k = 1}^{j - 1} {[A]}_{i k} 0 + \sum_{k = j}^{n} {[A]}_{i k} {[B]}_{k j} & & k < j, \text{ Definition LTM} \\
& = \sum_{k = 1}^{j - 1} {[A]}_{i k} 0 + \sum_{k = j}^{n} 0 {[B]}_{k j} & & i < j \leq k, \text{ Definition LTM} \\
& = \sum_{k = 1}^{j - 1} 0 + \sum_{k = j}^{n} 0 \\
& = 0 .
\end{aligned}

Since ${[A B]}_{i j} = 0$ whenever $i < j$, by Definition LTM, $A B$ is lower triangular.
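
A quick numerical illustration of Theorem PTMT: the Python sketch below multiplies two lower triangular matrices and tests the defining condition ${[A B]}_{i j} = 0$ for $i < j$. The two $3 \times 3$ matrices are made up for this example, and the helper names are ours.

```python
def mat_mul(A, B):
    """Entries of the product via Theorem EMP."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_lower_triangular(M):
    """True when every entry above the diagonal (row index < column index) is zero."""
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(n) if i < j)

# two lower triangular matrices chosen for illustration
L1 = [[2,  0, 0],
      [1, -1, 0],
      [4,  3, 5]]
L2 = [[ 1, 0,  0],
      [-2, 3,  0],
      [ 0, 6, -1]]

product = mat_mul(L1, L2)
print(product)                       # [[2, 0, 0], [3, -3, 0], [-2, 39, -5]]
print(is_lower_triangular(product))  # True
```

Transposing both factors and reversing their order gives a corresponding check for the upper triangular case.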

The statement of our next theorem is technically inaccurate. If we upgrade the vectors $u, v$ to matrices with a single column, then the expression $\overline{u}^{t} v$ is a $1 \times 1$ matrix, though we will treat this small matrix as if it were simply the scalar quantity in its lone entry. When we apply Theorem MMIP there should not be any confusion. Notice that if we treat a column vector as a matrix with a single column, then we can also construct the adjoint of a vector, though we will not make this a common practice.

Theorem MMIP. Matrix Multiplication and Inner Products.

If we consider the vectors $u, v \in \mathbb{C}^{m}$ as $m \times 1$ matrices then

\langle u, v \rangle = \overline{u}^{t} v = u^{*} v .

Proof.

\begin{aligned}
\langle u, v \rangle & = \sum_{k = 1}^{m} \overline{{[u]}_{k}} {[v]}_{k} & & \text{Definition IP} \\
& = \sum_{k = 1}^{m} \overline{{[u]}_{k 1}} {[v]}_{k 1} & & \text{Column vectors as matrices} \\
& = \sum_{k = 1}^{m} {[\overline{u}]}_{k 1} {[v]}_{k 1} & & \text{Definition CCM} \\
& = \sum_{k = 1}^{m} {[\overline{u}^{t}]}_{1 k} {[v]}_{k 1} & & \text{Definition TM} \\
& = {[\overline{u}^{t} v]}_{11} & & \text{Theorem EMP}
\end{aligned}

To finish we just blur the distinction between a $1 \times 1$ matrix ($\overline{u}^{t} v$) and its lone entry.
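
The "blurring" of a $1 \times 1$ matrix with its lone entry is easy to see in code. The Python sketch below computes the inner product of two vectors from the sum in the first line of the proof and again as a matrix product of the conjugate transpose of a column with a column. The vectors are made up for illustration, Python's built-in complex numbers stand in for $\mathbb{C}$, and the helper names are ours.

```python
def inner_product(u, v):
    """Sum of conj(u_k) * v_k, conjugating the first vector as in the proof above."""
    return sum(uk.conjugate() * vk for uk, vk in zip(u, v))

def mat_mul(A, B):
    """Entries of the product via Theorem EMP."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def conjugate_transpose(M):
    """Conjugate every entry and swap the roles of rows and columns."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

u = [1 + 2j, 3 - 1j, 2j]
v = [2 - 1j, 1j, 4 + 0j]

# upgrade the vectors to 3 x 1 matrices
u_col = [[x] for x in u]
v_col = [[x] for x in v]

one_by_one = mat_mul(conjugate_transpose(u_col), v_col)  # a 1 x 1 matrix
print(inner_product(u, v))   # (-1-10j)
print(one_by_one)            # [[(-1-10j)]]
```

The first result is a scalar and the second is a list containing one row with one entry, which is precisely the distinction the theorem asks us to overlook.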

Theorem MMCC. Matrix Multiplication and Complex Conjugation.

Suppose

A

is an

m \times n

matrix and

B

is an

n \times p

matrix. Then

\overset{―}{A B} = \overset{―}{A} \overset{―}{B} .

🔗

Proof.

To obtain this matrix equality, we will work entry-by-entry. For

1 \leq i \leq m,

1 \leq j \leq p,

\begin{aligned} {[\overset{―}{A B}]}_{i j} & = \overset{―}{{[A B]}_{i j}} & Definition CCM \\ = \overset{―}{\sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j}} & Theorem EMP \\ = \sum_{k = 1}^{n} \overset{―}{{[A]}_{i k} {[B]}_{k j}} & Theorem CCRA \\ = \sum_{k = 1}^{n} \overset{―}{{[A]}_{i k}} \overset{―}{{[B]}_{k j}} & Theorem CCRM \\ = \sum_{k = 1}^{n} {[\overset{―}{A}]}_{i k} {[\overset{―}{B}]}_{k j} & Definition CCM \\ = {[\overset{―}{A} \overset{―}{B}]}_{i j} & Theorem EMP . \end{aligned}

So the matrices

\overset{―}{A B}

and

\overset{―}{A} \overset{―}{B}

are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.
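Theorem MMCC can be spot-checked on a single pair of matrices. The following is a minimal sketch in plain Python; the helpers `mat_mul` and `conj` and the sample matrices are assumptions made for this illustration only. One example does not make a proof, but it shows the two sides of the equality being computed independently.

```python
def mat_mul(A, B):
    """Entry-by-entry product of matrices stored as lists of rows (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def conj(A):
    """Conjugate every entry of a matrix (Definition CCM)."""
    return [[entry.conjugate() for entry in row] for row in A]

# Hypothetical sample matrices: A is 2 x 2, B is 2 x 3.
A = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 0j]]
B = [[2 - 1j, 0 + 1j, 1 + 1j],
     [4 + 0j, 1 - 3j, 0 - 2j]]

left = conj(mat_mul(A, B))           # conjugate of the product
right = mat_mul(conj(A), conj(B))    # product of the conjugates
print(left == right)
```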

Another theorem in this style, and it is a good one. If you have been practicing with the previous proofs you should be able to do this one yourself.

🔗

Theorem MMT. Matrix Multiplication and Transposes.

Suppose

A

is an

m \times n

matrix and

B

is an

n \times p

matrix. Then

(A B)^{t} = B^{t} A^{t} .

🔗

Proof.

This theorem may be surprising but if we check the sizes of the matrices involved, then maybe it will not seem so far-fetched. First,

A B

has size

m \times p,

so its transpose has size

p \times m .

The product of

B^{t}

with

A^{t}

is a

p \times n

matrix times an

n \times m

matrix, also resulting in a

p \times m

matrix. So at least our objects are compatible for equality (and would not be, in general, if we did not reverse the order of the matrix multiplication).

Here we go again, entry-by-entry. For

1 \leq i \leq m,

1 \leq j \leq p,

\begin{aligned} {[(A B)^{t}]}_{j i} & = {[A B]}_{i j} & & \text{Definition TM} \\ & = \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} & & \text{Theorem EMP} \\ & = \sum_{k = 1}^{n} {[B]}_{k j} {[A]}_{i k} & & \text{Property CMCN} \\ & = \sum_{k = 1}^{n} {[B^{t}]}_{j k} {[A^{t}]}_{k i} & & \text{Definition TM} \\ & = {[B^{t} A^{t}]}_{j i} & & \text{Theorem EMP} . \end{aligned}

So the matrices

(A B)^{t}

and

B^{t} A^{t}

are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

This theorem seems odd at first glance, since we have to switch the order of

A

and

B .

But if we simply consider the sizes of the matrices involved, we can see that the switch is necessary for this reason alone. That the individual entries of the products then come along to be equal is a bonus.
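The size argument can be made concrete. Here is a sketch in plain Python, with a hypothetical helper `mat_mul` that refuses incompatible sizes; the sample matrices are invented for illustration. With a 2 × 3 matrix A and a 3 × 4 matrix B, the reversed product of transposes is defined and agrees with the transpose of AB, while the product of transposes in the original order is not even defined.

```python
def transpose(A):
    """Swap rows and columns (Definition TM)."""
    return [list(column) for column in zip(*A)]

def mat_mul(A, B):
    """Matrix product via Theorem EMP, refusing incompatible sizes."""
    if len(A[0]) != len(B):
        raise ValueError("sizes are not compatible for multiplication")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 0],
     [-1, 3, 4]]             # 2 x 3
B = [[2, 0, 1, -1],
     [1, 5, 0, 2],
     [0, -2, 3, 1]]          # 3 x 4

lhs = transpose(mat_mul(A, B))               # 4 x 2
rhs = mat_mul(transpose(B), transpose(A))    # (4 x 3)(3 x 2) gives 4 x 2
reversed_ok = (lhs == rhs)

try:
    mat_mul(transpose(A), transpose(B))      # (3 x 2)(4 x 3): sizes clash
    unreversed_defined = True
except ValueError:
    unreversed_defined = False

print(reversed_ok, unreversed_defined)
```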

🔗

As the adjoint of a matrix is a composition of a conjugate and a transpose, its interaction with matrix multiplication is similar to that of a transpose. Here is the last of our long list of basic properties of matrix multiplication.

🔗

Theorem MMAD. Matrix Multiplication and Adjoints.

Suppose

A

is an

m \times n

matrix and

B

is an

n \times p

matrix. Then

(A B)^{*} = B^{*} A^{*} .

🔗

Proof.

\begin{aligned} (A B)^{*} & = {(\overset{―}{A B})}^{t} & & \text{Definition A} \\ & = {(\overset{―}{A} \overset{―}{B})}^{t} & & \text{Theorem MMCC} \\ & = {(\overset{―}{B})}^{t} {(\overset{―}{A})}^{t} & & \text{Theorem MMT} \\ & = B^{*} A^{*} & & \text{Definition A} \end{aligned}

Notice how none of these proofs above relied on writing out huge general matrices with lots of ellipses (“…”) and trying to formulate the equalities a whole matrix at a time. This messy business is a “proof technique” to be avoided at all costs. Notice too how the proof of Theorem MMAD does not use an entry-by-entry approach, but simply builds on previous results about matrix multiplication’s interaction with conjugation and transposes.
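To see Theorem MMAD in action on one example, consider the following sketch in plain Python. The helpers `mat_mul` and `adjoint` and the sample matrices are assumptions for this illustration; the adjoint is computed as the transpose of the conjugate, following Definition A.

```python
def mat_mul(A, B):
    """Entry-by-entry product of matrices stored as lists of rows (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def adjoint(A):
    """Transpose of the conjugate (Definition A)."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

# Hypothetical sample matrices: A is 2 x 3, B is 3 x 2.
A = [[1 + 1j, 2 + 0j, 0 - 3j],
     [4 - 2j, 0 + 1j, 1 + 0j]]
B = [[2 + 0j, 1 - 1j],
     [0 + 2j, 3 + 0j],
     [1 + 1j, 0 - 1j]]

left = adjoint(mat_mul(A, B))             # (AB)*
right = mat_mul(adjoint(B), adjoint(A))   # B* A*
print(left == right)
```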

🔗

These theorems, along with Theorem VSPM and the other results in Section MO, give you the “rules” for how matrices interact with the various operations we have defined on matrices (addition, scalar multiplication, matrix multiplication, conjugation, transposes and adjoints). Use them and use them often. But do not try to do anything with a matrix that you do not have a rule for. Together, we would informally call all these operations, and the attendant theorems, “the algebra of matrices.” Notice, too, that every column vector is just an

n \times 1

matrix, so these theorems apply to column vectors also. Finally, these results, taken as a whole, may make us feel that the definition of matrix multiplication is not so unnatural.

🔗

Sage PMM. Properties of Matrix Multiplication.

As before, we can use Sage to demonstrate theorems. With randomly-generated matrices, these verifications might be even more believable. Some of the above results should feel fairly routine, but some are perhaps contrary to intuition. For example, Theorem MMT might at first glance seem surprising due to the requirement that the order of the product is reversed. Here is how we would investigate this theorem in Sage. The following compute cell should always return True. Repeated experimental evidence does not make a proof, but certainly gives us confidence.

A = random_matrix(QQ, 3, 7)
B = random_matrix(QQ, 7, 5)
(A*B).transpose() == B.transpose()*A.transpose()

By now, you can probably guess the matrix method for checking if a matrix is Hermitian.

A = matrix(QQbar, [[     45, -5-12*I, -1-15*I, -56-8*I],
                   [-5+12*I,      42,    32*I, -14-8*I],
                   [-1+15*I,   -32*I,      57,    12+I],
                   [-56+8*I, -14+8*I,    12-I,      93]])
A.is_hermitian()

We can illustrate the most fundamental property of a Hermitian matrix. The vectors x and y below are random, but according to Theorem HMIP the final command should produce True for any possible values of these two vectors. (You would be right to think that using random vectors over QQbar would be a better idea, but at this writing, these vectors are not as “random” as one would like, and are insufficient to perform an accurate test here.)

x = random_vector(QQ, 4) + QQbar(I)*random_vector(QQ, 4)
y = random_vector(QQ, 4) + QQbar(I)*random_vector(QQ, 4)
(A*x).hermitian_inner_product(y) == x.hermitian_inner_product(A*y)

🔗

Subsection HM Hermitian Matrices

The adjoint of a matrix has a basic property when employed in a matrix-vector product as part of an inner product. At this point, you could even use the following result as a motivation for the definition of an adjoint.

🔗

Theorem AIP. Adjoint and Inner Product.

Suppose that

A

is an

m \times n

matrix and

x \in C^{n},

y \in C^{m} .

Then

⟨ A x, y ⟩ = ⟨ x, A^{*} y ⟩ .

🔗

Proof.

\begin{aligned} ⟨ A x, y ⟩ & = {(\overset{―}{A x})}^{t} y & & \text{Theorem MMIP} \\ & = {(\overset{―}{A} \overset{―}{x})}^{t} y & & \text{Theorem MMCC} \\ & = {\overset{―}{x}}^{t} {\overset{―}{A}}^{t} y & & \text{Theorem MMT} \\ & = {\overset{―}{x}}^{t} (A^{*} y) & & \text{Definition A} \\ & = ⟨ x, A^{*} y ⟩ & & \text{Theorem MMIP} \end{aligned}
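Theorem AIP applies to rectangular matrices, so the two inner products live in different spaces. The sketch below, in plain Python with hypothetical helpers `inner_product`, `mat_vec` and `adjoint` and invented sample data, uses a 2 × 3 matrix: the left side is an inner product in C^2 and the right side is an inner product in C^3.

```python
def inner_product(u, v):
    """Definition IP as used in this text: conjugate the entries of the first vector."""
    return sum(uk.conjugate() * vk for uk, vk in zip(u, v))

def mat_vec(A, x):
    """Matrix-vector product, computed entry by entry."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def adjoint(A):
    """Transpose of the conjugate (Definition A)."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

# Hypothetical sample data: A is 2 x 3, x is in C^3, y is in C^2.
A = [[1 + 1j, 2 + 0j, 0 - 3j],
     [4 - 2j, 0 + 1j, 1 + 0j]]
x = [1 - 1j, 2 + 2j, 0 + 1j]
y = [3 + 0j, 1 - 4j]

left = inner_product(mat_vec(A, x), y)             # computed in C^2
right = inner_product(x, mat_vec(adjoint(A), y))   # computed in C^3
print(left == right)
```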

Sometimes a matrix is equal to its adjoint (Definition A), and these matrices have interesting properties. One of the most common situations where this occurs is when a matrix has only real number entries. Then we are simply talking about symmetric matrices (Definition SYM), so you can view this as a generalization of a symmetric matrix.

🔗

Definition HM. Hermitian Matrix.

The square matrix

A

is Hermitian (or self-adjoint) if

A = A^{*} .

🔗

Again, the set of real matrices that are Hermitian is exactly the set of symmetric matrices. In Section PEE we will uncover some amazing properties of Hermitian matrices, so when you get there, run back here to remind yourself of this definition. Further properties will also appear in Section OD. Right now we prove a fundamental result about Hermitian matrices, matrix-vector products and inner products. As a characterization, this could be employed as a definition of a Hermitian matrix and some authors take this approach.

🔗

Theorem HMIP. Hermitian Matrices and Inner Products.

Suppose that

A

is a square matrix of size

n .

Then

A

is Hermitian if and only if

⟨ A x, y ⟩ = ⟨ x, A y ⟩

for all

x, y \in C^{n} .

🔗

Proof.

(⇒)

This is the “easy half” of the proof, and makes the rationale for a definition of Hermitian matrices most obvious. Assume

A

is Hermitian. Then

\begin{aligned} ⟨ A x, y ⟩ & = ⟨ x, A^{*} y ⟩ & & \text{Theorem AIP} \\ & = ⟨ x, A y ⟩ & & \text{Definition HM} . \end{aligned}

(⇐)

This “half” will take a bit more work. Assume that

⟨ A x, y ⟩ = ⟨ x, A y ⟩

for all

x, y \in C^{n} .

We show that

A = A^{*}

by establishing that

A x = A^{*} x

for all

x,

so we can then apply Theorem EMMVP. With only this much motivation, consider the inner product for any

x \in C^{n} .

\begin{aligned} ⟨ A x - A^{*} x, A x - A^{*} x ⟩ & = ⟨ A x - A^{*} x, A x ⟩ - ⟨ A x - A^{*} x, A^{*} x ⟩ & & \text{Theorem IPVA} \\ & = ⟨ A x - A^{*} x, A x ⟩ - ⟨ A (A x - A^{*} x), x ⟩ & & \text{Theorem AIP} \\ & = ⟨ A x - A^{*} x, A x ⟩ - ⟨ A x - A^{*} x, A x ⟩ & & \text{Hypothesis} \\ & = 0 & & \text{Property AICN} \end{aligned}

Because this first inner product equals zero, and has the same vector in each argument (

A x - A^{*} x

), Theorem PIP gives the conclusion that

A x - A^{*} x = 0 .

With

A x = A^{*} x

for all

x \in C^{n},

Theorem EMMVP says

A = A^{*},

which is the defining property of a Hermitian matrix (Definition HM).

So, informally, Hermitian matrices are those that can be tossed around from one side of an inner product to the other with reckless abandon. We will see later what this buys us.
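Both directions of Theorem HMIP can be illustrated on small examples. The following plain-Python sketch uses hypothetical helpers and invented matrices: a 2 × 2 Hermitian matrix `H` that moves freely across the inner product, and a non-Hermitian matrix `N` for which a single pair of standard unit vectors already breaks the equality.

```python
def inner_product(u, v):
    """Definition IP as used in this text: conjugate the entries of the first vector."""
    return sum(uk.conjugate() * vk for uk, vk in zip(u, v))

def mat_vec(A, x):
    """Matrix-vector product, computed entry by entry."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def is_hermitian(A):
    """Definition HM: a square matrix equal to its adjoint."""
    n = len(A)
    return all(A[i][j] == A[j][i].conjugate() for i in range(n) for j in range(n))

H = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]          # Hermitian
N = [[0j, 1 + 0j],
     [0j, 0j]]                  # not Hermitian

x = [1 + 2j, 0 - 1j]
y = [3 - 1j, 2 + 1j]
hermitian_case = inner_product(mat_vec(H, x), y) == inner_product(x, mat_vec(H, y))

e1 = [1 + 0j, 0j]
e2 = [0j, 1 + 0j]
non_hermitian_case = inner_product(mat_vec(N, e2), e1) == inner_product(e2, mat_vec(N, e1))

print(is_hermitian(H), hermitian_case)
print(is_hermitian(N), non_hermitian_case)
```

For `N`, the left inner product is 1 while the right one is 0, so the equality of the theorem fails for that choice of vectors.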

🔗

Reading Questions MM Reading Questions

1. Matrix-vector product.

Form the matrix-vector product of

🔗

\begin{aligned} [\begin{array}{c} 2 & 3 & - 1 & 0 \\ 1 & - 2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{array}] & with & [\begin{array}{c} 2 \\ - 3 \\ 0 \\ 5 \end{array}] . \end{aligned}

🔗

2. Calculate product of two matrices.

Multiply together the two matrices below (in the order given).

🔗

\begin{aligned} [\begin{array}{c} 2 & 3 & - 1 & 0 \\ 1 & - 2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{array}] & [\begin{array}{c} 2 & 6 \\ - 3 & - 4 \\ 0 & 2 \\ 3 & - 1 \end{array}] \end{aligned}

🔗

3. Matrix-vector product form of a linear system.

Rewrite the system of linear equations below as a vector equality and using a matrix-vector product. (This question does not ask for a solution to the system. But it does ask you to express the system of equations in a new form using tools from this section.)

🔗

\begin{aligned} 2 x_{1} + 3 x_{2} - x_{3} & = 0 \\ x_{1} + 2 x_{2} + x_{3} & = 3 \\ x_{1} + 3 x_{2} + 3 x_{3} & = 7 \end{aligned}

🔗

Exercises MM Exercises

C20.

Compute the product of the two matrices below,

A B .

Do this using the definitions of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

🔗

\begin{aligned} A = [\begin{array}{c} 2 & 5 \\ - 1 & 3 \\ 2 & - 2 \end{array}] & B = [\begin{array}{c} 1 & 5 & - 3 & 4 \\ 2 & 0 & 2 & - 3 \end{array}] \end{aligned}

🔗

Solution.

By Definition MM

\begin{aligned} A B & = [[\begin{array}{c} 2 & 5 \\ - 1 & 3 \\ 2 & - 2 \end{array}] [\begin{array}{c} 1 \\ 2 \end{array}] | [\begin{array}{c} 2 & 5 \\ - 1 & 3 \\ 2 & - 2 \end{array}] [\begin{array}{c} 5 \\ 0 \end{array}] | [\begin{array}{c} 2 & 5 \\ - 1 & 3 \\ 2 & - 2 \end{array}] [\begin{array}{c} - 3 \\ 2 \end{array}] | [\begin{array}{c} 2 & 5 \\ - 1 & 3 \\ 2 & - 2 \end{array}] [\begin{array}{c} 4 \\ - 3 \end{array}]] . \end{aligned}

Repeated applications of Definition MVP give

\begin{aligned} = [1 [\begin{array}{c} 2 \\ - 1 \\ 2 \end{array}] + 2 [\begin{array}{c} 5 \\ 3 \\ - 2 \end{array}] | 5 [\begin{array}{c} 2 \\ - 1 \\ 2 \end{array}] + 0 [\begin{array}{c} 5 \\ 3 \\ - 2 \end{array}] | - 3 [\begin{array}{c} 2 \\ - 1 \\ 2 \end{array}] + 2 [\begin{array}{c} 5 \\ 3 \\ - 2 \end{array}] | 4 [\begin{array}{c} 2 \\ - 1 \\ 2 \end{array}] + (- 3) [\begin{array}{c} 5 \\ 3 \\ - 2 \end{array}]] \\ = [\begin{array}{c} 12 & 10 & 4 & - 7 \\ 5 & - 5 & 9 & - 13 \\ - 2 & 10 & - 10 & 14 \end{array}] . \end{aligned}
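The two definitions used in this solution translate directly into code. Here is a sketch in plain Python, where the function names `matrix_vector_product` and `matrix_product` are our own labels: the first forms a linear combination of the columns of A (Definition MVP), and the second builds AB one column at a time (Definition MM).

```python
def matrix_vector_product(A, u):
    """Definition MVP: the linear combination of the columns of A with scalars from u."""
    m = len(A)
    result = [0] * m
    for j, scalar in enumerate(u):
        column = [A[i][j] for i in range(m)]
        result = [r + scalar * c for r, c in zip(result, column)]
    return result

def matrix_product(A, B):
    """Definition MM: column j of AB is A times column j of B."""
    p = len(B[0])
    columns = [matrix_vector_product(A, [row[j] for row in B]) for j in range(p)]
    # Convert the list of columns back into a list of rows.
    return [list(row) for row in zip(*columns)]

A = [[2, 5],
     [-1, 3],
     [2, -2]]
B = [[1, 5, -3, 4],
     [2, 0, 2, -3]]

AB = matrix_product(A, B)
for row in AB:
    print(row)
```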

🔗

C21.

Compute the product

A B

of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

🔗

\begin{aligned} A & = [\begin{array}{c} 1 & 3 & 2 \\ - 1 & 2 & 1 \\ 0 & 1 & 0 \end{array}] & B & = [\begin{array}{c} 4 & 1 & 2 \\ 1 & 0 & 1 \\ 3 & 1 & 5 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} 13 & 3 & 15 \\ 1 & 0 & 5 \\ 1 & 0 & 1 \end{matrix}] .

🔗

C22.

Compute the product

A B

of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

🔗

\begin{aligned} A & = [\begin{array}{c} 1 & 0 \\ - 2 & 1 \end{array}] & B & = [\begin{array}{c} 2 & 3 \\ 4 & 6 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} 2 & 3 \\ 0 & 0 \end{matrix}] .

🔗

C23.

Compute the product

A B

of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

🔗

\begin{aligned} A & = [\begin{array}{c} 3 & 1 \\ 2 & 4 \\ 6 & 5 \\ 1 & 2 \end{array}] & B & = [\begin{array}{c} - 3 & 1 \\ 4 & 2 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} - 5 & 5 \\ 10 & 10 \\ 2 & 16 \\ 5 & 5 \end{matrix}] .

🔗

C24.

Compute the product

A B

of the two matrices below.

🔗

\begin{aligned} A & = [\begin{array}{c} 1 & 2 & 3 & - 2 \\ 0 & 1 & - 2 & - 1 \\ 1 & 1 & 3 & 1 \end{array}] & B & = [\begin{array}{c} 3 \\ 4 \\ 0 \\ 2 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} 7 \\ 2 \\ 9 \end{matrix}] .

🔗

C25.

Compute the product

A B

of the two matrices below.

🔗

\begin{aligned} A & = [\begin{array}{c} 1 & 2 & 3 & - 2 \\ 0 & 1 & - 2 & - 1 \\ 1 & 1 & 3 & 1 \end{array}] & B & = [\begin{array}{c} - 7 \\ 3 \\ 1 \\ 1 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} 0 \\ 0 \\ 0 \end{matrix}] .

🔗

C26.

Compute the product

A B

of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

🔗

\begin{aligned} A & = [\begin{array}{c} 1 & 3 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{array}] & B & = [\begin{array}{c} 2 & - 5 & - 1 \\ 0 & 1 & 0 \\ - 1 & 2 & 1 \end{array}] \end{aligned}

🔗

Solution.

A B = [\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix}] .

🔗

C30.

For the matrix

A,

find

A^{2},

A^{3},

A^{4} .

Find a general formula for

A^{n}

for any positive integer

n .

🔗

A = [\begin{matrix} 1 & 2 \\ 0 & 1 \end{matrix}]

🔗

Solution.

\begin{aligned} A^{2} & = [\begin{array}{c} 1 & 4 \\ 0 & 1 \end{array}] & A^{3} & = [\begin{array}{c} 1 & 6 \\ 0 & 1 \end{array}] & A^{4} & = [\begin{array}{c} 1 & 8 \\ 0 & 1 \end{array}] \end{aligned}

From this pattern, we see that

A^{n} = [\begin{matrix} 1 & 2 n \\ 0 & 1 \end{matrix}] .
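A pattern spotted from three powers is a conjecture rather than a proof (an induction on n would settle it), but it is easy to test the formula further. This plain-Python sketch, with a hypothetical helper `mat_mul` and an arbitrary cutoff of ten powers, compares each power of A with the proposed formula.

```python
def mat_mul(A, B):
    """Entry-by-entry product of matrices stored as lists of rows (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2],
     [0, 1]]

power = A                  # holds A^n at the top of each pass through the loop
pattern_holds = True
for n in range(1, 11):
    if power != [[1, 2 * n], [0, 1]]:
        pattern_holds = False
    power = mat_mul(power, A)

print(pattern_holds)
```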

🔗

C31.

For the matrix

A,

find

A^{2},

A^{3},

A^{4} .

Find a general formula for

A^{n}

for any positive integer

n .

🔗

A = [\begin{matrix} 1 & - 1 \\ 0 & 1 \end{matrix}]

🔗

Solution.

\begin{aligned} A^{2} & = [\begin{array}{c} 1 & - 2 \\ 0 & 1 \end{array}] & A^{3} & = [\begin{array}{c} 1 & - 3 \\ 0 & 1 \end{array}] & A^{4} & = [\begin{array}{c} 1 & - 4 \\ 0 & 1 \end{array}] \end{aligned}

From this pattern, we see that

A^{n} = [\begin{matrix} 1 & - n \\ 0 & 1 \end{matrix}] .

🔗

C32.

For the matrix

A,

find

A^{2},

A^{3},

A^{4} .

Find a general formula for

A^{n}

for any positive integer

n .

🔗

A = [\begin{matrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{matrix}]

🔗

Solution.

\begin{aligned} A^{2} & = [\begin{array}{c} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 9 \end{array}] & A^{3} & = [\begin{array}{c} 1 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 27 \end{array}] & A^{4} & = [\begin{array}{c} 1 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 81 \end{array}] \end{aligned}

The pattern emerges, and we see that

A^{n} = [\begin{matrix} 1 & 0 & 0 \\ 0 & 2^{n} & 0 \\ 0 & 0 & 3^{n} \end{matrix}] .

🔗

C33.

For the matrix

A,

find

A^{2},

A^{3},

A^{4} .

Find a general formula for

A^{n}

for any positive integer

n .

🔗

A = [\begin{matrix} 0 & 1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{matrix}]

🔗

Solution.

We quickly compute

\begin{aligned} A^{2} & = [\begin{array}{c} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}] & A^{3} & = [\begin{array}{c} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}] \end{aligned}

and we then see that all subsequent powers of

A

are the

3 \times 3

zero matrix. That is,

A^{n} = O

for

n \geq 3 .
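The claim about subsequent powers follows because any further multiplication by A starts from the zero matrix. A short plain-Python sketch (the helper `mat_mul` and the choice to stop at the sixth power are assumptions for illustration) confirms the first few cases.

```python
def mat_mul(A, B):
    """Entry-by-entry product of matrices stored as lists of rows (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, 1, 2],
     [0, 0, 1],
     [0, 0, 0]]
zero = [[0, 0, 0] for _ in range(3)]

# powers[n] holds A^n for n = 1, ..., 6.
powers = {1: A}
for n in range(2, 7):
    powers[n] = mat_mul(powers[n - 1], A)

print(powers[2])                                     # the last nonzero power
print(all(powers[n] == zero for n in range(3, 7)))
```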

🔗

T10.

Suppose that

A

is a square matrix and there is a vector,

b,

such that

LS (A, b)

has a unique solution. Prove that

A

is nonsingular. Give a direct proof (perhaps appealing to Theorem PSPHS) rather than just negating a sentence from the text discussing a similar situation.

🔗

Solution.

Since

LS (A, b)

has at least one solution, we can apply Theorem PSPHS. Because the solution is assumed to be unique, the null space of

A

must be trivial. Then Theorem NMTNS implies that

A

is nonsingular.

The converse of this statement is a trivial application of Theorem NMUS. That said, we could extend our NSMxx series of theorems with an added equivalence for nonsingularity, “Given a single vector of constants,

b,

the system

LS (A, b)

has a unique solution.”

🔗

T12.

The conclusion of Theorem AIP is

⟨ A x, y ⟩ = ⟨ x, A^{*} y ⟩ .

Use the same hypotheses, and prove the similar conclusion:

⟨ A^{*} x, y ⟩ = ⟨ x, A y ⟩ .

Two different approaches can both be based on an application of Theorem AIP. The first uses Theorem AA, while a second approach uses Theorem IPAC. Can you provide two proofs?

🔗

T20.

Prove the second part of Theorem MMZM.

🔗

T21.

Prove the second part of Theorem MMIM.

🔗

T22.

Prove the second part of Theorem MMDAA.

🔗

T23.

Prove the second part of Theorem MMSMM.

🔗

Solution.

We will run the proof entry-by-entry.

\begin{aligned} {[α (A B)]}_{i j} & = α {[A B]}_{i j} & & \text{Definition MSM} \\ & = α \sum_{k = 1}^{n} {[A]}_{i k} {[B]}_{k j} & & \text{Theorem EMP} \\ & = \sum_{k = 1}^{n} α {[A]}_{i k} {[B]}_{k j} & & \text{Property DCN} \\ & = \sum_{k = 1}^{n} {[A]}_{i k} α {[B]}_{k j} & & \text{Property CMCN} \\ & = \sum_{k = 1}^{n} {[A]}_{i k} {[α B]}_{k j} & & \text{Definition MSM} \\ & = {[A (α B)]}_{i j} & & \text{Theorem EMP} \end{aligned}

So the matrices

α (A B)

and

A (α B)

are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.

🔗

T28.

Suppose that

A

and

B

are both upper triangular matrices (Definition UTM) of size

n .

Use Theorem EMP to prove that each diagonal entry of the matrix product

A B

is the scalar product of the individual entries on the diagonal of each matrix. More precisely, show that

🔗

{[A B]}_{i i} = {[A]}_{i i} {[B]}_{i i}, 1 \leq i \leq n .

Note: the matrix product is only this simple for the diagonal entries, and only for the special case of triangular matrices!

🔗

T31.

Suppose that

A

is an

m \times n

matrix and

x, y \in N (A) .

Prove that

x + y \in N (A) .

🔗

T32.

Suppose that

A

is an

m \times n

matrix,

α \in C,

and

x \in N (A) .

Prove that

α x \in N (A) .

🔗

T35.

Suppose that

A

is an

n \times n

matrix. Prove that

A^{*} A

and

A A^{*}

are Hermitian matrices.

🔗

T38.

Theorem NMUS says that a linear system with a nonsingular coefficient matrix always has a unique solution. We will soon see exactly what that solution is in Theorem SNCM, but you do not need that result for this exercise. Use results from this section to give another proof that a solution to a linear system with a nonsingular coefficient matrix is unique.

🔗

Solution.

The statement lets us assume we have a solution to

LS (A, b)

and we are to show this solution is unique. Consult Proof Technique U.

Assume that we have two solutions, say

x_{1}

and

x_{2} .

Then, Theorem MMDAA and Theorem SLEMM allow us to write

A (x_{1} - x_{2}) = A x_{1} - A x_{2} = b - b = 0 .

Since

A

is nonsingular the only solution to

LS (A, 0)

is the zero vector and thus

x_{1} - x_{2} = 0

and so

x_{1} = x_{2} .

The two solutions are really the same.

🔗

T40.

Suppose that

A

is an

m \times n

matrix and

B

is an

n \times p

matrix. Prove that the null space of

B

is a subset of the null space of

A B,

that is

N (B) \subseteq N (A B) .

Provide an example where the opposite inclusion fails; in other words, give an example where

N (A B) ⊈ N (B) .

🔗

Solution.

To prove that one set is a subset of another, we start with an element of the smaller set and see if we can determine that it is a member of the larger set (Definition SSET). Suppose

x \in N (B) .

Then we know that

B x = 0

by Definition NSM. Consider

\begin{aligned} (A B) x & = A (B x) & & \text{Theorem MMA} \\ & = A 0 & & \text{Hypothesis} \\ & = 0 & & \text{Theorem MMZM} . \end{aligned}

This establishes that

x \in N (A B),

so

N (B) \subseteq N (A B) .

To show that the inclusion does not hold in the opposite direction, choose

B

to be any nonsingular matrix of size

n .

Then

N (B) = \{0\}

by Theorem NMTNS. Let

A

be the square zero matrix,

O,

of the same size. Then

A B = O B = O

by Theorem MMZM and therefore

N (A B) = C^{n},

and is not a subset of

N (B) = \{0\} .
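A concrete instance of this counterexample can be checked by hand or by machine. In the plain-Python sketch below, the size n = 2, the particular nonsingular matrix B, the vector x, and the helper names are all our own choices for illustration. The vector x lies in the null space of AB but not in the null space of B.

```python
def mat_mul(A, B):
    """Entry-by-entry product of matrices stored as lists of rows (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(A, x):
    """Matrix-vector product, computed entry by entry."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

A = [[0, 0],
     [0, 0]]        # the square zero matrix
B = [[1, 2],
     [0, 1]]        # a nonsingular matrix, so its null space is trivial
x = [1, 0]          # a nonzero vector

in_null_space_of_AB = all(entry == 0 for entry in mat_vec(mat_mul(A, B), x))
in_null_space_of_B = all(entry == 0 for entry in mat_vec(B, x))

print(in_null_space_of_AB, in_null_space_of_B)
```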

🔗

T41.

Suppose that

A

is an

n \times n

nonsingular matrix and

B

is an

n \times p

matrix. Prove that the null space of

B

is equal to the null space of

A B,

that is

N (B) = N (A B) .

(Compare with Exercise MM.T40.)

🔗

Solution.

From the solution to Exercise MM.T40 we know that

N (B) \subseteq N (A B) .

So to establish the set equality (Definition SE) we need to show that

N (A B) \subseteq N (B) .

Suppose

x \in N (A B) .

Then we know that

A B x = 0

by Definition NSM. Consider

\begin{aligned} 0 & = (A B) x & & \text{Definition NSM} \\ & = A (B x) & & \text{Theorem MMA} . \end{aligned}

So,

B x \in N (A) .

Because

A

is nonsingular, it has a trivial null space (Theorem NMTNS) and we conclude that

B x = 0 .

This establishes that

x \in N (B),

so

N (A B) \subseteq N (B)

and combined with the solution to Exercise MM.T40 we have

N (B) = N (A B)

when

A

is nonsingular.

🔗

T50.

Suppose

u

and

v

are any two solutions of the linear system

LS (A, b) .

Prove that

u - v

is an element of the null space of

A,

that is,

u - v \in N (A) .

🔗

T51.

Give a new proof of Theorem PSPHS replacing applications of Theorem SLSLC with matrix-vector products (Theorem SLEMM).

🔗

Solution.

We will work with the vector equality representations of the relevant systems of equations, as described by Theorem SLEMM.

Proof.

(⇐)

Suppose

y = w + z

and

z \in N (A) .

Then

\begin{aligned} A y & = A (w + z) & & \text{Substitution} \\ & = A w + A z & & \text{Theorem MMDAA} \\ & = b + 0 & & z \in N (A) \\ & = b & & \text{Property ZC} \end{aligned}

demonstrating that

y

is a solution.

(⇒)

Suppose

y

is a solution to

LS (A, b) .

Then

\begin{aligned} A (y - w) & = A y - A w & & \text{Theorem MMDAA} \\ & = b - b & & y, w \text{ solutions to } A x = b \\ & = 0 & & \text{Property AIC} \end{aligned}

which says that

y - w \in N (A) .

In other words,

y - w = z

for some vector

z \in N (A) .

Rewritten, this is

y = w + z,

as desired.

🔗

T52.

Suppose that

x, y \in C^{n},

b \in C^{m}

and

A

is an

m \times n

matrix. If

x,

y

and

x + y

are each a solution to the linear system

LS (A, b),

what can you say that is interesting about

b ?

Form an implication with the existence of the three solutions as the hypothesis and an interesting statement about

LS (A, b)

as the conclusion, and then give a proof.

🔗

Solution.

LS (A, b)

must be homogeneous. To see this, consider that

\begin{aligned} b & = A x & & \text{Theorem SLEMM} \\ & = A x + 0 & & \text{Property ZC} \\ & = A x + A y - A y & & \text{Property AIC} \\ & = A (x + y) - A y & & \text{Theorem MMDAA} \\ & = b - b & & \text{Theorem SLEMM} \\ & = 0 & & \text{Property AIC} . \end{aligned}

By Definition HS we see that

LS (A, b)

is homogeneous.

🔗

T80.

Suppose that

A

is a nonsingular matrix of size

n .

Let

T = \{x_{1}, x_{2}, x_{3}, \dots, x_{m}\} \subseteq C^{n}

be a linearly independent set of vectors. Prove that

🔗

R = \{A x_{1}, A x_{2}, A x_{3}, \dots, A x_{m}\} \subseteq C^{n}

is a linearly independent set.

🔗

Solution.

Begin with a relation of linear dependence on the set

R .

\begin{aligned} 0 & = a_{1} A x_{1} + a_{2} A x_{2} + a_{3} A x_{3} + \dots + a_{m} A x_{m} & & \text{Definition RLDCV} \\ & = A a_{1} x_{1} + A a_{2} x_{2} + A a_{3} x_{3} + \dots + A a_{m} x_{m} & & \text{Theorem MMSMM} \\ & = A (a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{m} x_{m}) & & \text{Theorem MMDAA} \end{aligned}

In light of Theorem SLEMM, we can think of this equation as a homogeneous system of equations with

A

as the coefficient matrix. Then because

A

is nonsingular, Definition NM implies

a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} + \dots + a_{m} x_{m} = 0 .

This is a relation of linear dependence on

T,

a linearly independent set, so by Definition LICV,

a_{1} = a_{2} = a_{3} = \dots = a_{m} = 0 .

This means there is only a trivial relation of linear dependence on

R,

so again by Definition LICV we can conclude that

R

is linearly independent.

🔗
