
Section 2.2 Kernel and Image

Given any linear transformation $T:V\to W$, we can associate two important sets: the kernel of $T$ (also known as the nullspace), and the image of $T$ (also known as the range).

Definition 2.2.1.

Let $T:V\to W$ be a linear transformation. The kernel of $T$, denoted $\ker T$, is defined by
$$\ker T = \{\, v\in V \mid T(v) = 0 \,\}.$$
The image of $T$, denoted $\operatorname{im} T$, is defined by
$$\operatorname{im} T = \{\, T(v) \mid v\in V \,\}.$$

Remark 2.2.2.

Caution: the kernel is the set of vectors that $T$ sends to zero. Saying that $v\in\ker T$ does not mean that $v=0$; it means that $T(v)=0$. Although it's true that $T(0)=0$ for any linear transformation, the kernel can contain vectors other than the zero vector.
In fact, as we’ll see in Theorem 2.2.11 below, the case where the kernel contains only the zero vector is an important special case.

Remark 2.2.3. How to use these definitions.

As you read through the theorems and examples in this section, think carefully about how the definitions of kernel and image are used.
For a linear transformation $T:V\to W$:
  • If you assume $v\in\ker T$: you are asserting that $T(v)=0$. Similarly, to prove $v\in\ker T$, you must show that $T(v)=0$.
  • If your hypothesis is that $U=\ker T$ for some subspace $U\subseteq V$, you can assume that $T(u)=0$ for any $u\in U$.
  • If you need to prove that $U=\ker T$ for some subspace $U$, then you need to prove that if $u\in U$, then $T(u)=0$, and if $T(u)=0$, then $u\in U$.
  • If you assume $w\in\operatorname{im} T$: you are asserting that there exists some $v\in V$ such that $T(v)=w$, and to prove that $w\in\operatorname{im} T$, you must find some $v\in V$ such that $T(v)=w$.
  • If your hypothesis is that $U=\operatorname{im} T$ for some subspace $U\subseteq W$, then you are assuming that for every $u\in U$, there is some $v\in V$ such that $T(v)=u$.
  • If you need to prove that $\operatorname{im} T=U$ for some subspace $U$, then you need to show that for every $u\in U$, there is some $v\in V$ with $T(v)=u$, and that $T(v)\in U$ for every $v\in V$.

Theorem 2.2.4.

Let $T:V\to W$ be a linear transformation. Then $\ker T$ is a subspace of $V$, and $\operatorname{im} T$ is a subspace of $W$.

Strategy.

Both parts of the proof rely on the Subspace Test. So for each set, we first need to explain why it contains the zero vector. Next comes closure under addition: assume you have two elements of the set, then use the definition to explain what that means.
Now you have to show that the sum of those elements belongs to the set as well. It’s fairly safe to assume that this is going to involve the addition property of a linear transformation!
Scalar multiplication is handled similarly, but using the scalar multiplication property of T.

Proof.

  1. To show that $\ker T$ is a subspace, first note that $0\in\ker T$, since $T(0)=0$ for any linear transformation $T$. Now, suppose that $v,w\in\ker T$; this means that $T(v)=0$ and $T(w)=0$, and therefore
$$T(v+w) = T(v)+T(w) = 0+0 = 0.$$
Similarly, for any scalar $c$, if $v\in\ker T$ then $T(v)=0$, so
$$T(cv) = cT(v) = c\cdot 0 = 0.$$
By the Subspace Test, $\ker T$ is a subspace.
  2. Again, since $T(0)=0$, we see that $0\in\operatorname{im} T$. (That is, $T(0_V)=0_W$, so $0_W\in\operatorname{im} T$.) Now suppose that $w_1,w_2\in\operatorname{im} T$. This means that there exist $v_1,v_2\in V$ such that $T(v_1)=w_1$ and $T(v_2)=w_2$. It follows that
$$w_1+w_2 = T(v_1)+T(v_2) = T(v_1+v_2),$$
so $w_1+w_2\in\operatorname{im} T$, since it's the image of $v_1+v_2$. Similarly, if $c$ is any scalar and $w=T(v)\in\operatorname{im} T$, then
$$cw = cT(v) = T(cv),$$
so $cw\in\operatorname{im} T$.

Example 2.2.5. Null space and column space.

A familiar setting that you may already have encountered in a previous linear algebra course (or Example 2.1.4) is that of a matrix transformation. Let $A$ be an $m\times n$ matrix. Then we can define $T:\mathbb{R}^n\to\mathbb{R}^m$ by $T(x)=Ax$, where elements of $\mathbb{R}^n$ and $\mathbb{R}^m$ are considered as column vectors. We then have
$$\ker T = \operatorname{null}(A) = \{\, x\in\mathbb{R}^n \mid Ax=0 \,\}$$
and
$$\operatorname{im} T = \operatorname{col}(A) = \{\, Ax \mid x\in\mathbb{R}^n \,\},$$
where $\operatorname{col}(A)$ denotes the column space of $A$. Recall further that if we write $A$ in terms of its columns as
$$A = \begin{bmatrix} C_1 & C_2 & \cdots & C_n \end{bmatrix}$$
and a vector $x\in\mathbb{R}^n$ as $x = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$, then
$$Ax = x_1C_1 + x_2C_2 + \cdots + x_nC_n.$$
Thus, any element of $\operatorname{col}(A)$ is a linear combination of the columns of $A$, explaining the name column space.
Determining $\operatorname{null}(A)$ and $\operatorname{col}(A)$ for a given matrix $A$ is, unsurprisingly, a matter of reducing $A$ to row-echelon form. Finding $\operatorname{null}(A)$ comes down to describing the set of all solutions to the homogeneous system $Ax=0$. Finding $\operatorname{col}(A)$ relies on the following theorem.

Theorem 2.2.6.

Let $A$ be an $m\times n$ matrix, and let $R$ be the reduced row-echelon form of $A$. Then the columns of $A$ corresponding to the columns of $R$ that contain a leading one (the pivot columns of $A$) form a basis for $\operatorname{col}(A)$.
For a proof of this result, see Section 5.4 of Linear Algebra with Applications, by Keith Nicholson. The proof is fairly long and technical, so we omit it here.

Example 2.2.7.

Consider the linear transformation $T:\mathbb{R}^4\to\mathbb{R}^3$ defined by the matrix
$$A = \begin{bmatrix} 1 & 3 & 0 & -2\\ 2 & 1 & -2 & 0\\ 1 & 8 & 2 & -6 \end{bmatrix}.$$
Let's determine the RREF of $A$:
$$\operatorname{RREF}(A) = \begin{bmatrix} 1 & 0 & -6/5 & 2/5\\ 0 & 1 & 2/5 & -4/5\\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
We see that there are leading ones in the first and second columns. Therefore,
$$\operatorname{col}(A) = \operatorname{im}(T) = \operatorname{span}\left\{ \begin{bmatrix} 1\\2\\1 \end{bmatrix}, \begin{bmatrix} 3\\1\\8 \end{bmatrix} \right\}.$$
Indeed, note that
$$\begin{bmatrix} 0\\-2\\2 \end{bmatrix} = -\frac{6}{5}\begin{bmatrix} 1\\2\\1 \end{bmatrix} + \frac{2}{5}\begin{bmatrix} 3\\1\\8 \end{bmatrix}$$
and
$$\begin{bmatrix} -2\\0\\-6 \end{bmatrix} = \frac{2}{5}\begin{bmatrix} 1\\2\\1 \end{bmatrix} - \frac{4}{5}\begin{bmatrix} 3\\1\\8 \end{bmatrix},$$
so that indeed, the third and fourth columns are in the span of the first and second.
Furthermore, we can determine the nullspace: if $Ax=0$ where $x = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^T$, then we must have
$$x_1 = \frac{6}{5}x_3 - \frac{2}{5}x_4, \qquad x_2 = -\frac{2}{5}x_3 + \frac{4}{5}x_4,$$
so
$$x = \begin{bmatrix} \tfrac{6}{5}x_3 - \tfrac{2}{5}x_4\\ -\tfrac{2}{5}x_3 + \tfrac{4}{5}x_4\\ x_3\\ x_4 \end{bmatrix} = \frac{x_3}{5}\begin{bmatrix} 6\\-2\\5\\0 \end{bmatrix} + \frac{x_4}{5}\begin{bmatrix} -2\\4\\0\\5 \end{bmatrix}.$$
It follows that a basis for $\operatorname{null}(A) = \ker T$ is
$$\left\{ \begin{bmatrix} 6\\-2\\5\\0 \end{bmatrix}, \begin{bmatrix} -2\\4\\0\\5 \end{bmatrix} \right\}.$$

Remark 2.2.8.

The SymPy library for Python has built-in functions for computing nullspace and column space. But it’s probably worth your while to know how to determine these from the RREF of a matrix, without additional help from the computer. That said, let’s see how the computer’s output compares to what we found in Example 2.2.7.
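Here is one way that computation might look, as a sketch (assuming the sympy package is available, and using the matrix from Example 2.2.7):

    from sympy import Matrix

    A = Matrix([[1, 3, 0, -2],
                [2, 1, -2, 0],
                [1, 8, 2, -6]])   # the matrix from Example 2.2.7

    R, pivots = A.rref()     # R is the RREF of A; pivots lists the pivot columns
    print(R, pivots)         # leading ones in columns 0 and 1

    print(A.columnspace())   # basis for col(A): the first two columns of A
    print(A.nullspace())     # basis for null(A)

One small difference: SymPy scales each nullspace vector to have a 1 in a free-variable position, so its basis vectors are $1/5$ times the ones we found by hand.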
Note that the output from the computer simply states the basis for each space. Of course, for computational purposes, this is typically good enough.
An important result that comes out while trying to show that the “pivot columns” of a matrix (the ones that end up with leading ones in the RREF) are a basis for the column space is that the column rank (defined as the dimension of $\operatorname{col}(A)$) and the row rank (the dimension of the space spanned by the rows of $A$) are equal. One can therefore speak unambiguously about the rank of a matrix $A$, and it is just as it's defined in a first course in linear algebra: the number of leading ones in the RREF of $A$.
For a general linear transformation, we can't necessarily speak in terms of rows and columns, but if $T:V\to W$ is linear, and either $V$ or $W$ is finite-dimensional, then we can define the rank of $T$ as follows.

Definition 2.2.9.

Let $T:V\to W$ be a linear transformation. Then the rank of $T$ is defined by
$$\operatorname{rank} T = \dim \operatorname{im} T,$$
and the nullity of $T$ is defined by
$$\operatorname{nullity} T = \dim \ker T,$$
provided that the kernel and image of $T$ are finite-dimensional.
Note that if $W$ is finite-dimensional, then so is $\operatorname{im} T$, since it's a subspace of $W$. On the other hand, if $V$ is finite-dimensional, then we can find a basis $\{v_1,\ldots,v_n\}$ of $V$, and the set $\{T(v_1),\ldots,T(v_n)\}$ will span $\operatorname{im} T$, so again the image is finite-dimensional, and the rank of $T$ is finite. It is possible for either the rank or the nullity of a transformation to be infinite.
Knowing that the kernel and image of an operator are subspaces gives us an easy way to define subspaces. From the textbook, we have the following nice example.

Exercise 2.2.10.

Let $T:M_{n\times n}\to M_{n\times n}$ be defined by $T(A) = A - A^T$. Show that:

(a)

$T$ is a linear map.
Hint.
You can use the fact that the transpose is linear: $(A+B)^T = A^T + B^T$ and $(cA)^T = cA^T$.

(b)

$\ker T$ is equal to the set of all symmetric matrices.
Hint.
A matrix is symmetric if $A^T = A$, or in other words, if $A - A^T = 0$.

(c)

$\operatorname{im} T$ is equal to the set of all skew-symmetric matrices.
Hint.
A matrix is skew-symmetric if $A^T = -A$.
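Before writing the proofs, it may help to experiment. Here is a minimal sketch (assuming the sympy package; a sanity check on one example, not a proof) illustrating parts (b) and (c):

    from sympy import Matrix

    def T(A):
        # The map from Exercise 2.2.10
        return A - A.T

    S = Matrix([[1, 2], [2, 3]])   # symmetric, since S.T == S
    print(T(S))                    # the zero matrix, so S is in ker T

    B = Matrix([[1, 2], [3, 4]])   # not symmetric
    W = T(B)
    print(W.T == -W)               # True: T(B) is skew-symmetric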
Recall that a function $f:A\to B$ is injective (or one-to-one) if for any $x_1,x_2\in A$, $f(x_1)=f(x_2)$ implies that $x_1=x_2$. (In other words, no two different inputs give the same output.) We say that $f$ is surjective (or onto) if $f(A)=B$. (That is, if the range of $f$ is the entire codomain $B$.) These properties are important considerations for the discussion of inverse functions.
For a linear transformation $T$, the property of surjectivity is tied to $\operatorname{im} T$ by definition: $T:V\to W$ is onto if and only if $\operatorname{im} T = W$. What might not be immediately obvious is that the kernel tells us whether a linear transformation is injective.

Theorem 2.2.11.

Let $T:V\to W$ be a linear transformation. Then $T$ is injective if and only if $\ker T = \{0\}$.

Strategy.

We have an “if and only if” statement, so we have to make sure to consider both directions. The basic idea is this: we know that $0$ is always in the kernel, so if the kernel contains any other vector $v$, we would have $T(v)=T(0)=0$, and $T$ would not be injective.
There is one trick to keep in mind: the statement $T(v_1)=T(v_2)$ is equivalent to $T(v_1)-T(v_2)=0$, and since $T$ is linear, $T(v_1)-T(v_2)=T(v_1-v_2)$.

Proof.

Suppose $T$ is injective, and let $v\in\ker T$. Then $T(v)=0$. On the other hand, we know that $T(0)=0=T(v)$. Since $T$ is injective, we must have $v=0$. Conversely, suppose that $\ker T=\{0\}$ and that $T(v_1)=T(v_2)$ for some $v_1,v_2\in V$. Then
$$0 = T(v_1) - T(v_2) = T(v_1 - v_2),$$
so $v_1-v_2\in\ker T$. Therefore, we must have $v_1-v_2=0$, so $v_1=v_2$, and it follows that $T$ is injective.

Exercise 2.2.12.

Rearrange the blocks below to produce a valid proof of the following statement:
If $T:V\to W$ is injective and $\{v_1,v_2,\ldots,v_n\}$ is linearly independent in $V$, then $\{T(v_1),T(v_2),\ldots,T(v_n)\}$ is linearly independent in $W$.
Hint.
Remember that your goal is to show that $\{T(v_1),\ldots,T(v_n)\}$ is linearly independent, so after you assume your hypotheses, you should begin by setting a linear combination of these vectors equal to zero.

Exercise 2.2.13.

Rearrange the blocks below to produce a valid proof of the following statement:
If $T:V\to W$ is surjective and $V=\operatorname{span}\{v_1,\ldots,v_n\}$, then $W=\operatorname{span}\{T(v_1),\ldots,T(v_n)\}$.
Hint.
You need to show that any $w\in W$ can be written as a linear combination of the $T(v_i)$. Because $T$ is surjective, you know that $w=T(v)$ for some $v\in V$.

Remark 2.2.14.

For the case of a matrix transformation $T_A:\mathbb{R}^n\to\mathbb{R}^m$, notice that $\ker T_A$ is simply the set of all solutions to $Ax=0$, while $\operatorname{im} T_A$ is the set of all $y\in\mathbb{R}^m$ for which $Ax=y$ has a solution.
Recall from the discussion preceding Definition 2.2.9 that $\operatorname{rank} A = \dim\operatorname{col}(A) = \dim\operatorname{im} T_A$. It follows that $T_A$ is surjective if and only if $\operatorname{rank} A = m$. On the other hand, $T_A$ is injective if and only if $\operatorname{rank} A = n$, because we know that the system $Ax=0$ has a unique solution if and only if each column of $A$ contains a leading one.
This has some interesting consequences. If $m=n$ (that is, if $A$ is square), then each increase in $\dim\operatorname{null}(A)$ produces a corresponding decrease in $\dim\operatorname{col}(A)$, since both correspond to the “loss” of a leading one. Moreover, if $\operatorname{rank} A = n$, then $T_A$ is both injective and surjective. Recall that a function is invertible if and only if it is both injective and surjective. It should come as no surprise that invertibility of $T_A$ (as a function) is equivalent to invertibility of $A$ (as a matrix).
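To make these rank criteria concrete, here is a short sketch (assuming the sympy package; the matrix is an illustration of my own choosing, not one from the text):

    from sympy import Matrix

    A = Matrix([[1, 0],
                [0, 1],
                [0, 0]])   # m = 3, n = 2

    m, n = A.shape
    r = A.rank()           # the number of leading ones in the RREF; here r = 2

    print(r == n)          # True: T_A is injective (Ax = 0 forces x = 0)
    print(r == m)          # False: T_A is not surjective, since r <= n < m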
Also, note that if $m<n$, then $\operatorname{rank} A \le m < n$, so $T_A$ could be surjective, but can't possibly be injective. On the other hand, if $m>n$, then $\operatorname{rank} A \le n < m$, so $T_A$ could be injective, but can't possibly be surjective. These results generalize to linear transformations between any finite-dimensional vector spaces. The first step towards this is the following theorem, which is sometimes known as the Fundamental Theorem of Linear Transformations.

Theorem 2.2.15. Dimension Theorem.

Let $T:V\to W$ be a linear transformation, and suppose that $\ker T$ and $\operatorname{im} T$ are both finite-dimensional. Then $V$ is finite-dimensional, and
$$\dim V = \dim\ker T + \dim\operatorname{im} T.$$

Proof.

The trick with this proof is that we aren't assuming $V$ is finite-dimensional, so we can't start with a basis of $V$. But we do know that $\operatorname{im} T$ is finite-dimensional, so we start with a basis $\{w_1,\ldots,w_m\}$ of $\operatorname{im} T$. Of course, every vector in $\operatorname{im} T$ is the image of some vector in $V$, so we can write $w_i = T(v_i)$, where $v_i\in V$, for $i=1,2,\ldots,m$.
Since $\{T(v_1),\ldots,T(v_m)\}$ is a basis, it is linearly independent. The results of Exercise 2.1.1 tell us that the set $\{v_1,\ldots,v_m\}$ must therefore be independent.
We now introduce a basis $\{u_1,\ldots,u_n\}$ of $\ker T$, which we also know to be finite-dimensional. If we can show that the set $\{u_1,\ldots,u_n,v_1,\ldots,v_m\}$ is a basis for $V$, we'd be done, since the number of vectors in this basis is $\dim\ker T + \dim\operatorname{im} T$. We must therefore show that this set is independent, and that it spans $V$.
To see that it's independent, suppose that
$$a_1u_1 + \cdots + a_nu_n + b_1v_1 + \cdots + b_mv_m = 0.$$
Applying $T$ to this equation, and noting that $T(u_i)=0$ for each $i$, by definition of the $u_i$, we get
$$b_1T(v_1) + \cdots + b_mT(v_m) = 0.$$
We assumed that the vectors $T(v_i)$ were independent, so all the $b_i$ must be zero. But then we get
$$a_1u_1 + \cdots + a_nu_n = 0,$$
and since the $u_i$ are independent, all the $a_i$ must be zero.
To see that these vectors span, choose any $x\in V$. Since $T(x)\in\operatorname{im} T$, there exist scalars $c_1,\ldots,c_m$ such that
$$T(x) = c_1T(v_1) + \cdots + c_mT(v_m). \tag{2.2.1}$$
We'd like to be able to conclude from this that $x = c_1v_1 + \cdots + c_mv_m$, but this would be false unless $T$ were known to be injective (which it need not be). Failure to be injective involves the kernel, so how do we bring that into the picture?
The trick is to realize that the reason we might have $x \ne c_1v_1 + \cdots + c_mv_m$ is that we're off by something in the kernel. Indeed, (2.2.1) can be rewritten as
$$T(x - c_1v_1 - \cdots - c_mv_m) = 0,$$
so $x - c_1v_1 - \cdots - c_mv_m\in\ker T$. But we have a basis for $\ker T$, so we can write
$$x - c_1v_1 - \cdots - c_mv_m = t_1u_1 + \cdots + t_nu_n$$
for some scalars $t_1,\ldots,t_n$, and this can be rearranged to give
$$x = t_1u_1 + \cdots + t_nu_n + c_1v_1 + \cdots + c_mv_m,$$
which completes the proof.
This result is sometimes known as the Rank-Nullity Theorem, since it can be stated in the form
$$\dim V = \operatorname{rank} T + \operatorname{nullity} T.$$
We will see that this result is frequently useful for providing simple arguments that establish otherwise difficult results. A basic situation where the theorem is useful is this: we are given $T:V\to W$, where the dimensions of $V$ and $W$ are known. Since $\operatorname{im} T$ is a subspace of $W$, we know from Theorem 1.7.18 that $T$ is onto if and only if $\dim\operatorname{im} T = \dim W$. In many cases it is easier to compute $\ker T$ than it is to compute $\operatorname{im} T$, and the Dimension Theorem then lets us determine $\dim\operatorname{im} T$ from $\dim V$ and $\dim\ker T$.
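As a quick check, here is a sketch (assuming the sympy package) verifying the count for the matrix transformation of Example 2.2.7, where $\dim V = 4$:

    from sympy import Matrix

    A = Matrix([[1, 3, 0, -2],
                [2, 1, -2, 0],
                [1, 8, 2, -6]])    # the matrix from Example 2.2.7

    rank = len(A.columnspace())    # dim im T_A = 2
    nullity = len(A.nullspace())   # dim ker T_A = 2

    print(rank + nullity == A.shape[1])   # True: 2 + 2 = 4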

Exercise 2.2.16.

    Select all statements below that are true:
  • If $v\in\ker T$, then $v=0$.
  • Remember that $v\in\ker T$ implies $T(v)=0$; this does not necessarily mean $v=0$.
  • If $T:\mathbb{R}^4\to\mathbb{R}^6$ is injective, then it is surjective.
  • By the dimension theorem, the dimension of $\operatorname{im} T$ cannot be greater than 4, so $T$ can never be surjective.
  • If $T:\mathbb{R}^4\to P_3(\mathbb{R})$ is injective, then it is surjective.
  • Correct! If $T$ is injective, then $\dim\ker T = 0$, so by the dimension theorem, $\dim\operatorname{im} T = \dim\mathbb{R}^4 = 4$. Since $\dim P_3(\mathbb{R}) = 4$ as well, $\operatorname{im} T = P_3(\mathbb{R})$.
  • It is possible to have an injective linear transformation $T:\mathbb{R}^4\to\mathbb{R}^3$.
  • The maximum dimension of $\operatorname{im} T$ is 3, so the minimum dimension of $\ker T$ is 1.
  • If $T:V\to W$ is surjective, then $\dim V\ge\dim W$.
  • Correct! If $T$ is surjective, then $\operatorname{im} T = W$, so $\dim V = \dim\ker T + \dim\operatorname{im} T = \dim\ker T + \dim W \ge \dim W$.
A useful consequence of this result is that if we know $V$ is finite-dimensional, we can choose a basis of $V$ ordered so that the first vectors in the list form a basis of $\ker T$, while the images of the remaining vectors produce a basis of $\operatorname{im} T$.
Another consequence of the dimension theorem is that we must have
$$\dim\ker T \le \dim V \quad\text{and}\quad \dim\operatorname{im} T \le \dim V.$$
Of course, we must also have $\dim\operatorname{im} T \le \dim W$, since $\operatorname{im} T$ is a subspace of $W$. In the case of a matrix transformation $T_A$, this means that the rank of $T_A$ is at most the minimum of $\dim V$ and $\dim W$. This once again has consequences for the existence and uniqueness of solutions for linear systems with the coefficient matrix $A$.
We end with an exercise that is both challenging and useful. Do your best to come up with a proof before looking at the solution.

Exercise 2.2.17.

Let V and W be finite-dimensional vector spaces. Prove the following:

(a)

$\dim V \le \dim W$ if and only if there exists an injection $T:V\to W$.
Hint.
You’re dealing with an “if and only if” statement, so be sure to prove both directions. One direction follows immediately from the dimension theorem.
What makes the other direction harder is that you need to prove an existence statement. To show that a transformation with the required property exists, you’re going to need to construct it! To do so, try defining your transformation in terms of a basis.

(b)

$\dim V \ge \dim W$ if and only if there exists a surjection $T:V\to W$.
Hint.
The hint from the previous part also applies here!

Exercises

1.

Let
$$A = \begin{bmatrix} 1 & 4 & 4\\ 1 & 2 & 3\\ 0 & 2 & 1\\ 1 & 2 & 3 \end{bmatrix}.$$
Find a basis for the image of $A$ (or, equivalently, for the linear transformation $T(x)=Ax$).

2.

Let
$$A = \begin{bmatrix} 4 & 1 & 4\\ 1 & 2 & 1\\ 1 & 2 & 1\\ 8 & 2 & 8 \end{bmatrix}.$$
Find the dimensions of the kernel and image of $T(x)=Ax$.

3.

Let
A=[1225012239021].
Find a vector $w$ in $\mathbb{R}^3$ that is not in the image of the transformation $x\mapsto Ax$.

4.

Suppose $A\in M_{2,3}(\mathbb{R})$ is a matrix and
$$Ae_1 = \begin{bmatrix} 3\\2 \end{bmatrix}, \quad Ae_2 = \begin{bmatrix} 3\\1 \end{bmatrix}, \quad Ae_3 = \begin{bmatrix} 3\\0 \end{bmatrix}.$$
  1. What is $A(5e_1 - 3e_2 - 4e_3)$?
  2. Find the matrix for the linear transformation $f$ (relative to the standard basis in the domain and codomain). That is, find the matrix $A$ such that $f(x)=Ax$.
  3. Find a formula for the linear transformation f.
  4. Find bases (i.e., maximal independent sets) for the kernel and image of f.
  5. The linear transformation f is:
    • injective
    • surjective
    • bijective
    • none of these

5.

Suppose $f:\mathbb{R}^2\to\mathbb{R}^3$ is the function defined by $f(x,y) = \begin{bmatrix} 5x+3y\\ 4y+3x\\ 3x \end{bmatrix}$.
  1. What is $f(2,5)$? Enter your answer as a coordinate vector of the form $\langle 1,2,3\rangle$.
  2. If $f$ is a linear transformation, find the matrix $A$ such that $f(x)=Ax$.
  3. Find bases (i.e., minimal spanning sets) for the kernel and image of f.
  4. The linear transformation f is:
    • injective
    • surjective
    • bijective
    • none of these

6.

Let $f:\mathbb{R}^2\to\mathbb{R}^2$ be the linear transformation defined by $f(x,y) = \langle 5x+y,\, 3x-3y\rangle$.
  1. Find the matrix of the linear transformation f.
  2. The linear transformation f is
    • injective
    • surjective
    • bijective
    • none of these
  3. If f is bijective, find the matrix of its inverse.

7.

Let $f:\mathbb{R}^2\to\mathbb{R}^3$ be the linear transformation determined by
$$f\begin{pmatrix} 1\\0 \end{pmatrix} = \begin{pmatrix} 4\\3\\1 \end{pmatrix}, \qquad f\begin{pmatrix} 0\\1 \end{pmatrix} = \begin{pmatrix} 3\\5\\2 \end{pmatrix}.$$
  1. Find $f\begin{pmatrix} 8\\9 \end{pmatrix}$.
  2. Find the matrix of the linear transformation f.
  3. The linear transformation f is
    • injective
    • surjective
    • bijective
    • none of these

8.

Let $f:\mathbb{R}^2\to\mathbb{R}^3$ be the linear transformation determined by
$$f(e_1) = \begin{pmatrix} 9\\12\\3 \end{pmatrix}, \qquad f(e_2) = \begin{pmatrix} 15\\20\\5 \end{pmatrix}.$$
  1. Find $f\begin{pmatrix} 4\\1 \end{pmatrix}$.
  2. Find the matrix of the linear transformation f.
  3. The linear transformation f is
    • injective
    • surjective
    • bijective
    • none of these

9.

A linear transformation $T:\mathbb{R}^3\to\mathbb{R}^2$ whose matrix is
$$A = \begin{bmatrix} 1 & 1 & 1\\ 3 & 3 & 0+k \end{bmatrix}$$
is onto if and only if $k \ne$ ____.

10.

Let $L:\mathbb{R}^3\to\mathbb{R}^3$ be the linear operator defined by
$$L(x) = \begin{bmatrix} 12x_1 - 24x_2 + 12x_3\\ 12x_2\\ 36x_1 + 36x_3 \end{bmatrix}.$$
(a) Find the dimension of the range of $L$:
(b) Find the dimension of the kernel of $L$:
(c) Let $S$ be the subspace of $\mathbb{R}^3$ spanned by $11e_1$ and $24e_2 + 24e_3$. Find the dimension of $L(S)$:

11.

Let $P_n$ be the vector space of all polynomials of degree $n$ or less in the variable $x$. Let $D^2:P_4\to P_2$ be the linear transformation that takes a polynomial to its second derivative. That is, $D^2(p(x)) = p''(x)$ for any polynomial $p(x)$ of degree 4 or less.
  • Find a basis for the kernel of $D^2$.
  • Find a basis for the image of $D^2$.
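If you'd like to experiment before answering, the following sketch (assuming the sympy package; the generic polynomial is an illustration, not part of the exercise) applies $D^2$ to a general element of $P_4$:

    from sympy import symbols, diff

    x, a0, a1, a2, a3, a4 = symbols('x a0 a1 a2 a3 a4')

    p = a0 + a1*x + a2*x**2 + a3*x**3 + a4*x**4   # a general element of P_4
    print(diff(p, x, 2))   # 2*a2 + 6*a3*x + 12*a4*x**2

Notice which coefficients survive the second derivative and which do not; this is a strong hint for both parts.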