
Section 3.1 Orthogonal sets of vectors

You may recall from elementary linear algebra, or a calculus class, that vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$ are considered to be quantities with both magnitude and direction. Interestingly enough, neither of these properties is inherent to a general vector space. The vector space axioms specify only algebra; they say nothing about geometry. (What, for example, should be the “angle” between two polynomials?)
Because vector algebra is often introduced as a consequence of geometry (like the “tip-to-tail” rule), you may not have thought all that carefully about what, exactly, is responsible for making the connection between algebra and geometry. It turns out that the missing link is the humble dot product.
You probably encountered the following result, perhaps as a consequence of the law of cosines: for any two vectors $u, v \in \mathbb{R}^2$,
\[ u \cdot v = \|u\| \|v\| \cos\theta, \]
where $\theta$ is the angle between $u$ and $v$. Here we see both magnitude and direction (encoded by the angle) defined in terms of the dot product.
While it is possible to generalize the idea of the dot product to something called an inner product, we will first focus on the basic dot product in $\mathbb{R}^n$. Once we have a good understanding of things in that setting, we can move on to consider the abstract counterpart.

Subsection 3.1.1 Basic definitions and properties

For most of this chapter (primarily for typographical reasons) we will denote elements of $\mathbb{R}^n$ as ordered $n$-tuples $(x_1, \ldots, x_n)$ rather than as column vectors.

Definition 3.1.1.

Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ be vectors in $\mathbb{R}^n$. The dot product of $x$ and $y$, denoted by $x \cdot y$, is the scalar defined by
\[ x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n. \]
The norm of a vector $x$ is denoted $\|x\|$ and defined by
\[ \|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}. \]
Note that both the dot product and the norm produce scalars. Through the Pythagorean Theorem, we recognize the norm as the length of $x$. The dot product can still be thought of as measuring the angle between vectors, although the simple geometric proof used in two dimensions is not that easily translated to $n$ dimensions. At the very least, the dot product lets us extend the notion of right angles to higher dimensions.
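The two definitions above translate directly into code. The following is a minimal sketch in plain Python; the function names `dot` and `norm` are our own choices for illustration, and vectors are assumed to be tuples of numbers of equal length.

```python
import math

def dot(x, y):
    """Dot product: the sum of the products of corresponding components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm: the square root of the dot product of x with itself."""
    return math.sqrt(dot(x, x))

x = (1, 2, 2)
y = (2, 0, 1)
print(dot(x, y))  # 1*2 + 2*0 + 2*1 = 4
print(norm(x))    # sqrt(1 + 4 + 4) = 3.0
```

Both functions return a single number, in line with the observation that the dot product and the norm produce scalars.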

Definition 3.1.2.

We say that two vectors $x, y \in \mathbb{R}^n$ are orthogonal if $x \cdot y = 0$.
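It should be no surprise that all the familiar properties of the dot product work just as well in any dimension. The following properties can be confirmed by direct computation, so the proof is left as an exercise.

Theorem 3.1.3.

For any vectors $x, y, z \in \mathbb{R}^n$ and any scalar $c$:
  1. $x \cdot y = y \cdot x$
  2. $x \cdot (y + z) = x \cdot y + x \cdot z$
  3. $(cx) \cdot y = x \cdot (cy) = c(x \cdot y)$
  4. $x \cdot x \geq 0$, and $x \cdot x = 0$ if and only if $x = 0$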

Remark 3.1.4.

The above properties, when properly abstracted, become the defining properties of a (real) inner product. (A complex inner product also involves complex conjugates.) For a general inner product, the requirement $x \cdot x \geq 0$ is referred to as being positive-definite, and the property that only the zero vector produces zero when dotted with itself is called nondegenerate. Note that we have the following connection between norm and dot product:
\[ \|x\|^2 = x \cdot x. \]
For a general inner product, this can be used as a definition of the norm associated to an inner product.

Exercise 3.1.5.

Show that for any vectors $x, y \in \mathbb{R}^n$, we have
\[ \|x + y\|^2 = \|x\|^2 + 2 x \cdot y + \|y\|^2. \]
Hint.
Use properties of the dot product to expand and simplify.

Exercise 3.1.6.

Suppose $\mathbb{R}^n = \operatorname{span}\{v_1, v_2, \ldots, v_k\}$. Prove that $x = 0$ if and only if $x \cdot v_i = 0$ for each $i = 1, 2, \ldots, k$.
Hint.
Don’t forget to prove both directions! Note that the hypothesis allows you to write $x$ as a linear combination of the $v_i$.
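There are two important inequalities associated to the dot product and norm. We state them both in the following theorem, without proof.

Theorem 3.1.7.

Let $x, y$ be any vectors in $\mathbb{R}^n$. Then
  1. $|x \cdot y| \leq \|x\| \|y\|$
  2. $\|x + y\| \leq \|x\| + \|y\|$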
The first of the above inequalities is called the Cauchy-Schwarz inequality, which can be viewed as a manifestation of the formula
\[ x \cdot y = \|x\| \|y\| \cos\theta, \]
since after all, $|\cos\theta| \leq 1$ for any angle $\theta$.
The usual proof involves some algebraic trickery; the interested reader is invited to search online for the Cauchy-Schwarz inequality, where they will find no shortage of websites offering proofs.
The second result, called the triangle inequality, follows immediately from the Cauchy-Schwarz inequality and Exercise 3.1.5:
\[ \|x + y\|^2 = \|x\|^2 + 2 x \cdot y + \|y\|^2 \leq \|x\|^2 + 2\|x\| \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2. \]
Taking square roots of both sides gives $\|x + y\| \leq \|x\| + \|y\|$.
The triangle inequality gets its name from the “tip-to-tail” picture for vector addition. Essentially, it tells us that the length of any side of a triangle must be less than the sum of the lengths of the other two sides. The importance of the triangle inequality is that it tells us that the norm can be used to define distance.
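As a quick sanity check (not a proof), the Cauchy-Schwarz inequality $|x \cdot y| \leq \|x\| \|y\|$ and the triangle inequality $\|x + y\| \leq \|x\| + \|y\|$ can be tested numerically. The sketch below uses two vectors in $\mathbb{R}^4$ chosen arbitrarily for illustration; the helper names `dot`, `norm`, and `add` are assumptions of this example.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(x, x))

def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(xi + yi for xi, yi in zip(x, y))

# Two arbitrary sample vectors in R^4.
x = (1, -2, 3, 0)
y = (4, 1, -1, 2)

# Cauchy-Schwarz: |x . y| <= ||x|| ||y||
print(abs(dot(x, y)) <= norm(x) * norm(y))      # True

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
print(norm(add(x, y)) <= norm(x) + norm(y))     # True
```

Checking a few examples like this can catch a misremembered inequality, but only the algebraic argument establishes the result for all vectors.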

Definition 3.1.8.

For any vectors $x, y \in \mathbb{R}^n$, the distance from $x$ to $y$ is denoted $d(x, y)$, and defined as
\[ d(x, y) = \|x - y\|. \]

Remark 3.1.9.

Using properties of the norm, we can show that this distance function meets the criteria of what’s called a metric. A metric is any function that takes a pair of vectors (or points) as input, and returns a number as output, with the following properties:
  1. $d(x, y) = d(y, x)$ for any $x, y$
  2. $d(x, y) \geq 0$, and $d(x, y) = 0$ if and only if $x = y$
  3. $d(x, y) \leq d(x, z) + d(z, y)$ for any $x, y, z$
We leave it as an exercise to confirm that the distance function defined above is a metric.
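Before attempting the proof, it can help to see the three metric properties hold on a concrete example. The following sketch assumes the distance $d(x, y) = \|x - y\|$ defined above; the function name `distance` and the sample points are illustrative choices only.

```python
import math

def distance(x, y):
    """Distance between x and y: the norm of the difference x - y."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Three arbitrary sample points in R^3.
x, y, z = (1, 0, 2), (4, 4, 2), (0, 1, 1)

print(distance(x, y))                                      # sqrt(9 + 16 + 0) = 5.0
print(distance(x, y) == distance(y, x))                    # symmetry: True
print(distance(x, x) == 0)                                 # zero distance to itself: True
print(distance(x, y) <= distance(x, z) + distance(z, y))   # triangle inequality: True
```

A numerical check on one triple of points does not replace the exercise; the general argument uses the properties of the norm.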
In more advanced courses (e.g. topology or analysis) you might go into detailed study of these structures. There are three interrelated structures: inner products, norms, and metrics. You might consider questions like: does every norm come from an inner product? Does every metric come from a norm? (In both cases, the answer is no.) Things get even more interesting for infinite-dimensional spaces. Of special interest are spaces such as Hilbert spaces (a special type of infinite-dimensional inner product space) and Banach spaces (a special type of infinite-dimensional normed space).

Exercise 3.1.10.

    Select all vectors that are orthogonal to the vector $(2, 1, -3)$.
  • $(1, 1, 1)$
    Yes! $2(1) + 1(1) - 3(1) = 0$.
  • $(3, 1, 2)$
    You should find that the dot product is 1, not 0, so these vectors are not orthogonal.
  • $(0, 0)$
    You might be tempted to say that the zero vector is orthogonal to everything, but we can’t compare vectors from different vector spaces!
  • $(0, -3, -1)$
    Yes! We have to be careful of signs here: $2(0) + 1(-3) + (-3)(-1) = 0 - 3 + 3 = 0$.

Exercise 3.1.11.

    If $u$ is orthogonal to $v$ and $v$ is orthogonal to $w$, then $u$ is orthogonal to $w$.
  • True.
  • False.
Hint.
Consider $u = (1, 0, 0)$, $v = (0, 1, 0)$, and $w = (1, 0, 1)$.

Subsection 3.1.2 Orthogonal sets of vectors

In Chapter 1, we learned that linear independence and span are important concepts associated to a set of vectors. In this chapter, we learn what it means for a set of vectors to be orthogonal, and try to understand why this concept is just as important as independence and span.

Definition 3.1.12.

A set of vectors $\{v_1, v_2, \ldots, v_k\}$ in $\mathbb{R}^n$ is called orthogonal if:
  • $v_i \neq 0$ for each $i = 1, 2, \ldots, k$
  • $v_i \cdot v_j = 0$ for all $i \neq j$
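The two conditions in this definition can be checked mechanically. Here is a minimal sketch; the function name `is_orthogonal_set` is hypothetical, and the exact comparison with zero assumes integer (or other exact) entries. With floating-point entries, a small tolerance would be needed instead.

```python
from itertools import combinations

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def is_orthogonal_set(vectors):
    """Return True if every vector is nonzero and every pair of
    distinct vectors in the list has dot product zero."""
    # Condition 1: no vector in the set is the zero vector.
    if any(all(c == 0 for c in v) for v in vectors):
        return False
    # Condition 2: each pair of different vectors is orthogonal.
    return all(dot(v, w) == 0 for v, w in combinations(vectors, 2))

print(is_orthogonal_set([(1, 1, 0), (1, -1, 0), (0, 0, 2)]))  # True
print(is_orthogonal_set([(1, 0), (0, 0)]))                    # False: contains the zero vector
print(is_orthogonal_set([(1, 1), (1, 0)]))                    # False: dot product is 1
```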

Exercise 3.1.13.

Show that the following is an orthogonal subset of $\mathbb{R}^4$.
\[ \{(1, 0, 1, 0), (-1, 0, 1, 1), (1, 1, -1, 2)\} \]
Can you find a fourth vector that is orthogonal to each vector in this set?
Hint.
The dot product of the fourth vector with each vector above must be zero. Can you turn this requirement into a system of equations?

Exercise 3.1.14.

    If $\{v, w\}$ and $\{x, y\}$ are orthogonal sets of vectors in $\mathbb{R}^n$, then $\{v, w, x, y\}$ is an orthogonal set of vectors.
  • True.
  • False.
Hint.
Try to construct an example. The vector $x$ has to be orthogonal to $y$, but is there any reason it has to be orthogonal to $v$ or $w$?
The requirement that the vectors in an orthogonal set be nonzero is partly because the alternative would be boring, and partly because it lets us state the following theorem.

Theorem 3.1.15.

Any orthogonal set of vectors is linearly independent.

Strategy.

Any proof of linear independence should start by defining our set of vectors, and assuming that a linear combination of these vectors is equal to the zero vector, with the goal of showing that the scalars have to be zero.
Set up the equation (say, $c_1 v_1 + \cdots + c_k v_k = 0$), with the assumption that your set of vectors is orthogonal. What happens if you take the dot product of both sides with one of these vectors?

Proof.

Suppose $S = \{v_1, v_2, \ldots, v_k\}$ is orthogonal, and suppose
\[ c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = 0 \]
for scalars $c_1, c_2, \ldots, c_k$. Taking the dot product of both sides of the above equation with $v_1$ gives
\begin{align*}
c_1 (v_1 \cdot v_1) + c_2 (v_1 \cdot v_2) + \cdots + c_k (v_1 \cdot v_k) &= v_1 \cdot 0\\
c_1 \|v_1\|^2 + 0 + \cdots + 0 &= 0.
\end{align*}
Since $\|v_1\|^2 \neq 0$, we must have $c_1 = 0$. We similarly find that all the remaining scalars are zero by taking the dot product with $v_2, \ldots, v_k$.
Another useful consequence of orthogonality: in two dimensions, we have the Pythagorean Theorem for right-angled triangles. If the “legs” of the triangle are identified with vectors $x$ and $y$, and the hypotenuse with $z$, then $\|x\|^2 + \|y\|^2 = \|z\|^2$, since $x \cdot y = 0$.
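In $n$ dimensions, we have the following, which follows from the fact that all “cross terms” (dot products of different vectors) will vanish.

Theorem 3.1.16.

If the set $\{x_1, x_2, \ldots, x_k\}$ is orthogonal, then
\[ \|x_1 + x_2 + \cdots + x_k\|^2 = \|x_1\|^2 + \|x_2\|^2 + \cdots + \|x_k\|^2. \]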

Strategy.

Remember that
\[ \|x_1 + \cdots + x_k\|^2 = (x_1 + \cdots + x_k) \cdot (x_1 + \cdots + x_k), \]
and use the distributive property of the dot product, along with the fact that each pair of different vectors is orthogonal.
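The identity being proved here is $\|x_1 + \cdots + x_k\|^2 = \|x_1\|^2 + \cdots + \|x_k\|^2$ for an orthogonal set $\{x_1, \ldots, x_k\}$. The sketch below checks it on one small orthogonal set chosen for illustration (the helper name `vector_sum` is our own); it works with squared norms so that all arithmetic stays in the integers.

```python
def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def vector_sum(vectors):
    """Componentwise sum of a list of vectors of equal length."""
    return tuple(sum(components) for components in zip(*vectors))

# An orthogonal set in R^3: each pair has dot product zero.
vectors = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]

total = vector_sum(vectors)
lhs = dot(total, total)                  # squared norm of the sum
rhs = sum(dot(v, v) for v in vectors)    # sum of the squared norms
print(total, lhs, rhs)                   # (2, 0, 2) 8 8
```

If the vectors are not orthogonal, the cross terms survive: for $(1, 0)$ and $(1, 1)$ the squared norm of the sum is 5, while the squared norms add up to 3.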
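Our final initial result about orthogonal sets of vectors relates to span. In general, we know that if $y \in \operatorname{span}\{x_1, \ldots, x_k\}$, then it is possible to solve for scalars $c_1, \ldots, c_k$ such that $y = c_1 x_1 + \cdots + c_k x_k$. The trouble is that finding these scalars generally involves setting up, and then solving, a system of linear equations. The great thing about orthogonal sets of vectors is that we can provide explicit formulas for the scalars.

Theorem 3.1.17.

Let $\{v_1, v_2, \ldots, v_k\}$ be an orthogonal set of vectors. For any $y \in \operatorname{span}\{v_1, \ldots, v_k\}$, we have
\[ y = \frac{y \cdot v_1}{\|v_1\|^2} v_1 + \frac{y \cdot v_2}{\|v_2\|^2} v_2 + \cdots + \frac{y \cdot v_k}{\|v_k\|^2} v_k. \]
This formula is known as the Fourier expansion of $y$.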

Strategy.

Take the same approach you used in the proof of Theorem 3.1.15, but this time, with a nonzero vector on the right-hand side.

Proof.

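Let $y = c_1 v_1 + \cdots + c_k v_k$. Taking the dot product of both sides of this equation with $v_i$ gives
\[ v_i \cdot y = c_i (v_i \cdot v_i), \]
since the dot product of $v_i$ with $v_j$ for $i \neq j$ is zero. Because $v_i \neq 0$, we have $v_i \cdot v_i = \|v_i\|^2 \neq 0$, so we can solve for $c_i = \dfrac{v_i \cdot y}{\|v_i\|^2}$.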
One use of Theorem 3.1.17 is determining whether or not a given vector is in the span of an orthogonal set. If it is in the span, then its coefficients must satisfy the Fourier expansion formula. Therefore, if we compute the right hand side of the above formula and do not get our original vector, then that vector must not be in the span.
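The span test just described can be sketched in code. For an orthogonal set $\{v_1, \ldots, v_k\}$, the candidate coefficients are $c_i = (y \cdot v_i)/\|v_i\|^2$; we rebuild $c_1 v_1 + \cdots + c_k v_k$ and compare it with $y$. The function names below are hypothetical, the vectors are assumed to have integer entries so that `Fraction` gives exact arithmetic, and the basis is assumed to have already been confirmed orthogonal.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def fourier_coefficients(y, basis):
    """Coefficients (y . v) / (v . v) for each v in an orthogonal set.
    Assumes integer entries, so each coefficient is an exact fraction."""
    return [Fraction(dot(y, v), dot(v, v)) for v in basis]

def in_span_of_orthogonal_set(y, basis):
    """Rebuild the candidate expansion and compare it with y."""
    coeffs = fourier_coefficients(y, basis)
    rebuilt = [sum(c * v[i] for c, v in zip(coeffs, basis))
               for i in range(len(y))]
    return rebuilt == list(y)

# An illustrative orthogonal set in R^3 (not the one from the exercises).
basis = [(1, 1, 0), (1, -1, 0)]
print(in_span_of_orthogonal_set((3, 1, 0), basis))  # True: (3,1,0) = 2(1,1,0) + 1(1,-1,0)
print(in_span_of_orthogonal_set((3, 1, 5), basis))  # False: the third component cannot be matched
```

Exact fractions are used here so that the final comparison is not affected by rounding; with floating-point entries one would compare up to a tolerance instead.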

Exercise 3.1.18.

Determine whether or not the vectors $v = (1, -4, 3, -11)$, $w = (3, 1, -4, 2)$ belong to the span of the vectors $x_1 = (1, 0, 1, 0)$, $x_2 = (-1, 0, 1, 1)$, $x_3 = (1, 1, -1, 2)$.
(We confirmed that $\{x_1, x_2, x_3\}$ is an orthogonal set in Exercise 3.1.13.)
The Fourier expansion is especially simple if our basis vectors have norm one, since the denominators in each coefficient disappear. Recall that a unit vector in $\mathbb{R}^n$ is any vector $x$ with $\|x\| = 1$. For any nonzero vector $v$, a unit vector (that is, a vector of norm one) in the direction of $v$ is given by
\[ \hat{u} = \frac{1}{\|v\|} v. \]
We often say that the vector $\hat{u}$ is normalized. (The convention of using a “hat” for unit vectors is common but not universal.)
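Normalizing a vector amounts to dividing each component by the norm. A minimal sketch, with `normalize` as an illustrative function name; the zero vector is rejected because it has no direction.

```python
import math

def normalize(v):
    """Return the unit vector in the direction of the nonzero vector v."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(c / length for c in v)

u = normalize((3, 0, 4))
print(u)  # (0.6, 0.0, 0.8), since the norm of (3, 0, 4) is 5
```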

Exercise 3.1.19.

Definition 3.1.20.

A basis $B$ of $\mathbb{R}^n$ is called an orthonormal basis if $B$ is orthogonal, and all the vectors in $B$ are unit vectors.

Example 3.1.21.

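In Exercise 3.1.13 we saw that the set
\[ \{(1, 0, 1, 0), (-1, 0, 1, 1), (1, 1, -1, 2), (1, -6, -1, 2)\} \]
is orthogonal. Since it’s orthogonal, it must be independent, and since it’s a set of four independent vectors in $\mathbb{R}^4$, it must be a basis. To get an orthonormal basis, we normalize each vector:
\begin{align*}
\hat{u}_1 &= \frac{1}{\sqrt{1^2 + 0^2 + 1^2 + 0^2}} (1, 0, 1, 0) = \frac{1}{\sqrt{2}} (1, 0, 1, 0)\\
\hat{u}_2 &= \frac{1}{\sqrt{(-1)^2 + 0^2 + 1^2 + 1^2}} (-1, 0, 1, 1) = \frac{1}{\sqrt{3}} (-1, 0, 1, 1)\\
\hat{u}_3 &= \frac{1}{\sqrt{1^2 + 1^2 + (-1)^2 + 2^2}} (1, 1, -1, 2) = \frac{1}{\sqrt{7}} (1, 1, -1, 2)\\
\hat{u}_4 &= \frac{1}{\sqrt{1^2 + (-6)^2 + (-1)^2 + 2^2}} (1, -6, -1, 2) = \frac{1}{\sqrt{42}} (1, -6, -1, 2).
\end{align*}
The set $\{\hat{u}_1, \hat{u}_2, \hat{u}_3, \hat{u}_4\}$ is then an orthonormal basis of $\mathbb{R}^4$.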
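The process of creating unit vectors does typically introduce square root coefficients in our vectors. This can seem undesirable, but there remains value in having an orthonormal basis. For example, suppose we wanted to write the vector $v = (3, 5, -1, 2)$ in terms of our basis. We can quickly compute
\begin{align*}
v \cdot \hat{u}_1 &= \frac{3}{\sqrt{2}} - \frac{1}{\sqrt{2}} = \sqrt{2}\\
v \cdot \hat{u}_2 &= -\frac{3}{\sqrt{3}} - \frac{1}{\sqrt{3}} + \frac{2}{\sqrt{3}} = -\frac{2}{\sqrt{3}}\\
v \cdot \hat{u}_3 &= \frac{3}{\sqrt{7}} + \frac{5}{\sqrt{7}} + \frac{1}{\sqrt{7}} + \frac{4}{\sqrt{7}} = \frac{13}{\sqrt{7}}\\
v \cdot \hat{u}_4 &= \frac{3}{\sqrt{42}} - \frac{30}{\sqrt{42}} + \frac{1}{\sqrt{42}} + \frac{4}{\sqrt{42}} = -\frac{22}{\sqrt{42}},
\end{align*}
and so
\[ v = \sqrt{2}\, \hat{u}_1 - \frac{2}{\sqrt{3}} \hat{u}_2 + \frac{13}{\sqrt{7}} \hat{u}_3 - \frac{22}{\sqrt{42}} \hat{u}_4. \]
There’s still work to be done, but it is comparatively simpler than solving the corresponding system of equations.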
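The expansion in this example can be double-checked numerically. The sketch below normalizes the four orthogonal vectors from the example, computes the coefficients $v \cdot \hat{u}_i$, and rebuilds $v$ from them. The helper names are our own, and because square roots force floating-point arithmetic, the comparison uses `math.isclose` with a small tolerance rather than exact equality.

```python
import math

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def normalize(v):
    """Unit vector in the direction of the nonzero vector v."""
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

# The orthogonal basis of R^4 from the example, then its normalized version.
orthogonal_basis = [(1, 0, 1, 0), (-1, 0, 1, 1), (1, 1, -1, 2), (1, -6, -1, 2)]
orthonormal_basis = [normalize(b) for b in orthogonal_basis]

v = (3, 5, -1, 2)

# With an orthonormal basis, each coefficient is just a dot product.
coefficients = [dot(v, u) for u in orthonormal_basis]

# Rebuild v as the linear combination of the basis vectors.
rebuilt = tuple(
    sum(c * u[i] for c, u in zip(coefficients, orthonormal_basis))
    for i in range(len(v))
)

print(all(math.isclose(r, vi, abs_tol=1e-12) for r, vi in zip(rebuilt, v)))  # True
```

No linear system is solved anywhere in this computation, which is the practical payoff of working with an orthonormal basis.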

Exercises 3.1.3 Exercises

1.

Let $\{\vec{e}_1, \vec{e}_2, \vec{e}_3, \vec{e}_4, \vec{e}_5, \vec{e}_6\}$ be the standard basis in $\mathbb{R}^6$. Find the length of the vector $\vec{x} = 5\vec{e}_1 + 2\vec{e}_2 + 3\vec{e}_3 - 3\vec{e}_4 - 2\vec{e}_5 - 3\vec{e}_6$.

2.

Find the norm of $\vec{x}$ and the unit vector $\vec{u}$ in the direction of $\vec{x}$ if $\vec{x} = \begin{bmatrix} 5 \\ 2 \\ -2 \\ -3 \end{bmatrix}$.

3.

Given that $\|x\| = 2$, $\|y\| = 1$, and $x \cdot y = 5$, compute $(5x - 3y) \cdot (x + 5y)$.
Hint.
Use properties of the dot product to expand and simplify.

4.

Let $u_1, u_2, u_3$ be an orthonormal basis for an inner product space $V$. If
\[ v = a u_1 + b u_2 + c u_3 \]
is such that $\|v\| = 26$, $v$ is orthogonal to $u_3$, and $\langle v, u_2 \rangle = -26$, find the possible values for $a$, $b$, and $c$.

5.

Find two linearly independent vectors perpendicular to the vector $\vec{v} = \begin{bmatrix} -2 \\ 5 \\ 7 \end{bmatrix}$.