
Let f = f(x) and g = g(x) be two functions in C[a, b] and define

⟨f, g⟩ = ∫_a^b f(x) g(x) dx    (6)

This is well-defined since the functions in C[a, b] are continuous. We shall show that this formula defines an inner product on C[a, b] by verifying the four inner product axioms for functions f = f(x), g = g(x), and h = h(x) in C[a, b]:

1. ⟨f, g⟩ = ∫_a^b f(x) g(x) dx = ∫_a^b g(x) f(x) dx = ⟨g, f⟩

which proves that Axiom 1 holds.

2. ⟨f + g, h⟩ = ∫_a^b [f(x) + g(x)] h(x) dx = ∫_a^b f(x) h(x) dx + ∫_a^b g(x) h(x) dx = ⟨f, h⟩ + ⟨g, h⟩

which proves that Axiom 2 holds.

3. ⟨kf, g⟩ = ∫_a^b k f(x) g(x) dx = k ∫_a^b f(x) g(x) dx = k⟨f, g⟩

which proves that Axiom 3 holds.

4. If f = f(x) is any function in C[a, b], then f²(x) ≥ 0 for all x in [a, b]; therefore,

⟨f, f⟩ = ∫_a^b f²(x) dx ≥ 0

Further, because f²(x) ≥ 0 and f² is continuous on [a, b], it follows that ∫_a^b f²(x) dx = 0 if and only if f(x) = 0 for all x in [a, b]. Therefore, we have ⟨f, f⟩ ≥ 0 with ⟨f, f⟩ = 0 if and only if f = 0. This proves that Axiom 4 holds.

Calculus Required

EXAMPLE 10 Norm of a Vector in C[a, b]

If C[a, b] has the inner product defined in the preceding example, then the norm of a function f = f(x) relative to this inner product is

‖f‖ = ⟨f, f⟩^(1/2) = ( ∫_a^b f²(x) dx )^(1/2)    (7)

which on squaring yields

‖f‖² = ∫_a^b f²(x) dx

and the unit sphere in this space consists of all functions f in C[a, b] that satisfy the equation ∫_a^b f²(x) dx = 1.
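For readers who like to experiment, the following Python sketch evaluates the inner product in Formula 6 and the norm in Formula 7 numerically. The interval [0, 1] and the functions f(x) = x and g(x) = x² are illustrative assumptions made here, not choices taken from the text.

```python
# A numerical sketch of Formulas (6) and (7), assuming [a, b] = [0, 1]
# and the sample functions f(x) = x, g(x) = x**2 (illustrative only).
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 1.0
f = lambda x: x
g = lambda x: x**2

def inner(p, q):
    """<p, q> = integral of p(x) q(x) over [a, b]  (Formula 6)."""
    value, _ = quad(lambda x: p(x) * q(x), a, b)
    return value

def norm(p):
    """||p|| = sqrt(<p, p>)  (Formula 7)."""
    return np.sqrt(inner(p, p))

print(inner(f, g), inner(g, f))   # symmetry: both are 0.25
print(norm(f))                    # sqrt(1/3) ≈ 0.577
```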

Calculus Required

Remark Since polynomials are continuous functions on (−∞, ∞), they are continuous on any closed interval [a, b]. Thus,
for all such intervals the vector space P_n is a subspace of C[a, b], and Formula 6 defines an inner product on P_n.

Calculus Required

Remark Recall from calculus that the arc length of a curve y = f(x) over an interval [a, b] is given by the formula

L = ∫_a^b √(1 + [f′(x)]²) dx    (8)

Do not confuse this concept of arc length with ‖f‖, which is the length (norm) of f when f is viewed as a vector in C[a, b]. Formulas 7 and 8 are quite different.
The following theorem lists some basic algebraic properties of inner products.

THEOREM 6.1.1

Properties of Inner Products
If u, v, and w are vectors in a real inner product space, and k is any scalar, then

(a) ⟨0, v⟩ = ⟨v, 0⟩ = 0
(b) ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩
(c) ⟨u, kv⟩ = k⟨u, v⟩
(d) ⟨u − v, w⟩ = ⟨u, w⟩ − ⟨v, w⟩
(e) ⟨u, v − w⟩ = ⟨u, v⟩ − ⟨u, w⟩

Proof We shall prove part (b) and leave the proofs of the remaining parts as exercises. By the symmetry and additivity axioms, ⟨u, v + w⟩ = ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩ = ⟨u, v⟩ + ⟨u, w⟩.

The following example illustrates how Theorem 6.1.1 and the defining properties of inner products can be used to perform
algebraic computations with inner products. As you read through the example, you will find it instructive to justify the steps.

EXAMPLE 11 Calculating with Inner Products

Since Theorem 6.1.1 is a general result, it is guaranteed to hold for all real inner product spaces. This is the real power of the
axiomatic development of vector spaces and inner products—a single theorem proves a multitude of results at once. For example,
we are guaranteed without any further proof that the five properties given in Theorem 6.1.1 are true for the inner product ⟨u, v⟩ = Au · Av on Rⁿ generated by any invertible n×n matrix A [Formula 3]. For example, let us check part (b) of Theorem 6.1.1 for this inner product:

⟨u, v + w⟩ = Au · A(v + w) = Au · (Av + Aw) = Au · Av + Au · Aw = ⟨u, v⟩ + ⟨u, w⟩

The reader will find it instructive to check the remaining parts of Theorem 6.1.1 for this inner product.
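As a complement to the algebraic check above, here is a small numerical sketch of part (b) for an inner product generated by a matrix. The matrix A and the vectors below are arbitrary illustrative choices, not data from the text.

```python
# Numerical check of Theorem 6.1.1(b) for the inner product <u, v> = Au · Av
# generated by a matrix A (Formula 3). Any invertible A works; this one is
# an illustrative choice.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def ip(u, v):
    """Inner product on R^2 generated by A:  <u, v> = (Au) · (Av)."""
    return np.dot(A @ u, A @ v)

u = np.array([1.0, -2.0])
v = np.array([3.0,  0.5])
w = np.array([-1.0, 4.0])

lhs = ip(u, v + w)
rhs = ip(u, v) + ip(u, w)
print(np.isclose(lhs, rhs))   # True: <u, v + w> = <u, v> + <u, w>
```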

Exercise Set 6.1


Let be the Euclidean inner product on , and let ,, , and . Verify that
1.

(a)

(b)

(c)

(d)

(e)

Repeat Exercise 1 for the weighted Euclidean inner product .
2.

Compute using the inner product in Example 7.
3.

(a)

(b)

Compute using the inner product in Example 8.
4.

(a)

(b)

5. is the inner product on generated by
(a) Use Formula 3 to show that

(b) Use the inner product in part (a) to compute if and .
is the inner product on generated by
6.
(a) Use Formula 3 to show that

(b) Use the inner product in part (a) to compute if and .

Let and . In each part, the given expression is an inner product on . Find a matrix that generates it.
7.

(a)

(b)

Let and . Show that the following are inner products on by verifying that the inner product axioms
8. hold.

(a)

(b)

9. Let and . Determine which of the following are inner products on . For those that are not, list the axioms that do not hold.

(a)
(b)

(c)
(d)

In each part, use the given inner product on to find , where .
10.

(a) the Euclidean inner product

(b) the weighted Euclidean inner product , where and

(c) the inner product generated by the matrix

Use the inner products in Exercise 10 to find for and .
11.

Let have the inner product in Example 8. In each part, find
12.

(a)

(b)

Let have the inner product in Example 7. In each part, find .
13.

(a)

(b)

Let have the inner product in Example 8. Find .
14.

Let have the inner product in Example 7. Find .
15.

(a)

(b)

Suppose that u, v, and w are vectors such that
16.

Evaluate the given expression.

(a)
(b)
(c)
(d)
(e)
(f)

17. (For Readers Who Have Studied Calculus)
Let the vector space have the inner product

(a) Find for , , .

(b) Find if and .

Sketch the unit circle in using the given inner product.
18.

(a)

(b)

Find a weighted Euclidean inner product on for which the unit circle is the ellipse shown in the accompanying figure.
19.

Figure Ex-19

Show that the following identity holds for vectors in any inner product space.
20.

Show that the following identity holds for vectors in any inner product space.
21.

22. Let and . Show that is not an inner product on .

Let and be polynomials in . Show that
23.

is an inner product on . Is this an inner product on ? Explain.

Prove: If is the Euclidean inner product on , and if A is an matrix, then
24.

Hint Use the fact that .

Verify the result in Exercise 24 for the Euclidean inner product on and
25.

26. Let u = (u₁, u₂, …, uₙ) and v = (v₁, v₂, …, vₙ). Show that

⟨u, v⟩ = w₁u₁v₁ + w₂u₂v₂ + ⋯ + wₙuₙvₙ

is an inner product on Rⁿ if w₁, w₂, …, wₙ are positive real numbers.

27. (For Readers Who Have Studied Calculus)
Use the inner product

to compute , for the vectors and in .

(a)
(b)

28. (For Readers Who Have Studied Calculus)
In each part, use the inner product

to compute , for the vectors and in .

(a)
(b)
(c)

Show that the inner product in Example 7 can be written as .
29.

Prove that Formula 3 defines an inner product on .
30.

Hint Use the alternative version of Formula 3 given by 4.

Show that matrix 5 generates the weighted Euclidean inner product .
31.

The following is a proof of part (c) of Theorem 6.1.1. Fill in each blank line with the name of an
32. inner product axiom that justifies the step.

Hypothesis: Let u and v be vectors in a real inner product space.

Conclusion: ⟨u, kv⟩ = k⟨u, v⟩.

Proof:

1. _________

2. _________

3. _________

Prove parts (a), (d ), and (e) of Theorem 6.1.1, justifying each step with the name of a vector space
33. axiom or by referring to previously established results.

Create a weighted Euclidean inner product on for which the unit circle

34. in an -coordinate system is the ellipse shown in the accompanying figure.

Figure Ex-34
Generalize the result of Problem 34 for an ellipse with semimajor axis a and semiminor axis b, with
35. a and b positive.


6.2 ANGLE AND ORTHOGONALITY IN INNER PRODUCT SPACES

In this section we shall define the notion of an angle between two vectors in an inner product space, and we shall use this concept to obtain some basic relations between vectors in an inner product space, including a fundamental geometric relationship between the nullspace and column space of a matrix.

Cauchy–Schwarz Inequality

Recall from Formula 1 of Section 3.3 that if u and v are nonzero vectors in R² or R³ and θ is the angle between them, then

u · v = ‖u‖ ‖v‖ cos θ    (1)

or, alternatively,

cos θ = (u · v) / (‖u‖ ‖v‖)    (2)

Our first goal in this section is to define the concept of an angle between two vectors in a general inner product space. For such a definition to be reasonable, we would want it to be consistent with Formula 2 when it is applied to the special case of R² and R³ with the Euclidean inner product. Thus we will want our definition of the angle θ between two nonzero vectors u and v in an inner product space to satisfy the relationship

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)    (3)

However, because |cos θ| ≤ 1, there would be no hope of satisfying 3 unless we were assured that every pair of nonzero vectors in an inner product space satisfies the inequality

|⟨u, v⟩| / (‖u‖ ‖v‖) ≤ 1

Fortunately, we will be able to prove that this is the case by using the following generalization of the Cauchy–Schwarz
inequality (see Theorem 4.1.3).

THEOREM 6.2.1

Cauchy–Schwarz Inequality
If u and v are vectors in a real inner product space, then

|⟨u, v⟩| ≤ ‖u‖ ‖v‖    (4)

Proof We warn the reader in advance that the proof presented here depends on a clever trick that is not easy to motivate. If u = 0, then ⟨u, v⟩ = ‖u‖ ‖v‖ = 0, so the two sides of 4 are equal. Assume now that u ≠ 0. Let a = ⟨u, u⟩, b = 2⟨u, v⟩, and c = ⟨v, v⟩, and let t be any real number. By the positivity axiom, the inner product of any vector with itself is always nonnegative. Therefore,

0 ≤ ⟨tu + v, tu + v⟩ = ⟨u, u⟩t² + 2⟨u, v⟩t + ⟨v, v⟩ = at² + bt + c

This inequality implies that the quadratic polynomial at² + bt + c has either no real roots or a repeated real root. Therefore, its discriminant must satisfy the inequality b² − 4ac ≤ 0. Expressing the coefficients a, b, and c in terms of the vectors u and v gives 4⟨u, v⟩² − 4⟨u, u⟩⟨v, v⟩ ≤ 0, or, equivalently,

⟨u, v⟩² ≤ ⟨u, u⟩⟨v, v⟩

Taking square roots of both sides and using the fact that ⟨u, u⟩ and ⟨v, v⟩ are nonnegative yields

|⟨u, v⟩| ≤ ⟨u, u⟩^(1/2) ⟨v, v⟩^(1/2)    or, equivalently,    |⟨u, v⟩| ≤ ‖u‖ ‖v‖

which completes the proof.
For reference, we note that the Cauchy–Schwarz inequality can be written in the following two alternative forms:

⟨u, v⟩² ≤ ⟨u, u⟩ ⟨v, v⟩    (5)

⟨u, v⟩² ≤ ‖u‖² ‖v‖²    (6)

The first of these formulas was obtained in the proof of Theorem 6.2.1, and the second is derived from the first using the fact that ⟨u, u⟩ = ‖u‖² and ⟨v, v⟩ = ‖v‖².

EXAMPLE 1 Cauchy–Schwarz Inequality in Rⁿ

The Cauchy–Schwarz inequality for Rⁿ (Theorem 4.1.3) follows as a special case of Theorem 6.2.1 by taking ⟨u, v⟩ to be the Euclidean inner product u · v.
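A quick numerical experiment can make the inequality concrete. The sketch below checks Formula 4 for the Euclidean inner product on random vectors; the dimension and the random seed are arbitrary choices.

```python
# A quick numerical check of the Cauchy–Schwarz inequality (Formula 4)
# for the Euclidean inner product on R^4, using random vectors.
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    u = rng.normal(size=4)
    v = rng.normal(size=4)
    lhs = abs(np.dot(u, v))                        # |<u, v>|
    rhs = np.linalg.norm(u) * np.linalg.norm(v)    # ||u|| ||v||
    print(lhs <= rhs + 1e-12)                      # True every time
```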

The next two theorems show that the basic properties of length and distance that were established in Theorems 4.1.4 and 4.1.5
for vectors in Euclidean n-space continue to hold in general inner product spaces. This is strong evidence that our definitions of
inner product, length, and distance are well chosen.

THEOREM 6.2.2

Properties of Length
If u and v are vectors in an inner product space V, and if k is any scalar, then

(a) ‖u‖ ≥ 0

(b) ‖u‖ = 0 if and only if u = 0

(c) ‖ku‖ = |k| ‖u‖

(d) ‖u + v‖ ≤ ‖u‖ + ‖v‖    (Triangle inequality)

THEOREM 6.2.3
Properties of Distance
If u, v, and w are vectors in an inner product space V, and if k is any scalar, then
(a) d(u, v) ≥ 0
(b) d(u, v) = 0 if and only if u = v
(c) d(u, v) = d(v, u)
(d) d(u, v) ≤ d(u, w) + d(w, v)    (Triangle inequality)

We shall prove part (d) of Theorem 6.2.2 and leave the remaining parts of Theorems 6.2.2 and 6.2.3 as exercises.

Proof of Theorem 6.2.2d By definition,

‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩
         ≤ ⟨u, u⟩ + 2|⟨u, v⟩| + ⟨v, v⟩
         ≤ ⟨u, u⟩ + 2‖u‖ ‖v‖ + ⟨v, v⟩    (by the Cauchy–Schwarz inequality)
         = ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)²

Taking square roots gives ‖u + v‖ ≤ ‖u‖ + ‖v‖.

Angle Between Vectors

We shall now show how the Cauchy–Schwarz inequality can be used to define angles in general inner product spaces. Suppose that u and v are nonzero vectors in an inner product space V. If we divide both sides of Formula 6 by ‖u‖² ‖v‖², we obtain

( ⟨u, v⟩ / (‖u‖ ‖v‖) )² ≤ 1

or, equivalently,

−1 ≤ ⟨u, v⟩ / (‖u‖ ‖v‖) ≤ 1    (7)

Now if θ is an angle whose radian measure varies from 0 to π, then cos θ assumes every value between −1 and 1 inclusive exactly once (Figure 6.2.1).

Figure 6.2.1

Thus, from 7, there is a unique angle θ such that

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)   and   0 ≤ θ ≤ π    (8)

We define θ to be the angle between u and v. Observe that in R² or R³ with the Euclidean inner product, 8 agrees with the usual formula for the cosine of the angle between two nonzero vectors [Formula 2].
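The following short Python sketch computes the angle defined by Formula 8 for the Euclidean inner product. The two vectors are illustrative choices, and the clip call only guards against round-off error at the endpoints.

```python
# Computing the angle defined by Formula (8) for the Euclidean inner product.
import numpy as np

def angle(u, v):
    """Return the angle (in radians) between nonzero vectors u and v."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))   # clip guards against round-off

u = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 1.0, 1.0])
print(np.degrees(angle(u, v)))   # 60.0
```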

EXAMPLE 2 Cosine of the Angle Between Two Vectors

Let R⁴ have the Euclidean inner product. Find the cosine of the angle θ between the vectors u and v

.

Solution

We leave it for the reader to verify that

so that

Orthogonality

Example 2 is primarily a mathematical exercise, for there is relatively little need to find angles between vectors, except in R² and R³ with the Euclidean inner product. However, a problem of major importance in all inner product spaces is to determine whether two vectors are orthogonal—that is, whether the angle between them is π/2.

It follows from 8 that if u and v are nonzero vectors in an inner product space and θ is the angle between them, then θ = π/2 if and only if cos θ = 0. Equivalently, for nonzero vectors we have θ = π/2 if and only if ⟨u, v⟩ = 0. If we agree to consider the angle between u and v to be π/2 when either or both of these vectors is 0, then we can state without exception that the angle between u and v is π/2 if and only if ⟨u, v⟩ = 0. This suggests the following definition.

DEFINITION

Two vectors u and v in an inner product space are called orthogonal if ⟨u, v⟩ = 0.

Observe that in the special case where ⟨u, v⟩ = u · v is the Euclidean inner product on Rⁿ, this definition reduces to the definition of orthogonality in Euclidean n-space given in Section 4.1. We also emphasize that orthogonality depends on the inner product; two vectors can be orthogonal with respect to one inner product but not another.

EXAMPLE 3 Orthogonal Vectors in
If has the inner product of Example 7 in the preceding section, then the matrices

are orthogonal, since

Calculus Required

EXAMPLE 4 Orthogonal Vectors in
Let have the inner product

and let and . Then

Because , the vectors and are orthogonal relative to the given inner product.

In Section 4.1 we proved the Theorem of Pythagoras for vectors in Euclidean n-space. The following theorem extends this
result to vectors in any inner product space.

THEOREM 6.2.4

Generalized Theorem of Pythagoras
If u and v are orthogonal vectors in an inner product space, then

‖u + v‖² = ‖u‖² + ‖v‖²

Proof The orthogonality of u and v implies that ⟨u, v⟩ = 0, so

‖u + v‖² = ⟨u + v, u + v⟩ = ‖u‖² + 2⟨u, v⟩ + ‖v‖² = ‖u‖² + ‖v‖²
Calculus Required

EXAMPLE 5 Theorem of Pythagoras in

In Example 4 we showed that and are orthogonal relative to the inner product

on . It follows from the Theorem of Pythagoras that
Thus, from the computations in Example 4, we have

We can check this result by direct integration:
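The displayed integrals are not reproduced here, but a check of this kind is easy to carry out numerically. The sketch below assumes the functions p(x) = x and q(x) = x² on the interval [−1, 1]; these are illustrative assumptions (orthogonal because x³ is an odd function) and are not necessarily the functions used in Example 4.

```python
# Numerical illustration of the Generalized Theorem of Pythagoras for the
# integral inner product, assuming p(x) = x, q(x) = x**2 on [-1, 1].
import numpy as np
from scipy.integrate import quad

a, b = -1.0, 1.0
p = lambda x: x
q = lambda x: x**2

def inner(f, g):
    value, _ = quad(lambda x: f(x) * g(x), a, b)
    return value

print(np.isclose(inner(p, q), 0.0))                        # p and q are orthogonal
lhs = inner(lambda x: p(x) + q(x), lambda x: p(x) + q(x))  # ||p + q||^2
rhs = inner(p, p) + inner(q, q)                            # ||p||^2 + ||q||^2
print(np.isclose(lhs, rhs))                                # True
```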

Orthogonal Complements

If V is a plane through the origin of with the Euclidean inner product, then the set of all vectors that are orthogonal to every
vector in V forms the line L through the origin that is perpendicular to V (Figure 6.2.2). In the language of linear algebra we say
that the line and the plane are orthogonal complements of one another. The following definition extends this concept to general
inner product spaces.

Figure 6.2.2
Every vector in L is orthogonal to every vector in V.

DEFINITION

Let W be a subspace of an inner product space V. A vector u in V is said to be orthogonal to W if it is orthogonal to every
vector in W, and the set of all vectors in V that are orthogonal to W is called the orthogonal complement of W.

Recall from geometry that the symbol ⊥ is used to indicate perpendicularity. In linear algebra the orthogonal complement of a subspace W is denoted by W⊥ (read “W perp”). The following theorem lists the basic properties of orthogonal complements.

THEOREM 6.2.5

Properties of Orthogonal Complements
If W is a subspace of a finite-dimensional inner product space V, then

(a) W⊥ is a subspace of V.

(b) The only vector common to W and W⊥ is 0.

(c) The orthogonal complement of W⊥ is W; that is, (W⊥)⊥ = W.

We shall prove parts (a) and (b). The proof of (c) requires results covered later in this chapter, so its proof is left for the
exercises at the end of the chapter.

Proof (a) Note first that ⟨0, w⟩ = 0 for every vector w in W, so W⊥ contains at least the zero vector. We want to show that W⊥ is closed under addition and scalar multiplication; that is, we want to show that the sum of two vectors in W⊥ is orthogonal to every vector in W and that any scalar multiple of a vector in W⊥ is orthogonal to every vector in W. Let u and v be any vectors in W⊥, let k be any scalar, and let w be any vector in W. Then, from the definition of W⊥, we have ⟨u, w⟩ = 0 and ⟨v, w⟩ = 0. Using basic properties of the inner product, we have

⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ = 0 + 0 = 0   and   ⟨ku, w⟩ = k⟨u, w⟩ = k(0) = 0

which proves that u + v and ku are in W⊥.

Proof (b) If v is common to W and W⊥, then ⟨v, v⟩ = 0, which implies that v = 0 by Axiom 4 for inner products.

Remark Because W and W⊥ are orthogonal complements of one another by part (c) of the preceding theorem, we shall say that W and W⊥ are orthogonal complements.

A Geometric Link between Nullspace and Row Space

The following fundamental theorem provides a geometric link between the nullspace and row space of a matrix.

THEOREM 6.2.6

If A is an m×n matrix, then

(a) The nullspace of A and the row space of A are orthogonal complements in Rⁿ with respect to the Euclidean inner
product.

(b) The nullspace of Aᵀ and the column space of A are orthogonal complements in Rᵐ with respect to the Euclidean
inner product.

Proof (a) We want to show that the orthogonal complement of the row space of A is the nullspace of A. To do this, we must show that if a vector v is orthogonal to every vector in the row space, then Av = 0, and conversely, that if Av = 0, then v is orthogonal to every vector in the row space.

Assume first that v is orthogonal to every vector in the row space of A. Then in particular, v is orthogonal to the row vectors r₁, r₂, …, r_m of A; that is,

r₁ · v = 0,  r₂ · v = 0,  …,  r_m · v = 0    (9)

But by Formula 11 of Section 4.1, the linear system Ax = 0 can be expressed in dot product notation as

r₁ · x = 0,  r₂ · x = 0,  …,  r_m · x = 0    (10)

so it follows from 9 that v is a solution of this system and hence lies in the nullspace of A.

Conversely, assume that v is a vector in the nullspace of A, so Av = 0. It follows from 10 that

r₁ · v = 0,  r₂ · v = 0,  …,  r_m · v = 0

But if r is any vector in the row space of A, then r is expressible as a linear combination of the row vectors of A, say

r = c₁r₁ + c₂r₂ + ⋯ + c_m r_m

Thus

r · v = c₁(r₁ · v) + c₂(r₂ · v) + ⋯ + c_m(r_m · v) = 0

which proves that v is orthogonal to every vector in the row space of A.

Proof (b) Since the column space of A is the row space of Aᵀ (except for a difference in notation), the proof follows by
applying the result in part (a) to Aᵀ.

The following example shows how Theorem 6.2.6 can be used to find a basis for the orthogonal complement of a subspace of
Euclidean n-space.

EXAMPLE 6 Basis for an Orthogonal Complement
Let W be the subspace of spanned by the vectors

Find a basis for the orthogonal complement of W.

Solution

The space W spanned by , , , and is the same as the row space of the matrix

and by part (a) of Theorem 6.2.6, the nullspace of A is the orthogonal complement of W. In Example 4 of Section 5.5 we
showed that

form a basis for this nullspace. Expressing these vectors in the same notation as , , , and , we conclude that the
vectors
form a basis for the orthogonal complement of W. As a check, the reader may want to verify that and are orthogonal to
, , , and by calculating the necessary dot products.
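Computationally, Theorem 6.2.6a reduces the problem of finding a basis for W⊥ to a nullspace computation. The sketch below illustrates this in Python; the spanning vectors w₁ and w₂ are hypothetical, not those of Example 6.

```python
# Finding a basis for the orthogonal complement of W = span{w1, w2} in R^4
# by computing the nullspace of the matrix whose rows span W (Theorem 6.2.6a).
import numpy as np
from scipy.linalg import null_space

w1 = [1.0, 2.0, 0.0, -1.0]    # illustrative spanning vectors
w2 = [0.0, 1.0, 1.0,  3.0]
A = np.array([w1, w2])        # W is the row space of A

N = null_space(A)             # columns form an orthonormal basis of W-perp
print(N.shape)                # (4, 2): two basis vectors for W-perp
print(np.allclose(A @ N, 0))  # True: each basis vector is orthogonal to W
```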

Summary

We leave it for the reader to show that in any inner product space V, the zero space {0} and the entire space V are orthogonal complements. Thus, if A is an n×n matrix, to say that Ax = 0 has only the trivial solution is equivalent to saying that the orthogonal complement of the nullspace of A is all of Rⁿ, or, equivalently, that the row space of A is all of Rⁿ. This enables us to add two new results to the seventeen listed in Theorem 5.6.9.

THEOREM 6.2.7

Equivalent Statements

If A is an n×n matrix, and if T_A: Rⁿ → Rⁿ is multiplication by A, then the following are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row-echelon form of A is Iₙ.

(d) A is expressible as a product of elementary matrices.
(e) Ax = b is consistent for every n×1 matrix b.
(f) Ax = b has exactly one solution for every n×1 matrix b.
(g) det(A) ≠ 0.

(h) The range of T_A is Rⁿ.

(i) T_A is one-to-one.

(j) The column vectors of A are linearly independent.
(k) The row vectors of A are linearly independent.
(l) The column vectors of A span Rⁿ.
(m) The row vectors of A span Rⁿ.
(n) The column vectors of A form a basis for Rⁿ.

(o) The row vectors of A form a basis for Rⁿ.
(p) A has rank n.
(q) A has nullity 0.
(r) The orthogonal complement of the nullspace of A is Rⁿ.
(s) The orthogonal complement of the row space of A is {0}.

This theorem relates all of the major topics we have studied thus far.

Exercise Set 6.2


In each part, determine whether the given vectors are orthogonal with respect to the Euclidean inner product.
1.

(a) ,

(b) ,

(c) ,

(d) ,

(e) ,

(f) ,

2. Do there exist scalars k, l such that the vectors , , and are mutually orthogonal with respect to the Euclidean inner product?

3. Let have the Euclidean inner product. Let and . If , what is k?

4. Let have the Euclidean inner product, and let . Determine whether the vector u is orthogonal to the subspace spanned by the vectors , , and .

Let , , and have the Euclidean inner product. In each part, find the cosine of the angle between u and v.
5.

(a) ,

(b) ,

(c) ,

(d) ,

(e) ,

(f) ,

Let have the inner product in Example 8 of Section 6.1. Find the cosine of the angle between p and q.
6.

(a) ,

(b) ,

Show that and are orthogonal with respect to the inner product in Exercise 6.
7.

Let have the inner product in Example 7 of Section 6.1. Find the cosine of the angle between A and B.
8.

(a) ,

(b) ,

Let
9.

Which of the following matrices are orthogonal to A with respect to the inner product in Exercise 8?

(a)
(b)
(c)
(d)

Let have the Euclidean inner product. For which values of k are u and v orthogonal?
10.

(a) ,

(b) ,

Let have the Euclidean inner product. Find two unit vectors that are orthogonal to the three vectors ,
11. , and .

In each part, verify that the Cauchy–Schwarz inequality holds for the given vectors using the Euclidean inner product.
12.

(a) ,

(b) ,

(c) ,

(d) ,

In each part, verify that the Cauchy–Schwarz inequality holds for the given vectors.
13.

(a) and using the inner product of Example 2 of Section 6.1

(b) using the inner product in Example 7 of Section 6.1

(c) and using the inner product given in Example 8 of Section 6.1

Let W be the line in with equation . Find an equation for .
14.

15.
(a) Let W be the plane in with equation . Find parametric equations for W⊥.

(b) Let W be the line in with parametric equations . Find an equation for W⊥.

(c) Let W be the intersection of the two planes in . Find an equation for W⊥.
Let
16.

(a) Find bases for the row space and nullspace of A.

(b) Verify that every vector in the row space is orthogonal to every vector in the nullspace (as guaranteed by Theorem
6.2.6a).

Let A be the matrix in Exercise 16.
17.

(a) Find bases for the column space of A and nullspace of .

(b) Verify that every vector in the column space of A is orthogonal to every vector in the nullspace of (as
guaranteed by Theorem 6.2.6b).

Find a basis for the orthogonal complement of the subspace of spanned by the vectors.
18.

(a) , ,

(b) ,

(c) , ,

(d) , , ,

Let V be an inner product space. Show that if u and v are orthogonal unit vectors in V, then .
19.

Let V be an inner product space. Show that if w is orthogonal to both and , it is orthogonal to for all

20. scalars and . Interpret this result geometrically in the case where V is with the Euclidean inner product.

Let V be an inner product space. Show that if w is orthogonal to each of the vectors , , …, , then it is orthogonal to

21. every vector in span .

Let be a basis for an inner product space V. Show that the zero vector is the only vector in V that is
22. orthogonal to all of the basis vectors.

Let be a basis for a subspace W of V. Show that consists of all vectors in V that are orthogonal to
23. every basis vector.

Prove the following generalization of Theorem 6.2.4. If , , …, are pairwise orthogonal vectors in an inner product
24. space V, then

Prove the following parts of Theorem 6.2.2:
25.

(a) part (a)

(b) part (b)

(c) part (c)

Prove the following parts of Theorem 6.2.3:
26.

(a) part (a)

(b) part (b)

(c) part (c)

(d) part (d )

Prove: If u and v are matrices and A is an matrix, then
27.

Use the Cauchy–Schwarz inequality to prove that for all real values of a, b, and ,
28.

Prove: If , , …, are positive real numbers and if and are any two vectors in
29. , then

Show that equality holds in the Cauchy–Schwarz inequality if and only if u and v are linearly dependent.
30.

Use vector methods to prove that a triangle that is inscribed in a circle so that it has a diameter for a side must be a right
31. triangle.

Figure Ex-31
Hint Express the vectors and in the accompanying figure in terms of u and v.

32. With respect to the Euclidean inner product, the vectors u and v have norm 2, and the angle between them is 60° (see the accompanying figure). Find a weighted Euclidean inner product with respect to which u and v are orthogonal unit vectors.

Figure Ex-32

33. (For Readers Who Have Studied Calculus)
Let and be continuous functions on [0, 1]. Prove:

(a)

(b)

Hint Use the Cauchy–Schwarz inequality.

34. (For Readers Who Have Studied Calculus)
Let have the inner product

and let . Show that if , then and are orthogonal with respect to the given inner
product.

35.
(a) Let W be the line in an -coordinate system in . Describe the subspace W⊥.

(b) Let W be the y-axis in an -coordinate system in . Describe the subspace W⊥.

(c) Let W be the -plane of an -coordinate system in . Describe the subspace W⊥.

Let be a homogeneous system of three equations in the unknowns x, y, and z.
36.

(a) If the solution space is a line through the origin in , what kind of geometric object is
the row space of A? Explain your reasoning.

(b) If the column space of A is a line through the origin, what kind of geometric object is the

solution space of the homogeneous system ? Explain your reasoning.

(c) If the homogeneous system has a unique solution, what can you say about the

row space and column space of A? Explain your reasoning.

Indicate whether each statement is always true or sometimes false. Justify your answer by giving
37. a logical argument or a counterexample.

(a) If V is a subspace of and W is a subspace of V , then is a subspace of .
(b) for all vectors u, v, and w in an inner product space.
(c) If u is in the row space and the nullspace of a square matrix A, then .
(d) If u is in the row space and the column space of an matrix A, then .

Let have the inner product that was defined in Example 7
38.

of Section 6.1. Describe the orthogonal complement of
(a) the subspace of all diagonal matrices
(b) the subspace of symmetric matrices


6.3 ORTHONORMAL BASES; GRAM–SCHMIDT PROCESS; QR-DECOMPOSITION

In many problems involving vector spaces, the problem solver is free to choose any basis for the vector space that seems appropriate. In inner product spaces, the solution of a problem is often greatly simplified by choosing a basis in which the vectors are orthogonal to one another. In this section we shall show how such bases can be obtained.

DEFINITION

A set of vectors in an inner product space is called an orthogonal set if all pairs of distinct vectors in the set are
orthogonal. An orthogonal set in which each vector has norm 1 is called orthonormal.

EXAMPLE 1 An Orthogonal Set in R³

Let u₁, u₂, and u₃ be the given vectors, and assume that R³ has the Euclidean inner product. It follows that the set of vectors S = {u₁, u₂, u₃} is orthogonal since ⟨u₁, u₂⟩ = ⟨u₁, u₃⟩ = ⟨u₂, u₃⟩ = 0.

If v is a nonzero vector in an inner product space, then by part (c) of Theorem 6.2.2, the vector (1/‖v‖)v has norm 1, since

‖(1/‖v‖)v‖ = |1/‖v‖| ‖v‖ = (1/‖v‖)‖v‖ = 1

The process of multiplying a nonzero vector v by the reciprocal of its length to obtain a unit vector is called normalizing v.
An orthogonal set of nonzero vectors can always be converted to an orthonormal set by normalizing each of its vectors.
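The following Python sketch carries out this normalization for a small orthogonal set; the three vectors are an illustrative choice.

```python
# Normalizing an orthogonal set to produce an orthonormal set.
import numpy as np

u1 = np.array([0.0, 1.0, 0.0])
u2 = np.array([1.0, 0.0, 1.0])
u3 = np.array([1.0, 0.0, -1.0])   # a mutually orthogonal set (illustrative choice)

v1, v2, v3 = (u / np.linalg.norm(u) for u in (u1, u2, u3))   # normalize each vector

print(np.linalg.norm(v1), np.linalg.norm(v2), np.linalg.norm(v3))   # 1.0 1.0 1.0
print(np.dot(v1, v2), np.dot(v1, v3), np.dot(v2, v3))               # 0.0 0.0 0.0
```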

EXAMPLE 2 Constructing an Orthonormal Set
The Euclidean norms of the vectors in Example 1 are
Consequently, normalizing , , and yields

We leave it for you to verify that the set is orthonormal by showing that

In an inner product space, a basis consisting of orthonormal vectors is called an orthonormal basis, and a basis consisting of
orthogonal vectors is called an orthogonal basis. A familiar example of an orthonormal basis is the standard basis for R³ with the Euclidean inner product:

i = (1, 0, 0),   j = (0, 1, 0),   k = (0, 0, 1)

This is the basis that is associated with rectangular coordinate systems (see Figure 5.4.4). More generally, in Rⁿ with the Euclidean inner product, the standard basis

e₁ = (1, 0, 0, …, 0),   e₂ = (0, 1, 0, …, 0),   …,   eₙ = (0, 0, 0, …, 1)

is orthonormal.

Coordinates Relative to Orthonormal Bases

The interest in finding orthonormal bases for inner product spaces is motivated in part by the following theorem, which
shows that it is exceptionally simple to express a vector in terms of an orthonormal basis.

THEOREM 6.3.1

If S = {v₁, v₂, …, vₙ} is an orthonormal basis for an inner product space V, and u is any vector in V, then

u = ⟨u, v₁⟩v₁ + ⟨u, v₂⟩v₂ + ⋯ + ⟨u, vₙ⟩vₙ

Proof Since S = {v₁, v₂, …, vₙ} is a basis, a vector u can be expressed in the form

u = k₁v₁ + k₂v₂ + ⋯ + kₙvₙ

We shall complete the proof by showing that kᵢ = ⟨u, vᵢ⟩ for i = 1, 2, …, n. For each vector vᵢ in S, we have

⟨u, vᵢ⟩ = ⟨k₁v₁ + k₂v₂ + ⋯ + kₙvₙ, vᵢ⟩ = k₁⟨v₁, vᵢ⟩ + k₂⟨v₂, vᵢ⟩ + ⋯ + kₙ⟨vₙ, vᵢ⟩

Since S = {v₁, v₂, …, vₙ} is an orthonormal set, we have

⟨vᵢ, vᵢ⟩ = 1   and   ⟨vⱼ, vᵢ⟩ = 0  if  j ≠ i

Therefore, the above expression for ⟨u, vᵢ⟩ simplifies to ⟨u, vᵢ⟩ = kᵢ.

Using the terminology and notation introduced in Section 5.4, the scalars

⟨u, v₁⟩, ⟨u, v₂⟩, …, ⟨u, vₙ⟩

in Theorem 6.3.1 are the coordinates of the vector u relative to the orthonormal basis S = {v₁, v₂, …, vₙ}, and

(u)_S = (⟨u, v₁⟩, ⟨u, v₂⟩, …, ⟨u, vₙ⟩)

is the coordinate vector of u relative to this basis.

EXAMPLE 3 Coordinate Vector Relative to an Orthonormal Basis
Let

It is easy to check that is an orthonormal basis for with the Euclidean inner product. Express the vector

as a linear combination of the vectors in S, and find the coordinate vector .

Solution

Therefore, by Theorem 6.3.1 we have
that is,

The coordinate vector of u relative to S is

Remark The usefulness of Theorem 6.3.1 should be evident from this example if we remember that for nonorthonormal
bases, it is usually necessary to solve a system of equations in order to express a vector in terms of the basis.
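The point of the remark can be seen in a few lines of Python: with an orthonormal basis, each coordinate is a single dot product, and no linear system needs to be solved. The basis and the vector u below are illustrative choices.

```python
# Theorem 6.3.1 in action: coordinates relative to an orthonormal basis of R^3
# are obtained directly as inner products.
import numpy as np

v1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
v2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
v3 = np.array([0.0, 0.0, 1.0])          # {v1, v2, v3} is an orthonormal basis
u  = np.array([2.0, 3.0, -1.0])

coords = np.array([np.dot(u, v) for v in (v1, v2, v3)])   # the coordinate vector (u)_S
print(coords)
print(np.allclose(coords[0]*v1 + coords[1]*v2 + coords[2]*v3, u))   # True
```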

Orthonormal bases for inner product spaces are convenient because, as the following theorem shows, many familiar formulas
hold for such bases.

THEOREM 6.3.2

If S is an orthonormal basis for an n-dimensional inner product space, and if

(u)_S = (u₁, u₂, …, uₙ)   and   (v)_S = (v₁, v₂, …, vₙ)

then

(a) ‖u‖ = √(u₁² + u₂² + ⋯ + uₙ²)

(b) d(u, v) = √((u₁ − v₁)² + (u₂ − v₂)² + ⋯ + (uₙ − vₙ)²)

(c) ⟨u, v⟩ = u₁v₁ + u₂v₂ + ⋯ + uₙvₙ

The proof is left for the exercises.

Remark Observe that the right side of the equality in part (a) is the norm of the coordinate vector with respect to the
Euclidean inner product on , and the right side of the equality in part (c) is the Euclidean inner product of and .
Thus, by working with orthonormal bases, we can reduce the computation of general norms and inner products to the
computation of Euclidean norms and inner products of the coordinate vectors.

EXAMPLE 4 Calculating Norms Using Orthonormal Bases

If R³ has the Euclidean inner product, then the norm of the vector u is

However, if we let have the orthonormal basis S in the last example, then we know from that example that the coordinate
vector of u relative to S is

The norm of u can also be calculated from this vector using part (a) of Theorem 6.3.2. This yields

Coordinates Relative to Orthogonal Bases

If S = {v₁, v₂, …, vₙ} is an orthogonal basis for a vector space V, then normalizing each of these vectors yields the orthonormal basis

S′ = { v₁/‖v₁‖, v₂/‖v₂‖, …, vₙ/‖vₙ‖ }

Thus, if u is any vector in V, it follows from Theorem 6.3.1 that

u = ⟨u, v₁/‖v₁‖⟩ (v₁/‖v₁‖) + ⟨u, v₂/‖v₂‖⟩ (v₂/‖v₂‖) + ⋯ + ⟨u, vₙ/‖vₙ‖⟩ (vₙ/‖vₙ‖)

which, by part (c) of Theorem 6.1.1, can be rewritten as

u = (⟨u, v₁⟩/‖v₁‖²) v₁ + (⟨u, v₂⟩/‖v₂‖²) v₂ + ⋯ + (⟨u, vₙ⟩/‖vₙ‖²) vₙ    (1)

This formula expresses u as a linear combination of the vectors in the orthogonal basis S. Some problems requiring the use of
this formula are given in the exercises.

It is self-evident that if , , and are three nonzero, mutually perpendicular vectors in , then none of these vectors lies
in the same plane as the other two; that is, the vectors are linearly independent. The following theorem generalizes this result.

THEOREM 6.3.3
If is an orthogonal set of nonzero vectors in an inner product space, then S is linearly independent.

Proof Assume that

k₁v₁ + k₂v₂ + ⋯ + kₙvₙ = 0    (2)

To demonstrate that S = {v₁, v₂, …, vₙ} is linearly independent, we must prove that k₁ = k₂ = ⋯ = kₙ = 0.

For each vᵢ in S, it follows from 2 that

⟨k₁v₁ + k₂v₂ + ⋯ + kₙvₙ, vᵢ⟩ = ⟨0, vᵢ⟩ = 0

or, equivalently,

k₁⟨v₁, vᵢ⟩ + k₂⟨v₂, vᵢ⟩ + ⋯ + kₙ⟨vₙ, vᵢ⟩ = 0

From the orthogonality of S it follows that ⟨vⱼ, vᵢ⟩ = 0 when j ≠ i, so this equation reduces to

kᵢ⟨vᵢ, vᵢ⟩ = 0

Since the vectors in S are assumed to be nonzero, ⟨vᵢ, vᵢ⟩ ≠ 0 by the positivity axiom for inner products. Therefore, kᵢ = 0.
Since the subscript i is arbitrary, we have k₁ = k₂ = ⋯ = kₙ = 0; thus S is linearly independent.

EXAMPLE 5 Using Theorem 6.3.3
In Example 2 we showed that the vectors

form an orthonormal set with respect to the Euclidean inner product on R³. By Theorem 6.3.3, these vectors form a linearly independent set, and since R³ is three-dimensional, S = {v₁, v₂, v₃} is an orthonormal basis for R³ by Theorem 5.4.5.

Orthogonal Projections

We shall now develop some results that will help us to construct orthogonal and orthonormal bases for inner product spaces.

In R² or R³ with the Euclidean inner product, it is evident geometrically that if W is a line or a plane through the origin, then each vector u in the space can be expressed as a sum

u = w₁ + w₂

where w₁ is in W and w₂ is perpendicular to W (Figure 6.3.1). This result is a special case of the following general theorem, whose proof is given at the end of this section.
whose proof is given at the end of this section.

Figure 6.3.1

THEOREM 6.3.4

Projection Theorem

If W is a finite-dimensional subspace of an inner product space V, then every vector u in V can be expressed in exactly one way as

u = w₁ + w₂    (3)

where w₁ is in W and w₂ is in W⊥.

The vector w₁ in the preceding theorem is called the orthogonal projection of u on W and is denoted by proj_W u. The vector w₂ is called the component of u orthogonal to W and is denoted by proj_{W⊥} u. Thus Formula 3 in the Projection Theorem can be expressed as

u = proj_W u + proj_{W⊥} u    (4)

Since w₂ = u − w₁, it follows that proj_{W⊥} u = u − proj_W u, so Formula 4 can also be written as

u = proj_W u + (u − proj_W u)    (5)

(Figure 6.3.2).

Figure 6.3.2

The following theorem, whose proof is requested in the exercises, provides formulas for calculating orthogonal projections.
THEOREM 6.3.5

Let W be a finite-dimensional subspace of an inner product space V.

(a) If {v₁, v₂, …, vᵣ} is an orthonormal basis for W, and u is any vector in V, then

proj_W u = ⟨u, v₁⟩v₁ + ⟨u, v₂⟩v₂ + ⋯ + ⟨u, vᵣ⟩vᵣ    (6)

(b) If {v₁, v₂, …, vᵣ} is an orthogonal basis for W, and u is any vector in V, then

proj_W u = (⟨u, v₁⟩/‖v₁‖²) v₁ + (⟨u, v₂⟩/‖v₂‖²) v₂ + ⋯ + (⟨u, vᵣ⟩/‖vᵣ‖²) vᵣ    (7)

EXAMPLE 6 Calculating Projections

Let have the Euclidean inner product, and let W be the subspace spanned by the orthonormal vectors and

. From 6 the orthogonal projection of on W is

The component of u orthogonal to W is

Observe that is orthogonal to both and , so this vector is orthogonal to each vector in the space W spanned by

and , as it should be.
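A projection of this kind is easy to compute once an orthonormal basis for W is in hand. The sketch below applies Formula 6 to illustrative vectors (not those of Example 6) and confirms that the leftover component is orthogonal to W.

```python
# Formula (6): projecting u onto the subspace W spanned by an orthonormal set.
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)   # {v1, v2} is orthonormal (illustrative)
u  = np.array([1.0, 2.0, 3.0])

proj_W = np.dot(u, v1) * v1 + np.dot(u, v2) * v2   # proj_W u
perp   = u - proj_W                                # component of u orthogonal to W
print(proj_W, perp)
print(np.isclose(np.dot(perp, v1), 0.0), np.isclose(np.dot(perp, v2), 0.0))  # True True
```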

Finding Orthogonal and Orthonormal Bases

We have seen that orthonormal bases exhibit a variety of useful properties. Our next theorem, which is the main result in this
section, shows that every nonzero finite-dimensional vector space has an orthonormal basis. The proof of this result is
extremely important, since it provides an algorithm, or method, for converting an arbitrary basis into an orthonormal basis.

THEOREM 6.3.6

Every nonzero finite-dimensional inner product space has an orthonormal basis.

Proof Let V be any nonzero finite-dimensional inner product space, and suppose that is any basis for V. It

suffices to show that V has an orthogonal basis, since the vectors in the orthogonal basis can be normalized to produce an

orthonormal basis for V. The following sequence of steps will produce an orthogonal basis for V.

Jörgen Pederson Gram (1850–1916) was a Danish actuary. Gram's early education was at village schools supplemented
by private tutoring. After graduating from high school, he obtained a master's degree in mathematics with specialization in
the newly developing modern algebra. Gram then took a position as an actuary for the Hafnia Life Insurance Company,
where he developed mathematical foundations of accident insurance for the company Skjold. He served on the Board of
Directors of Hafnia and directed Skjold until 1910, at which time he became director of the Danish Insurance Board.
During his employ as an actuary, he earned a Ph.D. based on his dissertation “On Series Development Utilizing the Least
Squares Method.” It was in this thesis that his contributions to the Gram– Schmidt process were first formulated. Gram
eventually became interested in abstract number theory and won a gold medal from the Royal Danish Society of Sciences
and Letters for his contributions to that field. However, he also had a lifelong interest in the interplay between theoretical
and applied mathematics that led to four treatises on Danish forest management. Gram was killed one evening in a bicycle
collision on the way to a meeting of the Royal Danish Society.

Step 1. Let v₁ = u₁.

Step 2. As illustrated in Figure 6.3.3, we can obtain a vector v₂ that is orthogonal to v₁ by computing the component of u₂ that is orthogonal to the space W₁ spanned by v₁. We use Formula 7:

v₂ = u₂ − proj_{W₁} u₂ = u₂ − (⟨u₂, v₁⟩/‖v₁‖²) v₁

Figure 6.3.3

Of course, if v₂ = 0, then v₂ cannot be used as a basis vector. But this cannot happen, since it would then follow from the preceding formula for v₂ that u₂ is a scalar multiple of u₁, contradicting the linear independence of the basis S = {u₁, u₂, …, uₙ}.

Step 3. To construct a vector v₃ that is orthogonal to both v₁ and v₂, we compute the component of u₃ orthogonal to the space W₂ spanned by v₁ and v₂:

v₃ = u₃ − proj_{W₂} u₃ = u₃ − (⟨u₃, v₁⟩/‖v₁‖²) v₁ − (⟨u₃, v₂⟩/‖v₂‖²) v₂

As in Step 2, the linear independence of {u₁, u₂, u₃} ensures that v₃ ≠ 0. Continuing in this way, after n steps we obtain an orthogonal set of nonzero vectors {v₁, v₂, …, vₙ}, which, by Theorem 6.3.3, is linearly independent and hence is an orthogonal basis for V.

The preceding step-by-step construction for converting an arbitrary basis into an orthogonal basis is called the
Gram–Schmidt process.
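For readers who want to experiment, the following Python function is a direct implementation of the process just described, with the normalization step folded in. The sample basis passed to it is an illustrative choice.

```python
# A straightforward implementation of the Gram–Schmidt process followed by
# normalization, for the Euclidean inner product on R^n.
import numpy as np

def gram_schmidt(basis):
    """Return an orthonormal list spanning the same space as `basis`
    (assumed linearly independent)."""
    ortho = []
    for u in basis:
        v = np.array(u, dtype=float)
        for w in ortho:                       # subtract projections onto earlier vectors
            v = v - np.dot(v, w) * w
        ortho.append(v / np.linalg.norm(v))   # normalize (nonzero by independence)
    return ortho

q = gram_schmidt([[1, 1, 1], [0, 1, 1], [0, 0, 1]])   # illustrative basis of R^3
Q = np.array(q)
print(np.round(Q, 4))
print(np.allclose(Q @ Q.T, np.eye(3)))   # True: the rows are orthonormal
```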

EXAMPLE 7 Using the Gram–Schmidt Process

Consider the vector space with the Euclidean inner product. Apply the Gram–Schmidt process to transform the basis

vectors , , into an orthogonal basis ; then normalize the orthogonal

basis vectors to obtain an orthonormal basis .

Solution

Step 1.

Step 2.

Step 3.

Thus
form an orthogonal basis for . The norms of these vectors are
so an orthonormal basis for is

Erhardt Schmidt (1876–1959) was a German mathematician. Schmidt received his doctoral degree from Göttingen
University in 1905, where he studied under one of the giants of mathematics, David Hilbert. He eventually went to teach
at Berlin University in 1917, where he stayed for the rest of his life. Schmidt made important contributions to a variety of
mathematical fields but is most noteworthy for fashioning many of Hilbert's diverse ideas into a general concept (called a
Hilbert space), which is fundamental in the study of infinite-dimensional vector spaces. Schmidt first described the
process that bears his name in a paper on integral equations published in 1907.

Remark In the preceding example we used the Gram–Schmidt process to produce an orthogonal basis; then, after the entire
orthogonal basis was obtained, we normalized to obtain an orthonormal basis. Alternatively, one can normalize each
orthogonal basis vector as soon as it is obtained, thereby generating the orthonormal basis step by step. However, this
method has the slight disadvantage of producing more square roots to manipulate.

The Gram–Schmidt process with subsequent normalization not only converts an arbitrary basis {u₁, u₂, …, uₙ} into an orthonormal basis {q₁, q₂, …, qₙ} but does it in such a way that for k ≥ 2 the following relationships hold:

{q₁, q₂, …, q_k} is an orthonormal basis for the space spanned by {u₁, u₂, …, u_k}.

q_k is orthogonal to the space spanned by {u₁, u₂, …, u_{k−1}}.

We omit the proofs, but these facts should become evident after some thoughtful examination of the proof of Theorem 6.3.6.

QR-Decomposition

We pose the following problem.

Problem If A is an m×n matrix with linearly independent column vectors, and if Q is the matrix with orthonormal column

vectors that results from applying the Gram–Schmidt process to the column vectors of A, what relationship, if any, exists

between A and Q?

To solve this problem, suppose that the column vectors of A are u₁, u₂, …, uₙ and the orthonormal column vectors of Q are q₁, q₂, …, qₙ; thus

A = [u₁ | u₂ | ⋯ | uₙ]   and   Q = [q₁ | q₂ | ⋯ | qₙ]

It follows from Theorem 6.3.1 that u₁, u₂, …, uₙ are expressible in terms of the vectors q₁, q₂, …, qₙ as

u₁ = ⟨u₁, q₁⟩q₁ + ⟨u₁, q₂⟩q₂ + ⋯ + ⟨u₁, qₙ⟩qₙ
u₂ = ⟨u₂, q₁⟩q₁ + ⟨u₂, q₂⟩q₂ + ⋯ + ⟨u₂, qₙ⟩qₙ
⋮
uₙ = ⟨uₙ, q₁⟩q₁ + ⟨uₙ, q₂⟩q₂ + ⋯ + ⟨uₙ, qₙ⟩qₙ

Recalling from Section 1.3 that the jth column vector of a matrix product is a linear combination of the column vectors of the first factor with coefficients coming from the jth column of the second factor, it follows that these relationships can be expressed in matrix form as A = QR, where R is the matrix whose (i, j) entry is ⟨uⱼ, qᵢ⟩, or more briefly as

A = QR    (8)

However, it is a property of the Gram–Schmidt process that for j ≥ 2, the vector qⱼ is orthogonal to u₁, u₂, …, u_{j−1}; thus, all entries below the main diagonal of R are zero, and R has the form

R =  ⟨u₁, q₁⟩  ⟨u₂, q₁⟩  ⋯  ⟨uₙ, q₁⟩
        0      ⟨u₂, q₂⟩  ⋯  ⟨uₙ, q₂⟩
        ⋮          ⋮              ⋮
        0          0     ⋯  ⟨uₙ, qₙ⟩    (9)

We leave it as an exercise to show that the diagonal entries of R are nonzero, so R is invertible. Thus Equation 8 is a
factorization of A into the product of a matrix Q with orthonormal column vectors and an invertible upper triangular matrix
R. We call Equation 8 the QR-decomposition of A. In summary, we have the following theorem.

THEOREM 6.3.7

QR-Decomposition

If A is an m×n matrix with linearly independent column vectors, then A can be factored as

A = QR

where Q is an m×n matrix with orthonormal column vectors, and R is an n×n invertible upper
triangular matrix.

Remark Recall from Theorem 6.2.7 that if A is an n×n matrix, then the invertibility of A is equivalent to linear
independence of the column vectors; thus, every invertible matrix has a QR-decomposition.

EXAMPLE 8 QR-Decomposition of a Matrix
Find the QR-decomposition of

Solution

The column vectors of A are

Applying the Gram–Schmidt process with subsequent normalization to these column vectors yields the orthonormal vectors
(see Example 7)

and from 9 the matrix R is

Thus the QR-decomposition of A is

The Role of the QR-Decomposition in Linear Algebra

In recent years the QR-decomposition has assumed growing importance as the mathematical foundation for a wide variety of
practical numerical algorithms, including a widely used algorithm for computing eigenvalues of large matrices. Such
algorithms are discussed in textbooks that deal with numerical linear algebra.
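In practice one rarely carries out the Gram–Schmidt construction by hand for this purpose. The sketch below uses numpy's built-in routine, which computes the factorization by Householder reflections rather than Gram–Schmidt, so its Q may differ from a Gram–Schmidt Q by the signs of its columns; the matrix A is an illustrative choice.

```python
# Computing a QR-decomposition numerically (Theorem 6.3.7).
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])   # illustrative matrix with independent columns

Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))            # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))  # columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))       # R is upper triangular
```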

Additional Proof

Proof of Theorem 6.3.4 There are two parts to the proof. First we must find vectors and with the stated properties,
and then we must show that these are the only such vectors.

By the Gram–Schmidt process, there is an orthonormal basis {v₁, v₂, …, vᵣ} for W.

Let

w₁ = ⟨u, v₁⟩v₁ + ⟨u, v₂⟩v₂ + ⋯ + ⟨u, vᵣ⟩vᵣ    (10)

and

w₂ = u − w₁    (11)

It follows that w₁ + w₂ = w₁ + (u − w₁) = u, so it remains to show that w₁ is in W and w₂ is orthogonal to W. But w₁ lies in W because it is a linear combination of the basis vectors for W. To show that w₂ is orthogonal to W, we must show that ⟨w₂, w⟩ = 0 for every vector w in W. But if w is any vector in W, it can be expressed as a linear combination

w = k₁v₁ + k₂v₂ + ⋯ + kᵣvᵣ

of the basis vectors v₁, v₂, …, vᵣ. Thus

⟨w₂, w⟩ = ⟨u − w₁, w⟩ = ⟨u, w⟩ − ⟨w₁, w⟩    (12)

But

⟨u, w⟩ = k₁⟨u, v₁⟩ + k₂⟨u, v₂⟩ + ⋯ + kᵣ⟨u, vᵣ⟩

and by part (c) of Theorem 6.3.2,

⟨w₁, w⟩ = k₁⟨u, v₁⟩ + k₂⟨u, v₂⟩ + ⋯ + kᵣ⟨u, vᵣ⟩

Thus ⟨u, w⟩ and ⟨w₁, w⟩ are equal, so 12 yields ⟨w₂, w⟩ = 0, which is what we want to show.

To see that 10 and 11 are the only vectors with the properties stated in the theorem, suppose that we can also write

u = w₁′ + w₂′    (13)

where w₁′ is in W and w₂′ is orthogonal to W. If we subtract from 13 the equation u = w₁ + w₂, we obtain

0 = (w₁′ − w₁) + (w₂′ − w₂)

or

w₁ − w₁′ = w₂′ − w₂    (14)

Since w₂ and w₂′ are orthogonal to W, their difference is also orthogonal to W, since for any vector w in W, we can write

⟨w, w₂′ − w₂⟩ = ⟨w, w₂′⟩ − ⟨w, w₂⟩ = 0 − 0 = 0

But w₂′ − w₂ is itself a vector in W, since from 14 it is the difference of the two vectors w₁ and w₁′ that lie in the subspace W.
Thus, w₂′ − w₂ must be orthogonal to itself; that is,

⟨w₂′ − w₂, w₂′ − w₂⟩ = 0

But this implies that w₂′ − w₂ = 0 by Axiom 4 for inner products. Thus w₂′ = w₂, and by 14, w₁′ = w₁.

Exercise Set 6.3


Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on
1.

(a) (0, 1), (2, 0)
(b) ,

(c) ,

(d) (0, 0), (0, 1)

Which of the sets in Exercise 1 are orthonormal with respect to the Euclidean inner product on ?
2.

Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on ?
3.

(a) ,
,

(b) , ,
(c) , (0, 0, 1)

(1, 0, 0),

(d)
,

Which of the sets in Exercise 3 are orthonormal with respect to the Euclidean inner product on ?
4.

Which of the following sets of polynomials are orthonormal with respect to the inner product on discussed in Example
5. 8 of Section 6.1?

(a) , ,

(b) 1, ,

Which of the following sets of matrices are orthonormal with respect to the inner product on discussed in Example 7
6. of Section 6.1?

(a)

(b)

Verify that the given set of vectors is orthogonal with respect to the Euclidean inner product; then convert it to an
7. orthonormal set by normalizing the vectors.

(a) , (6, 3)

(b) , (2, 0, 2), (0, 5, 0)

(c) , ,

Verify that the set of vectors {(1, 0), (0, 1)} is orthogonal with respect to the inner product on ;
8. then convert it to an orthonormal set by normalizing the vectors.

9. Verify that the vectors , , form an orthonormal basis for with the

Euclidean inner product; then use Theorem 6.3.1 to express each of the following as linear combinations of , , and .

(a)
(b)
(c)

Verify that the vectors
10.

form an orthogonal basis for with the Euclidean inner product; then use Formula 1 to express each of the following
linear combinations of , , , and .

(a) (1, 1, 1, 1)
(b)
(c)

In each part, an orthonormal basis relative to the Euclidean inner product is given. Use Theorem 6.3.1 to find the
11. coordinate vector of w with respect to that basis.

(a) ,
;

(b) ; , ,

12. Let have the Euclidean inner product, and let be the orthonormal basis with ,
.

(a) Find the vectors u and v that have coordinate vectors and .

(b) Compute , , and by applying Theorem 6.3.2 to the coordinate vectors and ; then check

the results by performing the computations directly on u and v.

13. Let have the Euclidean inner product, and let be the orthonormal basis with ,
, and .

(a) Find the vectors u, v, and w that have the coordinate vectors , , and
.

(b) Compute , , and by applying Theorem 6.3.2 to the coordinate vectors , , and ;

then check the results by performing the computations directly on u, v, and w.

In each part, S represents some orthonormal basis for a four-dimensional inner product space. Use the given information
14. to find , , , and .

(a) , ,

(b) , ,

15. , and
(a) Show that the vectors , ,

form an orthogonal basis for with the Euclidean inner product.

(b) Use 1 to express as a linear combination of the vectors in part (a).

Let have the Euclidean inner product. Use the Gram–Schmidt process to transform the basis into an
16. orthonormal basis. Draw both sets of basis vectors in the -plane.

(a) , into an

(b) ,

Let have the Euclidean inner product. Use the Gram–Schmidt process to transform the basis
17. orthonormal basis.

(a) , ,
(b) , ,

Let have the Euclidean inner product. Use the Gram–Schmidt process to transform the basis into an
18. orthonormal basis.

Let have the Euclidean inner product. Find an orthonormal basis for the subspace spanned by (0, 1, 2), (−1, 0, 1), (−1,
19. 1, 3).

Let have the inner product . Use the Gram–Schmidt process to transform
20. , , into an orthonormal basis.

21. The subspace of spanned by the vectors and is a plane passing through the origin.

Express in the form , where lies in the plane and is perpendicular to the plane.

Repeat Exercise 21 with and .
22.

Let have the Euclidean inner product. Express in the form , where is in the space
, and is orthogonal to W.
23. W spanned by and

Find the QR-decomposition of the matrix, where possible.
24.

(a)

(b)

(c)

(d)

(e)

(f)

Let be an orthonormal basis for an inner product space V. Show that if w is a vector in V, then
25. .

Let be an orthonormal basis for an inner product space V. Show that if w is a vector in V, then
26. .

In Step 3 of the proof of Theorem 6.3.6, it was stated that “the linear independence of ensures that
27. .” Prove this statement.

Prove that the diagonal entries of R in Formula 9 are nonzero.
28.

29. (For Readers Who Have Studied Calculus)
Let the vector space have the inner product

Apply the Gram–Schmidt process to transform the standard basis into an orthonormal basis. (The

polynomials in the resulting basis are called the first three normalized Legendre polynomials.)

30. (For Readers Who Have Studied Calculus)

Use Theorem 6.3.1 to express the following as linear combinations of the first three normalized Legendre polynomials
(Exercise 29).

(a)
(b)
(c)

31. (For Readers Who Have Studied Calculus)
Let have the inner product

Apply the Gram–Schmidt process to transform the standard basis into an orthonormal basis.

Prove Theorem 6.3.2.
32.

Prove Theorem 6.3.5.
33.

34.
(a) It follows from Theorem 6.3.6 that every plane through the origin in must have an

orthonormal basis with respect to the Euclidean inner product. In words, explain how
you would go about finding an orthonormal basis for a plane if you knew its equation.

(b) Use your method to find an orthonormal basis for the plane .

Find vectors x and y in that are orthonormal with respect to the inner product
35. but are not orthonormal with respect to the Euclidean inner product.

If W is a line through the origin of with the Euclidean inner product, and if u is a vector in

36. , then Theorem 6.3.4 implies that u can be expressed uniquely as , where is a

vector in W and is a vector in . Draw a picture that illustrates this.

Indicate whether each statement is always true or sometimes false. Justify your answer by
37. giving a logical argument or a counterexample.

(a) A linearly dependent set of vectors in an inner product space cannot be orthonormal.

(b) Every finite-dimensional vector space has an orthonormal basis.

(c) is orthogonal to in any inner product space.

(d) Every matrix with a nonzero determinant has a QR-decomposition.

What happens if you apply the Gram–Schmidt process to a linearly dependent set of vectors?
38.


6.4 BEST APPROXIMATION; LEAST SQUARES

In this section we shall show how orthogonal projections can be used to solve certain approximation problems. The results obtained in this section have a wide variety of applications in both mathematics and science.

Orthogonal Projections Viewed as Approximations

If P is a point in ordinary 3-space and W is a plane through the origin, then the point Q in W that is closest to P can be obtained by dropping a perpendicular from P to W (Figure 6.4.1a). Therefore, if we let u denote the vector from the origin to P, then the distance between P and W is given by

‖u − proj_W u‖

In other words, among all vectors w in W, the vector w = proj_W u minimizes the distance ‖u − w‖ (Figure 6.4.1b).

Figure 6.4.1

There is another way of thinking about this idea. View u as a fixed vector that we would like to approximate by a vector in W. Any such approximation w will result in an “error vector”

u − w

that, unless u is in W, cannot be made equal to 0. However, by choosing

w = proj_W u

we can make the length of the error vector

‖u − w‖ = ‖u − proj_W u‖

as small as possible. Thus we can describe proj_W u as the “best approximation” to u by vectors in W. The following theorem will make these intuitive ideas precise.

THEOREM 6.4.1

Best Approximation Theorem

If W is a finite-dimensional subspace of an inner product space V, and if u is a vector in V, then proj_W u is the best approximation to u from W in the sense that

‖u − proj_W u‖ < ‖u − w‖

for every vector w in W that is different from proj_W u.

Proof For every vector w in W, we can write

u − w = (u − proj_W u) + (proj_W u − w)    (1)

But proj_W u − w, being a difference of vectors in W, is in W; and u − proj_W u is orthogonal to W, so the two terms on the right side of 1 are orthogonal. Thus, by the Theorem of Pythagoras (Theorem 6.2.4),

‖u − w‖² = ‖u − proj_W u‖² + ‖proj_W u − w‖²

If w ≠ proj_W u, then the second term in this sum will be positive, so

‖u − proj_W u‖² < ‖u − w‖²

or, equivalently,

‖u − proj_W u‖ < ‖u − w‖
Applications of this theorem will be given later in the text.

Least Squares Solutions of Linear Systems

Up to now we have been concerned primarily with consistent systems of linear equations. However, inconsistent linear systems are also important in physical applications. It is a common situation that some physical problem leads to a linear system Ax = b that should be consistent on theoretical grounds but fails to be so because “measurement errors” in the entries of A and b perturb the system enough to cause inconsistency. In such situations one looks for a value of x that comes “as close as possible” to being a solution in the sense that it minimizes the value of ‖Ax − b‖ with respect to the Euclidean inner product. The quantity ‖Ax − b‖ can be viewed as a measure of the “error” that results from regarding x as an approximate solution of the linear system Ax = b. If the system is consistent and x is an exact solution, then the error is zero, since ‖Ax − b‖ = 0. In general, the larger the value of ‖Ax − b‖, the more poorly x serves as an approximate solution of the system.

Least Squares Problem Given a linear system Ax = b of m equations in n unknowns, find a vector x, if possible, that minimizes ‖Ax − b‖ with respect to the Euclidean inner product on Rᵐ. Such a vector is called a least squares solution of Ax = b.

Remark To understand the origin of the term least squares, let e = Ax − b, which we can view as the error vector that results from the approximation x. If e = (e₁, e₂, …, e_m), then a least squares solution minimizes ‖e‖ = (e₁² + e₂² + ⋯ + e_m²)^(1/2); hence it also minimizes ‖e‖² = e₁² + e₂² + ⋯ + e_m². Hence the term least squares.

To solve the least squares problem, let W be the column space of A. For each n×1 matrix x, the product Ax is a linear combination of the column vectors of A. Thus, as x varies over Rⁿ, the vector Ax varies over all possible linear combinations of the column vectors of A; that is, Ax varies over the entire column space W. Geometrically, solving the least squares problem amounts to finding a vector x in Rⁿ such that Ax is the closest vector in W to b (Figure 6.4.2).

Figure 6.4.2
A least squares solution x produces the vector Ax in W closest to b.

It follows from the Best Approximation Theorem (6.4.1) that the closest vector in W to b is the orthogonal projection of b on W. Thus, for a vector x to be a least squares solution of Ax = b, this vector must satisfy

Ax = proj_W b    (2)

One could attempt to find least squares solutions of Ax = b by first calculating the vector proj_W b and then solving 2; however, there is a better approach. It follows from the Projection Theorem (6.3.4) and Formula 5 of Section 6.3 that

b − Ax = b − proj_W b

is orthogonal to W. But W is the column space of A, so it follows from Theorem 6.2.6 that b − Ax lies in the nullspace of Aᵀ.
Therefore, a least squares solution of Ax = b must satisfy

Aᵀ(b − Ax) = 0

or, equivalently,

AᵀAx = Aᵀb    (3)

This is called the normal system associated with Ax = b, and the individual equations are called the normal equations associated with Ax = b. Thus the problem of finding a least squares solution of Ax = b has been reduced to the problem of finding an exact solution of the associated normal system.

Note the following observations about the normal system:

The normal system involves n equations in n unknowns (verify).

The normal system is consistent, since it is satisfied by a least squares solution of Ax = b.

The normal system may have infinitely many solutions, in which case all of its solutions are least squares solutions of
Ax = b.

From these observations and Formula 2, we have the following theorem.
THEOREM 6.4.2

For any linear system Ax = b, the associated normal system

AᵀAx = Aᵀb

is consistent, and all solutions of the normal system are least squares solutions of Ax = b.
Moreover, if W is the column space of A, and x is any least squares solution of Ax = b, then the
orthogonal projection of b on W is

proj_W b = Ax

Uniqueness of Least Squares Solutions

Before we examine some numerical examples, we shall establish conditions under which a linear system is guaranteed to
have a unique least squares solution. We shall need the following theorem.
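In the meantime, Theorem 6.4.2 can already be put to computational use. The following Python sketch solves a small inconsistent system by forming the normal system of Formula 3 and compares the result with numpy's built-in least squares routine; the data are illustrative.

```python
# Solving a least squares problem two ways: via the normal system A^T A x = A^T b
# (Formula 3) and with numpy's built-in routine. The system below is inconsistent.
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([6.0, 0.0, 0.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ b)     # exact solution of the normal system
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # built-in least squares solver

print(x_normal, x_lstsq)                # both give the same least squares solution
print(np.allclose(x_normal, x_lstsq))   # True
```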

