
Elementary Linear Algebra with Applications, Anton, 9th edition

Since the mean square error ∫ₐᵇ [f(x) − g(x)]² dx and ‖f − g‖ are minimized by the same function g, the preceding problem is equivalent to looking for a function g in W that is closest to f. But we know from Theorem 6.4.1 that g = proj_W f is such a function (Figure 9.4.3). Thus we have the following result.

Figure 9.4.3

Solution of the Least Squares Approximation Problem If f is a continuous function on [a, b], and W is a finite-dimensional subspace of C[a, b], then the function g in W that minimizes the mean square error

    ∫ₐᵇ [f(x) − g(x)]² dx

is g = proj_W f, where the orthogonal projection is relative to the inner product

    ⟨f, g⟩ = ∫ₐᵇ f(x)g(x) dx

The function g = proj_W f is called the least squares approximation to f from W.

Fourier Series

A function of the form

    c₀ + c₁ cos x + c₂ cos 2x + ⋯ + cₙ cos nx + d₁ sin x + d₂ sin 2x + ⋯ + dₙ sin nx        (2)

is called a trigonometric polynomial; if cₙ and dₙ are not both zero, then the polynomial is said to have order n. For example,

is a trigonometric polynomial with

The order of this polynomial is 4.

It is evident from 2 that the trigonometric polynomials of order n or less are the various possible linear combinations of

    1, cos x, cos 2x, …, cos nx, sin x, sin 2x, …, sin nx        (3)

It can be shown that these functions are linearly independent and that consequently, for any interval [a, b], they form a basis for a (2n + 1)-dimensional subspace of C[a, b].

Let us now consider the problem of finding the least squares approximation of a continuous function f(x) over the interval [0, 2π] by a trigonometric polynomial of order n or less. As noted above, the least squares approximation to f from W is the orthogonal projection of f on W. To find this orthogonal projection, we must find an orthonormal basis g₀, g₁, …, g₂ₙ for W, after which we can compute the orthogonal projection on W from the formula

    proj_W f = ⟨f, g₀⟩g₀ + ⟨f, g₁⟩g₁ + ⋯ + ⟨f, g₂ₙ⟩g₂ₙ        (4)

[see Theorem 6.3.5]. An orthonormal basis for W can be obtained by applying the Gram–Schmidt process to the basis 3, using the inner product

    ⟨f, g⟩ = ∫₀^{2π} f(x)g(x) dx

This yields (Exercise 6) the orthonormal basis

    g₀ = 1/√(2π),  g₁ = (1/√π) cos x, …, gₙ = (1/√π) cos nx,  gₙ₊₁ = (1/√π) sin x, …, g₂ₙ = (1/√π) sin nx        (5)

If we introduce the notation a₀, a₁, …, aₙ, b₁, …, bₙ for the appropriately scaled inner products of f with these basis functions, then on substituting 5 in 4, we obtain

    proj_W f = a₀/2 + [a₁ cos x + ⋯ + aₙ cos nx] + [b₁ sin x + ⋯ + bₙ sin nx]        (6)

where

    aₖ = (1/π) ∫₀^{2π} f(x) cos kx dx  (k = 0, 1, …, n),   bₖ = (1/π) ∫₀^{2π} f(x) sin kx dx  (k = 1, …, n)        (7)

In short,

    proj_W f = a₀/2 + (a₁ cos x + b₁ sin x) + ⋯ + (aₙ cos nx + bₙ sin nx)        (8)

The numbers a₀, a₁, …, aₙ, b₁, …, bₙ are called the Fourier coefficients of f.
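The integrals in 7 can also be evaluated numerically. The sketch below is not part of the text; it assumes the interval [0, 2π] used above and approximates the Fourier coefficients with scipy's numerical integrator, then evaluates the resulting trigonometric polynomial.

```python
import numpy as np
from scipy.integrate import quad

def fourier_coefficients(f, n):
    """Approximate the Fourier coefficients of f on [0, 2*pi] (formula 7)."""
    a0 = quad(f, 0.0, 2.0 * np.pi)[0] / np.pi
    a = [quad(lambda x, k=k: f(x) * np.cos(k * x), 0.0, 2.0 * np.pi)[0] / np.pi
         for k in range(1, n + 1)]
    b = [quad(lambda x, k=k: f(x) * np.sin(k * x), 0.0, 2.0 * np.pi)[0] / np.pi
         for k in range(1, n + 1)]
    return a0, a, b

def trig_poly(x, a0, a, b):
    """Evaluate a0/2 + sum of (a_k cos kx + b_k sin kx), as in formula 8."""
    result = a0 / 2.0
    for k, (ak, bk) in enumerate(zip(a, b), start=1):
        result += ak * np.cos(k * x) + bk * np.sin(k * x)
    return result

# Order-2 least squares approximation of f(x) = x on [0, 2*pi].
a0, a, b = fourier_coefficients(lambda x: x, 2)
print(a0, a, b)   # roughly 2*pi, [0, 0], [-2, -1]
```

Running it for f(x) = x reproduces the coefficients obtained by integration by parts in Example 1 below (a₀ ≈ 2π, aₖ ≈ 0, bₖ ≈ −2/k).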

EXAMPLE 1 Least Squares Approximations

Find the least squares approximation of f(x) = x on [0, 2π] by

(a) a trigonometric polynomial of order 2 or less;

(b) a trigonometric polynomial of order n or less.

Solution (a)

For k = 1, 2, …, integration by parts yields (verify)

    a₀ = (1/π) ∫₀^{2π} x dx = 2π        (9a)

    aₖ = (1/π) ∫₀^{2π} x cos kx dx = 0        (9b)

    bₖ = (1/π) ∫₀^{2π} x sin kx dx = −2/k        (9c)

Thus the least squares approximation to x on [0, 2π] by a trigonometric polynomial of order 2 or less is

    a₀/2 + a₁ cos x + a₂ cos 2x + b₁ sin x + b₂ sin 2x

or, from 9a, 9b, and 9c,

    π − 2 sin x − sin 2x

Solution (b)

The least squares approximation to x on [0, 2π] by a trigonometric polynomial of order n or less is

    a₀/2 + [a₁ cos x + ⋯ + aₙ cos nx] + [b₁ sin x + ⋯ + bₙ sin nx]

or, from (9a), (9b), and (9c),

    π − 2(sin x + (sin 2x)/2 + (sin 3x)/3 + ⋯ + (sin nx)/n)

The graphs of y = x and some of these approximations are shown in Figure 9.4.4.

Figure 9.4.4
It is natural to expect that the mean square error will diminish as the number of terms in the least squares approximation

    a₀/2 + a₁ cos x + b₁ sin x + ⋯ + aₙ cos nx + bₙ sin nx

increases. It can be proved that for functions f in C[0, 2π], the mean square error approaches zero as n → +∞; this is denoted by writing

    f(x) = a₀/2 + (a₁ cos x + b₁ sin x) + (a₂ cos 2x + b₂ sin 2x) + ⋯

The right side of this equation is called the Fourier series for f over the interval [0, 2π]. Such series are of major importance in engineering, science, and mathematics.
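This convergence claim is easy to check numerically. The following sketch (not from the text, and again assuming the interval [0, 2π]) computes the mean square error of the order-n least squares approximation to f(x) = x for increasing n:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: x

def partial_sum(x, n):
    # Fourier partial sum of order n for f on [0, 2*pi], computed numerically.
    s = quad(f, 0, 2 * np.pi)[0] / (2 * np.pi)          # a0 / 2
    for k in range(1, n + 1):
        ak = quad(lambda t: f(t) * np.cos(k * t), 0, 2 * np.pi)[0] / np.pi
        bk = quad(lambda t: f(t) * np.sin(k * t), 0, 2 * np.pi)[0] / np.pi
        s += ak * np.cos(k * x) + bk * np.sin(k * x)
    return s

for n in (1, 2, 4, 8, 16):
    mse = quad(lambda x: (f(x) - partial_sum(x, n)) ** 2, 0, 2 * np.pi)[0]
    print(n, mse)   # the mean square error decreases toward 0 as n grows
```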

Jean Baptiste Joseph Fourier (1768–1830) was a French mathematician and physicist who discovered the Fourier series
and related ideas while working on problems of heat diffusion. This discovery was one of the most influential in the
history of mathematics; it is the cornerstone of many fields of mathematical research and a basic tool in many branches of
engineering. Fourier, a political activist during the French revolution, spent time in jail for his defense of many victims
during the Terror. He later became a favorite of Napoleon and was named a baron.

Exercise Set 9.4


1. Find the least squares approximation of  over the interval  by

(a) a trigonometric polynomial of order 2 or less

(b) a trigonometric polynomial of order n or less

2. Find the least squares approximation of  over the interval  by

(a) a trigonometric polynomial of order 3 or less

(b) a trigonometric polynomial of order n or less

3.
(a) Find the least squares approximation of x over the interval [0, 1] by a function of the form

(b) Find the mean square error of the approximation.

4.
(a) Find the least squares approximation of  over the interval [0, 1] by a polynomial of the form

(b) Find the mean square error of the approximation.

5.
(a) Find the least squares approximation of  over the interval [−1, 1] by a polynomial of the form

(b) Find the mean square error of the approximation.

6. Use the Gram–Schmidt process to obtain the orthonormal basis 5 from the basis 3.

7. Carry out the integrations in 9a, 9b, and 9c.

8. Find the Fourier series of  over the interval .

9. Find the Fourier series of , and , over the interval .

10. What is the Fourier series of ?


9.5 QUADRATIC FORMS

In this section we shall study functions in which the terms are squares of variables or products of two variables. Such functions arise in a variety of applications, including geometry, vibrations of mechanical systems, statistics, and electrical engineering.

Quadratic Forms

Up to now, we have been interested primarily in linear equations—that is, in equations of the form

    a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b

The expression on the left side of this equation,

    a₁x₁ + a₂x₂ + ⋯ + aₙxₙ

is a function of n variables, called a linear form. In a linear form, all variables occur to the first power, and there are no products of variables in the expression. Here, we will be concerned with quadratic forms, which are functions of the form

    (1)

For example, the most general quadratic form in the variables x₁ and x₂ is

    a₁x₁² + a₂x₂² + a₃x₁x₂        (2)

and the most general quadratic form in the variables x₁, x₂, and x₃ is

    a₁x₁² + a₂x₂² + a₃x₃² + a₄x₁x₂ + a₅x₁x₃ + a₆x₂x₃        (3)

The terms in a quadratic form that involve products of different variables are called the cross-product terms. Thus, in 2 the
last term is a cross-product term, and in 3 the last three terms are cross-product terms.

If we follow the convention of omitting brackets on the resulting 1 × 1 matrices, then 2 can be written in matrix form as

    (4)

and 3 can be written as

    (5)

(verify by multiplying out). Note that the products in 4 and 5 are both of the form xᵀAx, where x is the column vector of variables, and A is a symmetric matrix whose diagonal entries are the coefficients of the squared terms and whose entries off the main diagonal are half the coefficients of the cross-product terms. More precisely, the diagonal entry in row i and column i is the coefficient of xᵢ², and the off-diagonal entry in row i and column j is half the coefficient of the product xᵢxⱼ. Here are some examples.
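This rule is easy to mechanize. The following sketch (the helper name and the sample quadratic form are illustrative assumptions, not the book's Example 1) builds the symmetric matrix A from the coefficients of a quadratic form and checks the identity with xᵀAx:

```python
import numpy as np

def quadratic_form_matrix(n, sq_coeffs, cross_coeffs):
    """Build the symmetric matrix A of a quadratic form.

    sq_coeffs[i] is the coefficient of x_i**2 (0-based indices);
    cross_coeffs[(i, j)] is the coefficient of x_i * x_j for i < j.
    """
    A = np.zeros((n, n))
    for i, c in sq_coeffs.items():
        A[i, i] = c
    for (i, j), c in cross_coeffs.items():
        A[i, j] = A[j, i] = c / 2.0   # half the cross-product coefficient
    return A

# Hypothetical form 2*x1^2 + 6*x1*x2 - 7*x2^2, just for illustration.
A = quadratic_form_matrix(2, {0: 2.0, 1: -7.0}, {(0, 1): 6.0})
x = np.array([1.0, 3.0])
print(float(x @ A @ x))    # equals 2*1 + 6*1*3 - 7*9 = -43
```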

EXAMPLE 1 Matrix Representation of Quadratic Forms

Symmetric matrices are useful, but not essential, for representing quadratic forms. For example, the quadratic form , which we represented in Example 1 as xᵀAx with a symmetric matrix A, can also be written as

where the coefficient 6 of the cross-product term has been split as  rather than , as in the symmetric representation.

However, symmetric matrices are usually more convenient to work with, so when we write a quadratic form as xᵀAx, it will always be understood, even if it is not stated explicitly, that A is symmetric. When convenient, we can use Formula 7 of Section 4.1 to express a quadratic form in terms of the Euclidean inner product as

    xᵀAx = ⟨x, Ax⟩ = ⟨Ax, x⟩

If preferred, we can use the notation x · y for the dot product and write these expressions as

    xᵀAx = x · Ax = Ax · x        (6)

Problems Involving Quadratic Forms

The study of quadratic forms is an extensive topic that we can only touch on in this section. The following are some of the
important mathematical problems that involve quadratic forms.

Find the maximum and minimum values of the quadratic form xᵀAx if x is constrained so that ‖x‖ = 1.

What conditions must A satisfy in order for a quadratic form to satisfy the inequality xᵀAx > 0 for all x ≠ 0?

If xᵀAx is a quadratic form in two or three variables and c is a constant, what does the graph of the equation xᵀAx = c look like?

If P is an orthogonal matrix, the change of variables x = Py converts the quadratic form xᵀAx to (Py)ᵀA(Py) = yᵀ(PᵀAP)y. But PᵀAP is a symmetric matrix if A is (verify), so yᵀ(PᵀAP)y is a new quadratic form in the variables of y. It is important to know whether P can be chosen such that this new quadratic form has no cross-product terms.

In this section we shall study the first two problems, and in the following sections we shall study the last two. The following
theorem provides a solution to the first problem. The proof is deferred to the end of this section.

THEOREM 9.5.1

Let A be a symmetric n × n matrix with eigenvalues λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ. If x is constrained so that ‖x‖ = 1, then

(a) λ₁ ≥ xᵀAx ≥ λₙ.

(b) xᵀAx = λ₁ if x is an eigenvector of A corresponding to λ₁, and xᵀAx = λₙ if x is an eigenvector of A corresponding to λₙ.

It follows from this theorem that subject to the constraint

    ‖x‖ = 1

the quadratic form xᵀAx has a maximum value of λ₁ (the largest eigenvalue) and a minimum value of λₙ (the smallest eigenvalue).

EXAMPLE 2 Consequences of Theorem 9.5.1
Find the maximum and minimum values of the quadratic form

subject to the constraint x₁² + x₂² = 1, and determine values of x₁ and x₂ at which the maximum and minimum occur.

Solution

The quadratic form can be written as

The characteristic equation of A is

Thus the eigenvalues of A are  and , which are the maximum and minimum values, respectively, of the quadratic form subject to the constraint. To find values of x₁ and x₂ at which these extreme values occur, we must find eigenvectors corresponding to these eigenvalues and then normalize these eigenvectors to satisfy the condition ‖x‖ = 1.

We leave it for the reader to show that bases for the eigenspaces are

Normalizing these eigenvectors yields

Thus, subject to the constraint x₁² + x₂² = 1, the maximum value of the quadratic form is , which occurs if , ; and the minimum value is , which occurs if , . Moreover, alternative bases for the eigenspaces can be obtained by multiplying the basis vectors above by −1. Thus the maximum value, , also occurs if , ; similarly, the minimum value, , also occurs if , .
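Numerically, the content of Theorem 9.5.1 is that the constrained extrema of xᵀAx are the extreme eigenvalues, attained at unit eigenvectors. A small sketch, with a hypothetical symmetric matrix standing in for the example's A (whose entries are not reproduced above):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 2.0]])          # hypothetical symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)   # eigh: symmetric matrices, eigenvalues ascending
lam_min, lam_max = eigvals[0], eigvals[-1]
x_min, x_max = eigvecs[:, 0], eigvecs[:, -1]   # unit eigenvectors, so ||x|| = 1

print("max of x^T A x on ||x|| = 1:", lam_max, "at", x_max)
print("min of x^T A x on ||x|| = 1:", lam_min, "at", x_min)
print("check:", x_max @ A @ x_max, x_min @ A @ x_min)   # equal the eigenvalues 6 and 1
```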

DEFINITION

A quadratic form xᵀAx is called positive definite if xᵀAx > 0 for all x ≠ 0, and a symmetric matrix A is called a positive definite matrix if xᵀAx is a positive definite quadratic form.

The following theorem is an important result about positive definite matrices.

THEOREM 9.5.2

A symmetric matrix A is positive definite if and only if all the eigenvalues of A are positive.

Proof Assume that A is positive definite, and let λ be any eigenvalue of A. If x is an eigenvector of A corresponding to λ, then x ≠ 0 and Ax = λx, so

    0 < xᵀAx = xᵀ(λx) = λxᵀx = λ‖x‖²        (7)

where ‖x‖ is the Euclidean norm of x. Since ‖x‖² > 0, it follows that λ > 0, which is what we wanted to show.

Conversely, assume that all eigenvalues of A are positive. We must show that xᵀAx > 0 for all x ≠ 0. But if x ≠ 0, we can normalize x to obtain the vector v = x/‖x‖ with the property ‖v‖ = 1. It now follows from Theorem 9.5.1 that

    vᵀAv ≥ λₙ > 0

where λₙ is the smallest eigenvalue of A. Thus,

    (1/‖x‖²) xᵀAx = (x/‖x‖)ᵀA(x/‖x‖) = vᵀAv > 0

Multiplying through by ‖x‖² yields xᵀAx > 0, which is what we wanted to show.

EXAMPLE 3 Showing That a Matrix Is Positive Definite

In Example 1 of Section 7.3, we showed that the symmetric matrix

has eigenvalues  and . Since these are positive, the matrix A is positive definite, and xᵀAx > 0 for all x ≠ 0.

Our next objective is to give a criterion that can be used to determine whether a symmetric matrix is positive definite without
finding its eigenvalues. To do this, it will be helpful to introduce some terminology. If

is a square matrix, then the principal submatrices of A are the submatrices formed from the first r rows and r columns of A
for r = 1, 2, …, n. These submatrices are

THEOREM 9.5.3
A symmetric matrix A is positive definite if and only if the determinant of every principal submatrix is positive.

We omit the proof.

EXAMPLE 4 Working with Principal Submatrices
The matrix

is positive definite since

all of which are positive. Thus we are guaranteed that all eigenvalues of A are positive and xᵀAx > 0 for all x ≠ 0.
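Both tests for positive definiteness, the eigenvalue criterion (Theorem 9.5.2) and the principal-submatrix criterion (Theorem 9.5.3), are straightforward to check numerically. A sketch with a hypothetical test matrix, not the matrix of Example 4:

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """Check positive definiteness two ways (Theorems 9.5.2 and 9.5.3)."""
    A = np.asarray(A, dtype=float)
    by_eigenvalues = bool(np.all(np.linalg.eigvalsh(A) > tol))
    by_principal_minors = all(
        np.linalg.det(A[:r, :r]) > tol for r in range(1, A.shape[0] + 1)
    )
    return by_eigenvalues, by_principal_minors

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])   # hypothetical symmetric matrix
print(is_positive_definite(A))       # (True, True): both criteria agree
```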

Remark A symmetric matrix A and the quadratic form xᵀAx are called

positive semidefinite if xᵀAx ≥ 0 for all x

negative definite if xᵀAx < 0 for x ≠ 0

negative semidefinite if xᵀAx ≤ 0 for all x

indefinite if xᵀAx has both positive and negative values

Theorems 9.5.2 and 9.5.3 can be modified in an obvious way to apply to matrices of the first three types.
For example, a symmetric matrix A is positive semidefinite if and only if all of its eigenvalues are nonnegative. Also, A is
positive semidefinite if and only if all its principal submatrices have nonnegative determinants.

Optional

Proof of Theorem 9.5.1a Since A is symmetric, it follows from Theorem 7.3.1 that there is an orthonormal basis for Rⁿ consisting of eigenvectors of A. Suppose that S = {v₁, v₂, …, vₙ} is such a basis, where vᵢ is the eigenvector corresponding to the eigenvalue λᵢ. If ⟨ , ⟩ denotes the Euclidean inner product, then it follows from Theorem 6.3.1 that for any x in Rⁿ,

Thus

It follows that the coordinate vectors for x and Ax relative to the basis S are

Thus, from Theorem 6.3.2c and the fact that ‖x‖ = 1, we obtain

Using these two equations and Formula 6, we can prove that xᵀAx ≤ λ₁ as follows:

The proof that xᵀAx ≥ λₙ is similar and is left as an exercise.

Proof of Theorem 9.5.1b If x is an eigenvector of A corresponding to λ₁ and ‖x‖ = 1, then

    xᵀAx = xᵀ(λ₁x) = λ₁xᵀx = λ₁‖x‖² = λ₁

Similarly, xᵀAx = λₙ if ‖x‖ = 1 and x is an eigenvector of A corresponding to λₙ.

Exercise Set 9.5


1. Which of the following are quadratic forms?

(a)

(b)

(c)

(d)

(e)
(f)

(g)
(h)

2. Express the following quadratic forms in the matrix notation xᵀAx, where A is a symmetric matrix.

(a)

(b)

(c)

(d)

3. Express the following quadratic forms in the matrix notation xᵀAx, where A is a symmetric matrix.

(a)

(b)

(c)
(d)

(e)

4. In each part, find a formula for the quadratic form that does not use matrices.

(a)

(b)

(c)
(d)

(e)

5. In each part, find the maximum and minimum values of the quadratic form subject to the constraint x₁² + x₂² = 1, and determine the values of x₁ and x₂ at which the maximum and minimum occur.

(a)
(b)
(c)
(d)

6. In each part, find the maximum and minimum values of the quadratic form subject to the constraint x₁² + x₂² + x₃² = 1, and determine the values of x₁, x₂, and x₃ at which the maximum and minimum occur.

(a)
(b)
(c)

7. Use Theorem 9.5.2 to determine which of the following matrices are positive definite.

(a)

(b)

(c)

8. Use Theorem 9.5.3 to determine which of the matrices in Exercise 7 are positive definite.

9. Use Theorem 9.5.2 to determine which of the following matrices are positive definite.

(a)

(b)

(c)

10. Use Theorem 9.5.3 to determine which of the matrices in Exercise 9 are positive definite.

11. In each part, classify the quadratic form as positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite.

(a)
(b)
(c)
(d)
(e)
(f)

12. In each part, classify the matrix as positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite.

(a)

(b)
(c)
(d)
(e)
(f)

13. Let q(x) = xᵀAx be a quadratic form in x₁, x₂, …, xₙ and define T: Rⁿ → R by T(x) = xᵀAx.

(a) Show that .

(b) Show that .

(c) Is T a linear transformation? Explain.

14. In each part, find all values of k for which the quadratic form is positive definite.

(a)

(b)

(c)

15. Express the quadratic form  in the matrix notation xᵀAx, where A is symmetric.

16. Let . In statistics, the quantity

is called the sample mean of x₁, x₂, …, xₙ, and

is called the sample variance.

(a) Express the quadratic form  in the matrix notation xᵀAx, where A is symmetric.

(b) Is  a positive definite quadratic form? Explain.

17. Complete the proof of Theorem 9.5.1 by showing that xᵀAx = λₙ if ‖x‖ = 1 and if x is an eigenvector of A corresponding to λₙ.

18. Indicate whether each statement is true (T) or false (F). Justify your answer.

(a) A symmetric matrix with positive entries is positive definite.

(b) is a quadratic form.

(c) is a quadratic form.

(d) A positive definite matrix is invertible.

(e) A symmetric matrix is positive definite, negative definite, or indefinite.

(f) If A is positive definite, then is negative definite.

19. Indicate whether each statement is true (T) or false (F). Justify your answer.

(a) If x is a vector in , then is a quadratic form.

(b) If is a positive definite quadratic form, then so is .

(c) If A is a matrix with positive eigenvalues, then is a positive definite quadratic
form.

(d) If A is a symmetric matrix with positive entries and a positive determinant, then A
is positive definite.

(e) If is a quadratic form with no cross-product terms, then A is a diagonal matrix.

(f) If is a positive definite quadratic form in x and y, and if , then the graph of

the equation is an ellipse.

20. What property must a symmetric matrix A have for xᵀAx = 1 to represent a circle?


9.6 DIAGONALIZING QUADRATIC FORMS; CONIC SECTIONS

In this section we shall show how to remove the cross-product terms from a quadratic form by changing variables, and we shall use our results to study the graphs of the conic sections.

Diagonalization of Quadratic Forms

Let

    xᵀAx        (1)

be a quadratic form, where A is a symmetric matrix. We know from Theorem 7.3.1 that there is an orthogonal matrix P that diagonalizes A; that is,

    PᵀAP = D = diag(λ₁, λ₂, …, λₙ)

where λ₁, λ₂, …, λₙ are the eigenvalues of A. If we let

    x = Py

and make the substitution in 1, then we obtain

    xᵀAx = (Py)ᵀA(Py) = yᵀPᵀAPy = yᵀDy

But

    yᵀDy = λ₁y₁² + λ₂y₂² + ⋯ + λₙyₙ²

which is a quadratic form with no cross-product terms.

In summary, we have the following result.

THEOREM 9.6.1

Let xᵀAx be a quadratic form in the variables x₁, x₂, …, xₙ, where A is symmetric. If P orthogonally diagonalizes A, and if the new variables y₁, y₂, …, yₙ are defined by the equation x = Py, then substituting this equation in xᵀAx yields

    xᵀAx = yᵀDy = λ₁y₁² + λ₂y₂² + ⋯ + λₙyₙ²

where λ₁, λ₂, …, λₙ are the eigenvalues of A and

    D = PᵀAP = diag(λ₁, λ₂, …, λₙ)

The matrix P in this theorem is said to orthogonally diagonalize the quadratic form or reduce the quadratic form to a sum of
squares.
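Numerically, the diagonalization in Theorem 9.6.1 can be carried out with any routine that orthogonally diagonalizes a symmetric matrix. The sketch below uses a hypothetical 2 × 2 matrix (not the one in Example 1, whose entries are not reproduced here) and verifies that the substitution x = Py removes the cross-product term:

```python
import numpy as np

A = np.array([[1.0,  2.0],
              [2.0, -2.0]])         # hypothetical symmetric matrix for x^T A x

eigvals, P = np.linalg.eigh(A)      # columns of P are orthonormal eigenvectors
D = np.diag(eigvals)

# P orthogonally diagonalizes A, so P^T A P = D.
print(np.allclose(P.T @ A @ P, D))                  # True

# Substituting x = P y turns x^T A x into y^T D y = sum(lambda_i * y_i^2).
y = np.array([0.7, -1.3])
x = P @ y
print(np.isclose(x @ A @ x, eigvals @ (y ** 2)))    # True
```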

EXAMPLE 1 Reducing a Quadratic Form to a Sum of Squares

Find a change of variables that will reduce the quadratic form  to a sum of squares, and express the quadratic form in terms of the new variables.

Solution

The quadratic form can be written as

The characteristic equation of the matrix is

so the eigenvalues are , , . We leave it for the reader to show that orthonormal bases for the three
eigenspaces are

Thus, a substitution that eliminates cross-product terms is

or, equivalently,
The new quadratic form is

or, equivalently,

Remark There are other methods for eliminating the cross-product terms from a quadratic form; we shall not discuss them
here. Two such methods, Lagrange's reduction and Kronecker's reduction, are discussed in more advanced books.

Conic Sections

We shall now apply our work on quadratic forms to the study of equations of the form

    ax² + 2bxy + cy² + dx + ey + f = 0        (2)

where a, b, …, f are real numbers, and at least one of the numbers a, b, c is not zero. An equation of this type is called a quadratic equation in x and y, and

    ax² + 2bxy + cy²

is called the associated quadratic form.

EXAMPLE 2 Coefficients in a Quadratic Equation
In the quadratic equation
the constants in 2 are

EXAMPLE 3 Examples of Associated Quadratic Forms

Quadratic Equation Associated Quadratic Form

Graphs of quadratic equations in x and y are called conics or conic sections. The most important conics are ellipses, circles,
hyperbolas, and parabolas; these are called the nondegenerate conics. The remaining conics are called degenerate and include

single points and pairs of lines (see Exercise 15).
A nondegenerate conic is said to be in standard position relative to the coordinate axes if its equation can be expressed in one
of the forms given in Figure 9.6.1.

Figure 9.6.1

EXAMPLE 4 Three Conics
From Figure 9.6.1, the equation

matches the form of an ellipse with and . Thus the ellipse is in standard position, intersecting the x-axis at (−2, 0)

and (2, 0) and intersecting the y-axis at (0, −3) and (0, 3).

The equation  can be rewritten as , which is of the form  with  and . Its graph is thus a hyperbola in standard position intersecting the y-axis at  and .

The equation can be rewritten as , which is of the form with . Since , its graph

is a parabola in standard position opening downward.

Significance of the Cross-Product Term

Observe that no conic in standard position has an xy-term (that is, a cross-product term) in its equation; the presence of an xy-term in the equation of a nondegenerate conic indicates that the conic is rotated out of standard position (Figure 9.6.2a). Also, no conic in standard position has both an x² and an x term or both a y² and a y term. If there is no cross-product term, the occurrence of either of these pairs in the equation of a nondegenerate conic indicates that the conic is translated out of standard position (Figure 9.6.2b). The occurrence of either of these pairs and a cross-product term usually indicates that the conic is both rotated and translated out of standard position (Figure 9.6.2c).

Figure 9.6.2

One technique for identifying the graph of a nondegenerate conic that is not in standard position consists of rotating and translating the xy-coordinate axes to obtain an x′y′-coordinate system relative to which the conic is in standard position. Once this is done, the equation of the conic in the x′y′-system will have one of the forms given in Figure 9.6.1 and can then easily be identified.

EXAMPLE 5 Completing the Square and Translating

Since the quadratic equation

contains x²-, x-, y²-, and y-terms but no cross-product term, its graph is a conic that is translated out of standard position but
not rotated. This conic can be brought into standard position by suitably translating coordinate axes. To do this, first collect
x-terms and y-terms. This yields

By completing the squares* on the two expressions in parentheses, we obtain (3)
or

If we translate the coordinate axes by means of the translation equations
then 3 becomes
which is the equation of an ellipse in standard position in the x′y′-system. This ellipse is sketched in Figure 9.6.3.

Figure 9.6.3

Eliminating the Cross-Product Term

We shall now show how to identify conics that are rotated out of standard position. If we omit the brackets on 1 × 1 matrices, then 2 can be written in the matrix form

or

where

Now consider a conic C whose equation in xy-coordinates is

    (4)

We would like to rotate the xy-coordinate axes so that the equation of the conic in the new x′y′-coordinate system has no cross-product term. This can be done as follows.

Step 1. Find a matrix

that orthogonally diagonalizes the matrix A.

Step 2. Interchange the columns of P, if necessary, to make det(P) = 1. This ensures that the orthogonal coordinate transformation

    (5)

is a rotation.

Step 3. To obtain the equation for C in the x′y′-system, substitute 5 into 4. This yields

or

    (6)

Since P orthogonally diagonalizes A,

where λ₁ and λ₂ are eigenvalues of A. Thus 6 can be rewritten as

or

(where and ). This equation has no cross-product term.

The following theorem summarizes this discussion.

THEOREM 9.6.2

Principal Axes Theorem for R²

Let

    ax² + 2bxy + cy² + dx + ey + f = 0

be the equation of a conic C, and let

    xᵀAx = ax² + 2bxy + cy²

be the associated quadratic form. Then the coordinate axes can be rotated so that the equation for C in the new x′y′-coordinate system has the form

    λ₁x′² + λ₂y′² + d′x′ + e′y′ + f = 0

where λ₁ and λ₂ are the eigenvalues of A. The rotation can be accomplished by the substitution

    x = Px′

where P orthogonally diagonalizes A and det(P) = 1.
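A computational sketch of the rotation step in this theorem follows; the helper name, the coefficient layout, and the sample conic 5x² − 4xy + 8y² − 36 = 0 are illustrative assumptions, not the book's examples:

```python
import numpy as np

def principal_axes_rotation(A, b):
    """Rotation step of the principal axes theorem (a sketch, not the book's notation).

    For a conic x^T A x + b^T x + f = 0, returns the eigenvalues (lam1, lam2),
    a rotation matrix P with det(P) = 1, and b' = P^T b, so that in the rotated
    coordinates the equation reads lam1*x'^2 + lam2*y'^2 + b'[0]*x' + b'[1]*y' + f = 0.
    """
    eigvals, P = np.linalg.eigh(np.asarray(A, dtype=float))
    if np.linalg.det(P) < 0:             # make the change of variables a rotation
        P = P[:, ::-1]
        eigvals = eigvals[::-1]
    return eigvals, P, P.T @ np.asarray(b, dtype=float)

# Hypothetical conic 5x^2 - 4xy + 8y^2 - 36 = 0, i.e. A = [[5, -2], [-2, 8]], no linear part.
lams, P, b_rot = principal_axes_rotation([[5.0, -2.0], [-2.0, 8.0]], [0.0, 0.0])
print(lams, b_rot)    # eigenvalues 4 and 9: the rotated equation is 4x'^2 + 9y'^2 = 36, an ellipse
```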

EXAMPLE 6 Eliminating the Cross-Product Term

Describe the conic C whose equation is .

Solution

The matrix form of this equation is

(7)

where

The characteristic equation of A is

so the eigenvalues of A are and . We leave it for the reader to show that orthonormal bases for the eigenspaces are

Thus

orthogonally diagonalizes A. Moreover, det(P) = 1, and thus the orthogonal coordinate transformation

    (8)

is a rotation. Substituting 8 into 7 yields

Since

this equation can be written as

or

which is the equation of the ellipse sketched in Figure 9.6.4. In that figure, the vectors and are the column vectors of
P—that is, the eigenvectors of A.

Figure 9.6.4

EXAMPLE 7 Eliminating the Cross-Product Term Plus Translation

Describe the conic C whose equation is

    (9)

Solution

The matrix form of this equation is

    (10)

where
As shown in Example 6,

orthogonally diagonalizes A and has determinant 1. Substituting into 9 gives
or

Since

10 can be written as

    (11)

To bring the conic into standard position, the x′y′ axes must be translated. Proceeding as in Example 5, we rewrite 11 as

Completing the squares yields

    (12)
or

If we translate the coordinate axes by means of the translation equations
then 12 becomes

which is the equation of the ellipse sketched in Figure 9.6.5. In that figure, the vectors and are the column vectors of P
—that is, the eigenvectors of A.

Figure 9.6.5

Exercise Set 9.6

1. In each part, find a change of variables that reduces the quadratic form to a sum or difference of squares, and express the quadratic form in terms of the new variables.

(a)
(b)

(c)

(d)

2. In each part, find a change of variables that reduces the quadratic form to a sum or difference of squares, and express the quadratic form in terms of the new variables.

(a)

(b)

(c)

(d)

3. Find the quadratic forms associated with the following quadratic equations.

(a)

(b)

(c)

(d)

(e)

4. Find the matrices of the quadratic forms in Exercise 3.

5. Express each of the quadratic equations in Exercise 3 in the matrix form xᵀAx + Kx + f = 0.

6. Name the following conics.

(a)

(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
(j)

7. In each part, a translation will put the conic in standard position. Name the conic and give its equation in the translated coordinate system.

(a)
(b)
(c)
(d)
(e)
(f)

8. The following nondegenerate conics are rotated out of standard position. In each part, rotate the coordinate axes to remove the xy-term. Name the conic and give its equation in the rotated coordinate system.

(a)

(b)

(c)
In Exercises 9–14 translate and rotate the coordinate axes, if necessary, to put the conic in standard position. Name the conic
and give its equation in the final coordinate system.
9.

10.

11.

12.

13.

14.
15. The graph of a quadratic equation in x and y can, in certain cases, be a point, a line, or a pair of lines. These are called degenerate conics. It is also possible that the equation is not satisfied by any real values of x and y. In such cases the equation has no graph; it is said to represent an imaginary conic. Each of the following represents a degenerate or imaginary conic. Where possible, sketch the graph.

(a)

(b)

(c)

(d)

(e)
(f)

16. Prove: If b ≠ 0, then the cross-product term can be eliminated from the quadratic form  by rotating the coordinate axes through an angle θ that satisfies the equation

9.7 QUADRIC SURFACES

In this section we shall apply the diagonalization techniques developed in the preceding section to quadratic equations in three variables, and we shall use our results to study quadric surfaces.

In Section 9.6 we looked at quadratic equations in two variables.

Quadric Surfaces

An equation of the form

    (1)

where a, b, …, f are not all zero, is called a quadratic equation in x, y, and z; the expression
is called the associated quadratic form, which now involves three variables: x, y, and z.
Equation 1 can be written in the matrix form

or
where

EXAMPLE 1 Associated Quadratic Form

The quadratic form associated with the quadratic equation

is

Graphs of quadratic equations in x, y, and z are called quadrics or quadric surfaces. The simplest equations for quadric
surfaces occur when those surfaces are placed in certain standard positions relative to the coordinate axes. Figure 9.7.1
shows the six basic quadric surfaces and the equations for those surfaces when the surfaces are in the standard positions
shown in the figure. If a quadric surface is cut by a plane, then the curve of intersection is called the trace of the plane on the
surface. To help visualize the quadric surfaces in Figure 9.7.1, we have shown and described the traces made by planes
parallel to the coordinate planes. The presence of one or more of the cross-product terms xy, xz, and yz in the equation of a quadric indicates that the quadric is rotated out of standard position; the presence of both x² and x terms, y² and y terms, or z² and z terms in a quadric with no cross-product term indicates the quadric is translated out of standard position.

Figure 9.7.1
EXAMPLE 2 Identifying a Quadric Surface
Describe the quadric surface whose equation is

Solution

Rearranging terms gives
Completing the squares yields
or
or

Translating the axes by means of the translation equations
yields

which is the equation of a hyperboloid of one sheet.

Eliminating Cross-Product Terms

The procedure for identifying quadrics that are rotated out of standard position is similar to the procedure for conics. Let Q be a quadric surface whose equation in xyz-coordinates is

    (2)

We want to rotate the xyz-coordinate axes so that the equation of the quadric in the new x′y′z′-coordinate system has no cross-product terms. This can be done as follows:

Step 1. Find a matrix P that orthogonally diagonalizes A.

Step 2. Interchange two columns of P, if necessary, to make det(P) = 1. This ensures that the orthogonal coordinate transformation

    (3)

is a rotation.

Step 3. Substitute 3 into 2. This will produce an equation for the quadric in x′y′z′-coordinates with no cross-product terms. (The proof is similar to that for conics and is left as an exercise.)

The following theorem summarizes this discussion.

THEOREM 9.7.1

Principal Axes Theorem for R³

Let

be the equation of a quadric Q, and let

be the associated quadratic form. The coordinate axes can be rotated so that the equation of Q in the x′y′z′-coordinate system has the form

where λ₁, λ₂, and λ₃ are the eigenvalues of A. The rotation can be accomplished by the substitution

    x = Px′

where P orthogonally diagonalizes A and det(P) = 1.
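The same recipe works in three variables. The sketch below applies it to a hypothetical quadric (not the one treated in Example 3 below) and confirms that the rotated equation has no cross-product terms:

```python
import numpy as np

# Hypothetical quadric 2x^2 + 2y^2 + 2z^2 + 2xy + 2xz + 2yz = 1.
# Its symmetric matrix has 1's off the diagonal (half of each cross coefficient 2).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals, P = np.linalg.eigh(A)
if np.linalg.det(P) < 0:             # flip one column so det(P) = 1 (a rotation)
    P[:, 0] *= -1.0

# In the rotated coordinates the equation is sum(lambda_i * (x_i')^2) = 1;
# all eigenvalues are positive, so the surface is an ellipsoid.
print(eigvals)                                        # approximately [1, 1, 4]
print(np.allclose(P.T @ A @ P, np.diag(eigvals)))     # True: no cross-product terms remain
```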

EXAMPLE 3 Eliminating Cross-Product Terms
Describe the quadric surface whose equation is

Solution

The matrix form of the above quadratic equation is

    (4)

where

As shown in Example 1 of Section 7.3, the eigenvalues of A are and , and A is orthogonally diagonalized by the
matrix

where the first two column vectors in P are eigenvectors corresponding to , and the third column vector is an
eigenvector corresponding to .

Since det(P) = 1 (verify), the orthogonal coordinate transformation x = Px′ is a rotation. Substituting this expression in 4 yields

or, equivalently,

    (5)

But

so 5 becomes

or
This can be rewritten as
which is the equation of an ellipsoid.

Exercise Set 9.7

1. Find the quadratic forms associated with the following quadratic equations.

(a)
(b)
(c)
(d)
(e)
(f)

2. Find the matrices of the quadratic forms in Exercise 1.

3. Express each of the quadratic equations given in Exercise 1 in the matrix form

4. Name the following quadrics.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

5. In Exercise 4, identify the trace in the plane  in each case.

6. Find the matrices of the quadratic forms in Exercise 4. Express each of the quadratic equations in the matrix form .

7. In each part, determine the translation equations that will put the quadric in standard position, and find the equation of the quadric in the translated coordinate system. Name the quadric.

(a)

(b)

(c)

(d)

(e)
(f)
(g)

8. In each part, find a rotation x = Px′ that removes the cross-product terms, and give its equation in the x′y′z′-system. Name the quadric.

(a)

(b)

(c)

(d)

In Exercises 9–12 translate and rotate the coordinate axes to put the quadric in standard position. Name the quadric and give its equation in the final coordinate system.

9.

10.

11.

12.

13. Prove Theorem 9.7.1.


9.8 COMPARISON OF PROCEDURES FOR SOLVING LINEAR SYSTEMS

In this section we shall discuss some practical aspects of solving systems of linear equations, inverting matrices, and finding eigenvalues. Although we have previously discussed methods for performing these computations, we now consider their suitability for the computer solution of the large-scale problems that arise in real-world applications.

Counting Operations

Since computers are limited in the number of decimal places they can carry, they round off or truncate most numerical quantities. For example, a computer designed to store eight decimal places might record 2/3 as either .66666667 (rounded off) or .66666666 (truncated).* In either case, an error is introduced that we shall call roundoff error or rounding error.

The main practical considerations in solving linear algebra problems on digital computers are minimizing the computer time (and
thus cost) needed to obtain the solution, and minimizing inaccuracies due to roundoff errors. Thus, a good computer algorithm
uses as few operations and memory accesses as possible, and performs the operations in a way that minimizes the effect of
roundoff errors.

In this text we have studied four methods for solving a linear system, Ax = b, of n equations in n unknowns:

1. Gaussian elimination with back-substitution

2. Gauss–Jordan elimination

3. Computing A⁻¹, then forming x = A⁻¹b

4. Cramer's rule

To determine how these methods compare as computational tools, we need to know how many arithmetic operations each requires.
It is usual to group divisions and multiplications together and to group additions and subtractions together. Divisions and
multiplications are considerably slower than additions and subtractions, in general. We shall refer to either multiplications or
divisions as “multiplications” and to additions or subtractions as “additions.”

In Table 1 we list the number of operations required to solve a linear system of n equations in n unknowns by each of the

four methods discussed in the text, as well as the number of operations required to invert A or to compute its determinant by row

reduction.

Table 1  Operation Counts for an Invertible n × n Matrix A

Method | Number of Additions | Number of Multiplications

Solve Ax = b by Gauss–Jordan elimination

Solve Ax = b by Gaussian elimination

Find A⁻¹ by reducing [A | I] to [I | A⁻¹]

Solve Ax = b as x = A⁻¹b

Find det(A) by row reduction

Solve Ax = b by Cramer's rule

Note that the text methods of Gauss–Jordan elimination and Gaussian elimination have the same operation counts. It is not hard to
see why this is so. Both methods begin by reducing the augmented matrix to row-echelon form. This is called the forward phase or
forward pass. Then the solution is completed by back-substitution in Gaussian elimination and by continued reduction to reduced
row-echelon form in Gauss–Jordan elimination. This is called the backward phase or backward pass. It turns out that the number
of operations required for the backward phase is the same whether one uses back-substitution or continued reduction to reduced
row-echelon form. Thus the text method of Gaussian elimination and the text method of Gauss–Jordan elimination have the same
operation counts.

Remark There is a common variation of Gauss–Jordan elimination that is less efficient than the one presented in this text. In our
method the augmented matrix is first reduced to reduced row-echelon form by introducing zeros below the leading 1's; then the
reduction is completed by introducing zeros above the leading 1's. An alternative procedure is to introduce zeros above and below
a leading 1 as soon as it is obtained. This method requires

both of which are larger than our values for all .

To illustrate how the results in Table 1 are computed, we shall derive the operation counts for Gauss–Jordan elimination. For this discussion we need the following formulas for the sum of the first n positive integers and the sum of the squares of the first n positive integers:

    1 + 2 + 3 + ⋯ + n = n(n + 1)/2        (1)

    1² + 2² + 3² + ⋯ + n² = n(n + 1)(2n + 1)/6        (2)

Derivations of these formulas are discussed in the exercises. We also need formulas for the sum of the first n − 1 positive integers and the sum of the squares of the first n − 1 positive integers. These can be obtained by substituting n − 1 for n in 1 and 2:

    1 + 2 + 3 + ⋯ + (n − 1) = n(n − 1)/2        (3)

    1² + 2² + 3² + ⋯ + (n − 1)² = n(n − 1)(2n − 1)/6        (4)

Operation Count for Gauss–Jordan Elimination

Let Ax = b be a system of n linear equations in n unknowns, and assume that A is invertible, so that the system has a unique solution. Also assume, for simplicity, that no row interchanges are required to put the augmented matrix [A | b] in reduced row-echelon form. This assumption is justified by the fact that row interchanges are performed as bookkeeping operations on a computer (that is, they are simulated, not actually performed) and so require much less time than arithmetic operations.

Since no row interchanges are required, the first step in the Gauss–Jordan elimination process is to introduce a leading 1 in the first
row by multiplying the elements in that row by the reciprocal of the leftmost entry in the row. We shall represent this step
schematically as follows:

Note that the leading 1 is simply recorded and requires no computation; only the remaining n entries in the first row must be
computed.

The following is a schematic description of the steps and the number of operations required to reduce [A | b] to row-echelon form.

Step 1.

Step 1a.

Step 2.

Step 2a.

Step 3.
Step 3a.
Step n − 1.
Step (n − 1)a.
Step n.

Thus, the number of operations required to complete successive steps is as follows:
Steps 1 and 1a.

Steps 2 and 2a.

Steps 3 and 3a.

Steps n − 1 and (n − 1)a.

Step n.

Therefore, the total number of operations required to reduce [A | b] to row-echelon form is

or, on applying Formulas 1 and 2,

(5)

(6)

This completes the operation count for the forward phase. For the backward phase we must put the row-echelon form of [A | b] into reduced row-echelon form by introducing zeros above the leading 1's. The operations are as follows:

Step 1.

Step 2.

Step n − 2.

Step n − 1.

Thus, the number of operations required for the backward phase is

or, on applying Formula 3,

    (7)

    (8)

Thus, from 5, 6, 7, and 8, the total operation count for Gauss–Jordan elimination is

(9)

(10)
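One way to check counts like 9 and 10 is to instrument a small implementation and tally the operations directly. The following sketch (not from the text) performs Gaussian elimination with back-substitution, assuming no row interchanges are needed, and counts multiplications/divisions and additions/subtractions; both tallies grow on the order of n³/3:

```python
import numpy as np

def gaussian_elimination_count(A, b):
    """Solve Ax = b by Gaussian elimination with back-substitution,
    counting multiplications/divisions and additions/subtractions.
    Assumes no row interchanges are needed (nonzero pivots throughout)."""
    M = np.hstack([np.asarray(A, float), np.asarray(b, float).reshape(-1, 1)])
    n = M.shape[0]
    mults = adds = 0

    # Forward phase: reduce the augmented matrix to row-echelon form.
    for k in range(n):
        M[k, k + 1:] /= M[k, k];  mults += n - k       # introduce the leading 1 in row k
        M[k, k] = 1.0
        for i in range(k + 1, n):
            factor = M[i, k]
            M[i, k + 1:] -= factor * M[k, k + 1:]
            mults += n - k;  adds += n - k
            M[i, k] = 0.0

    # Backward phase: back-substitution.
    x = M[:, n].copy()
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= M[i, j] * x[j];  mults += 1;  adds += 1
    return x, mults, adds

n = 10
A = np.random.rand(n, n) + n * np.eye(n)   # diagonally dominant, so no pivoting is needed
b = np.random.rand(n)
x, mults, adds = gaussian_elimination_count(A, b)
print(np.allclose(A @ x, b), mults, adds, n**3 / 3)   # counts are on the order of n^3/3
```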

Comparison of Methods for Solving Linear Systems

In practical applications it is common to encounter linear systems with thousands of equations in thousands of unknowns. Thus we

shall be interested in Table 1 for large values of n. It is a fact about polynomials that for large values of the variable, a polynomial can be approximated well by its term of highest degree; that is, if p(x) = c₀ + c₁x + ⋯ + cₖxᵏ with cₖ ≠ 0, then p(x) ≈ cₖxᵏ (Exercise 12). Thus, for large values of n, the operation counts in Table 1 can be approximated as shown in Table 2.

Table 2  Approximate Operation Counts for an Invertible n × n Matrix A for Large n

Method | Number of Additions | Number of Multiplications

Solve Ax = b by Gauss–Jordan elimination | ≈ n³/3 | ≈ n³/3

Solve Ax = b by Gaussian elimination | ≈ n³/3 | ≈ n³/3

Find A⁻¹ by reducing [A | I] to [I | A⁻¹] | ≈ n³ | ≈ n³

Solve Ax = b as x = A⁻¹b | ≈ n³ | ≈ n³

Find det(A) by row reduction | ≈ n³/3 | ≈ n³/3

Solve Ax = b by Cramer's rule | ≈ n⁴/3 | ≈ n⁴/3

It follows from Table 2 that for large n, the best of these methods for solving Ax = b are Gaussian elimination and Gauss–Jordan elimination. The method of multiplying by A⁻¹ is much worse than these (it requires three times as many operations), and the poorest of the four methods is Cramer's rule.

Remark

We observed in the remark following Table 1 that if Gauss–Jordan elimination is performed by introducing zeros above and below leading 1's as soon as they are obtained, then the operation count is

Thus, for large n, this procedure requires approximately n³/2 multiplications, which is 50% greater than the approximately n³/3 multiplications required by the text method. Similarly for additions.

It is reasonable to ask if it is possible to devise other methods for solving linear systems that might require significantly fewer than

the approximately n³/3 additions and multiplications needed in Gaussian elimination and Gauss–Jordan elimination. The answer is a qualified

“yes.” In recent years, methods have been devised that require Cn^q multiplications, where q is slightly larger than 2.3. However,

these methods have little practical value because the programming is complicated, the constant C is very large, and the number of

additions required is excessive. In short, there is currently no practical method for the direct solution of general linear systems that
significantly improves on the operation counts for Gaussian elimination and the text method of Gauss–Jordan elimination.

Operation counts are not the only criterion by which to judge a method for the computer solution of a linear system. As the speed
of computers has increased, the time it takes to move entries of the matrix from memory to the processing unit has become

increasingly important. For very large matrices, the time for memory accesses greatly exceeds the time required to do the actual

computations! Despite this, the conclusion above still stands: Except for extremely large matrices, Gaussian elimination or a

variant thereof is nearly always the method of choice for solving Ax = b. It is almost never necessary to compute A⁻¹, and we

should avoid doing so whenever possible. Solving Ax = b by Cramer's rule would be senseless for numerical purposes, despite its

theoretical value.

EXAMPLE 1 Avoiding the Inverse

Suppose we needed to compute the product

The result is a vector y. Rather than computing as given, it would be more efficient to write this as

that is, as

and to compute the result as follows: First, compute the vector ; second, solve  for z using Gaussian elimination; third, compute the vector .
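The exact product used in Example 1 is not reproduced above, but the underlying principle, solving a system instead of forming an inverse, can be illustrated with a hypothetical y = A⁻¹(Bx):

```python
import numpy as np

n = 300
rng = np.random.default_rng(0)
A = rng.random((n, n)) + n * np.eye(n)   # well-conditioned hypothetical coefficient matrix
B = rng.random((n, n))
x = rng.random(n)

# Wasteful: form the inverse explicitly, then multiply.
y_slow = np.linalg.inv(A) @ (B @ x)

# Better: compute w = Bx, then solve the single linear system A y = w.
w = B @ x
y_fast = np.linalg.solve(A, w)

print(np.allclose(y_slow, y_fast))   # True: same result, far fewer operations
```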

For extremely large matrices, such as the ones that occur in numerical weather prediction, approximate methods for solving Ax = b are often employed. In such cases, the matrix is typically sparse; that is, it has very few nonzero entries. These techniques are
beyond the scope of this text.

Exercise Set 9.8


1. Find the number of additions and multiplications required to compute AB if A is an m × n matrix and B is an n × p matrix.

2. Use the result in Exercise 1 to find the number of additions and multiplications required to compute A² by direct multiplication if A is an n × n matrix.

3. Assuming A to be an  matrix, use the formulas in Table 1 to determine the number of operations required for the procedures in Table 3.

Table 3

+×+ × + × + ×

Solve Ax = b by Gauss–Jordan elimination

Solve Ax = b by Gaussian elimination

Find A⁻¹ by reducing [A | I] to [I | A⁻¹]

Solve Ax = b as x = A⁻¹b

Find det(A) by row reduction

Solve Ax = b by Cramer's rule

4. Assuming for simplicity a computer execution time of 2.0 microseconds for multiplications and 0.5 microsecond for additions, use the results in Exercise 3 to fill in the execution times in seconds for the procedures in Table 4.

Table 4

Execution Time Execution Time Execution Time Execution Time
(sec) (sec) (sec) (sec)

Solve Ax = b by Gauss–Jordan elimination

Solve Ax = b by Gaussian elimination

Find A⁻¹ by reducing [A | I] to [I | A⁻¹]

Solve Ax = b as x = A⁻¹b

Find det(A) by row reduction

Solve Ax = b by Cramer's rule

5. Derive the formula

    1 + 2 + 3 + ⋯ + n = n(n + 1)/2

Hint Let S = 1 + 2 + 3 + ⋯ + n. Write the terms of S in reverse order and add the two expressions for S.

6. Use the result in Exercise 5 to show that

7. Derive the formula

    1² + 2² + 3² + ⋯ + n² = n(n + 1)(2n + 1)/6

using the following steps.

(a) Show that

(b) Show that

(c) Apply (a) to each term on the left side of (b) to show that

(d) Solve the equation in (c) for , use the result of Exercise 5, and then simplify.

8. Use the result in Exercise 7 to show that

9. Let R be a row-echelon form of an invertible n × n matrix. Show that solving the linear system Rx = b by back-substitution requires

10. Show that to reduce an invertible n × n matrix to  by the text method requires

Note Assume that no row interchanges are required.

11. Consider the variation of Gauss–Jordan elimination in which zeros are introduced above and below a leading 1 as soon as it is obtained, and let A be an invertible n × n matrix. Show that to solve a linear system Ax = b using this version of Gauss–Jordan elimination requires

Note Assume that no row interchanges are required.

12. (For Readers Who Have Studied Calculus) Show that if , where , then  for large values of x.

This result justifies the approximation

13.
(a) Why is  an even less efficient way to find y in Example 1?

(b) Use the result of Exercise 1 to find the operation count for this approach and for .

Copyright © 2005 John Wiley & Sons, Inc. All rights reserved.

