282 CHAPTER 8 Curve Fitting, Regression, and Correlation
8.12. Work Problem 8.11 by using the method of Problem 8.9.
Subtract an appropriate value, say, 68, from x and y (the numbers subtracted from x and from y could be
different). This leads to Table 8-5.
From the table we find

    b = \frac{n\sum x'y' - (\sum x')(\sum y')}{n\sum x'^2 - (\sum x')^2} = \frac{(12)(47) - (-16)(-5)}{(12)(106) - (-16)^2} = 0.476

Also, since x' = x - 68 and y' = y - 68, we have x̄' = x̄ - 68 and ȳ' = ȳ - 68. Thus

    x̄ = x̄' + 68 = -\tfrac{16}{12} + 68 = 66.67,    ȳ = ȳ' + 68 = -\tfrac{5}{12} + 68 = 67.58

The required regression equation of y on x is y - ȳ = b(x - x̄), i.e.,

    y - 67.58 = 0.476(x - 66.67)    or    y = 35.85 + 0.476x
in agreement with Problem 8.11, apart from rounding errors. In a similar manner we can obtain the regression
equation of x on y.
Table 8-5

    x'     y'     x'^2    x'y'    y'^2
    -3      0       9       0       0
    -5     -2      25      10       4
    -1      0       1       0       0
    -4     -3      16      12       9
     0      1       0       0       1
    -6     -2      36      12       4
     2      0       4       0       0
    -2     -3       4       6       9
     0      3       0       0       9
    -1     -1       1       1       1
     1      0       1       0       0
     3      2       9       6       4
    Σx' = -16   Σy' = -5   Σx'^2 = 106   Σx'y' = 47   Σy'^2 = 41
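To make the arithmetic concrete, the coded-data shortcut above can be sketched in Python (an illustrative verification, not from the text; the data are taken from Table 8-5):

```python
# Problem 8.12 sketch: subtracting 68 from both x and y leaves the slope b
# unchanged, since b depends only on deviations. Data from Table 8-5.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)

xp = [xi - 68 for xi in x]          # coded values x' = x - 68
yp = [yi - 68 for yi in y]          # coded values y' = y - 68

Sxp, Syp = sum(xp), sum(yp)
Sxpyp = sum(u * v for u, v in zip(xp, yp))
Sxp2 = sum(u * u for u in xp)

b = (n * Sxpyp - Sxp * Syp) / (n * Sxp2 - Sxp ** 2)   # slope from coded sums
xbar = Sxp / n + 68                                    # recover the true means
ybar = Syp / n + 68
a = ybar - b * xbar                                    # intercept of y on x
```

Carrying full precision gives a = 35.82 rather than the 35.85 obtained above from rounded means, in agreement with Problem 8.11.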
Nonlinear equations reducible to linear form
8.13. Table 8-6 gives experimental values of the pressure P of a given mass of gas corresponding to various values of the volume V. According to thermodynamic principles, a relationship having the form PV^γ = C, where γ and C are constants, should exist between the variables. (a) Find the values of γ and C. (b) Write the equation connecting P and V. (c) Estimate P when V = 100.0 in³.

Table 8-6

    Volume V (in³)      54.3   61.8   72.4   88.7   118.6   194.0
    Pressure P (lb/in²) 61.2   49.5   37.6   28.4    19.2    10.1

Since PV^γ = C, we have upon taking logarithms to base 10,

    log P + γ log V = log C    or    log P = log C - γ log V
Setting log V = x and log P = y, the last equation can be written

(1)    y = a + bx

where a = log C and b = -γ.
Table 8-7 gives the values of x and y corresponding to the values of V and P in Table 8-6 and also indicates
the calculations involved in computing the least-squares line (1).
Table 8-7
    x = log V    y = log P    x^2        xy
    1.7348       1.7868       3.0095     3.0997
    1.7910       1.6946       3.2077     3.0350
    1.8597       1.5752       3.4585     2.9294
    1.9479       1.4533       3.7943     2.8309
    2.0741       1.2833       4.3019     2.6617
    2.2878       1.0043       5.2340     2.2976
    Σx = 11.6953   Σy = 8.7975   Σx^2 = 23.0059   Σxy = 16.8543
The normal equations corresponding to the least-squares line (1) are

    \sum y = an + b\sum x,    \sum xy = a\sum x + b\sum x^2

from which

    a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2} = 4.20,    b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = -1.40

Then y = 4.20 - 1.40x.
(a) Since a = 4.20 = log C and b = -1.40 = -γ, C = 1.60 × 10⁴ and γ = 1.40.
(b) PV^1.40 = 16,000.
(c) When V = 100, x = log V = 2 and y = log P = 4.20 - 1.40(2) = 1.40. Then P = antilog 1.40 = 25.1 lb/in².
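The log-log fit can be reproduced in a short Python sketch (illustrative, not from the text; working with exact rather than four-decimal logarithms changes the last digit slightly):

```python
import math

# Problem 8.13 sketch: fit PV^gamma = C by least squares on (log V, log P).
V = [54.3, 61.8, 72.4, 88.7, 118.6, 194.0]
P = [61.2, 49.5, 37.6, 28.4, 19.2, 10.1]
n = len(V)

x = [math.log10(v) for v in V]
y = [math.log10(p) for p in P]

Sx, Sy = sum(x), sum(y)
Sxy = sum(u * v for u, v in zip(x, y))
Sx2 = sum(u * u for u in x)

b = (n * Sxy - Sx * Sy) / (n * Sx2 - Sx ** 2)   # slope = -gamma
a = (Sy - b * Sx) / n                            # intercept = log10 C

gamma = -b                      # about 1.40
C = 10 ** a                     # about 1.6 x 10^4
P_at_100 = C / 100.0 ** gamma   # estimate of P at V = 100, about 25 lb/in^2
```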
8.14. Solve Problem 8.13 by plotting the data on log-log graph paper.
For each pair of values of the pressure P and volume V in Table 8-6, we obtain a point that is plotted on the
specially constructed log-log graph paper shown in Fig. 8-10.
A line (drawn freehand) approximating these points is also indicated. The resulting graph shows that there is
a linear relationship between log P and log V, which can be represented by the equation
    log P = a + b log V    or    y = a + bx

The slope b, which is negative in this case, is given numerically by the ratio of the length of AB to the length
of AC. Measurement in this case yields b = -1.4.
To obtain a, one point on the line is needed. For example, when V = 100, P = 25 from the graph. Then

    a = log P - b log V = log 25 + 1.4 log 100 = 1.4 + (1.4)(2) = 4.2

so that

    log P + 1.4 log V = 4.2,    log PV^1.4 = 4.2,    and    PV^1.4 = 16,000
Fig. 8-10
The least-squares parabola
8.15. Derive the normal equations (19), page 269, for the least-squares parabola.
    y = a + bx + cx^2

Let the sample points be (x1, y1), (x2, y2), . . . , (xn, yn). Then the values of y on the least-squares parabola
corresponding to x1, x2, . . . , xn are

    a + bx_1 + cx_1^2,   a + bx_2 + cx_2^2,   ...,   a + bx_n + cx_n^2

Therefore, the deviations from y1, y2, . . . , yn are given by

    d_1 = a + bx_1 + cx_1^2 - y_1,   d_2 = a + bx_2 + cx_2^2 - y_2,   ...,   d_n = a + bx_n + cx_n^2 - y_n

and the sum of the squares of the deviations is given by

    \sum d^2 = \sum (a + bx + cx^2 - y)^2

This is a function of a, b, and c, i.e.,

    F(a, b, c) = \sum (a + bx + cx^2 - y)^2

To minimize this function, we must have

    \frac{\partial F}{\partial a} = 0,   \frac{\partial F}{\partial b} = 0,   \frac{\partial F}{\partial c} = 0

Now

    \frac{\partial F}{\partial a} = \sum \frac{\partial}{\partial a}(a + bx + cx^2 - y)^2 = \sum 2(a + bx + cx^2 - y)

    \frac{\partial F}{\partial b} = \sum \frac{\partial}{\partial b}(a + bx + cx^2 - y)^2 = \sum 2x(a + bx + cx^2 - y)

    \frac{\partial F}{\partial c} = \sum \frac{\partial}{\partial c}(a + bx + cx^2 - y)^2 = \sum 2x^2(a + bx + cx^2 - y)
Simplifying each of these summations and setting them equal to zero yields the equations (19), page 269.
8.16. Fit a least-squares parabola having the form y = a + bx + cx^2 to the data in Table 8-8.
Table 8-8
x 1.2 1.8 3.1 4.9 5.7 7.1 8.6 9.8
y 4.5 5.9 7.0 7.8 7.2 6.8 4.5 2.7
The normal equations are

    \sum y = an + b\sum x + c\sum x^2
(1) \sum xy = a\sum x + b\sum x^2 + c\sum x^3
    \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4
The work involved in computing the sums can be arranged as in Table 8-9.
Table 8-9
    x      y      x^2     x^3      x^4       xy      x^2 y
1.2 4.5 1.44 1.73 2.08 5.40 6.48
1.8 5.9 3.24 5.83 10.49 10.62 19.12
3.1 7.0 9.61 29.79 92.35 21.70 67.27
4.9 7.8 24.01 117.65 576.48 38.22 187.28
5.7 7.2 32.49 185.19 1055.58 41.04 233.93
7.1 6.8 50.41 357.91 2541.16 48.28 342.79
8.6 4.5 73.96 636.06 5470.12 38.70 332.82
9.8 2.7 96.04 941.19 9223.66 26.46 259.31
    Σx = 42.2   Σy = 46.4   Σx^2 = 291.20   Σx^3 = 2275.35   Σx^4 = 18,971.92   Σxy = 230.42   Σx^2 y = 1449.00
Then the normal equations (1) become, since n = 8,

    8a + 42.2b + 291.20c = 46.4
(2) 42.2a + 291.20b + 2275.35c = 230.42
    291.20a + 2275.35b + 18,971.92c = 1449.00

Solving, a = 2.588, b = 2.065, c = -0.2110; hence the required least-squares parabola has the equation

    y = 2.588 + 2.065x - 0.2110x^2
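The system (2) can also be solved numerically; the sketch below (illustrative, not from the text) rebuilds the normal equations from the raw data of Table 8-8 and solves them by Gauss-Jordan elimination:

```python
# Problem 8.16 sketch: least-squares parabola y = a + bx + cx^2,
# solved from the normal equations by Gauss-Jordan elimination.
x = [1.2, 1.8, 3.1, 4.9, 5.7, 7.1, 8.6, 9.8]
y = [4.5, 5.9, 7.0, 7.8, 7.2, 6.8, 4.5, 2.7]
n = len(x)

def S(p, q=0):
    # moment sum: sum of x^p * y^q over the data
    return sum(xi ** p * yi ** q for xi, yi in zip(x, y))

# Augmented matrix [coefficients | right-hand side] of the normal equations.
M = [[n,    S(1), S(2), S(0, 1)],
     [S(1), S(2), S(3), S(1, 1)],
     [S(2), S(3), S(4), S(2, 1)]]

for i in range(3):                      # Gauss-Jordan elimination
    M[i] = [v / M[i][i] for v in M[i]]  # normalize the pivot row
    for j in range(3):
        if j != i:                      # clear column i from the other rows
            f = M[j][i]
            M[j] = [vj - f * vi for vj, vi in zip(M[j], M[i])]

a, b, c = M[0][3], M[1][3], M[2][3]     # close to 2.588, 2.065, -0.2110
```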
8.17. Use the least-squares parabola of Problem 8.16 to estimate the values of y from the given values of x.
For x = 1.2, yest = 2.588 + 2.065(1.2) - 0.2110(1.2)^2 = 4.762. Similarly, other estimated values are
obtained. The results are shown in Table 8-10 together with the actual values of y.
Table 8-10
yest 4.762 5.621 6.962 7.640 7.503 6.613 4.741 2.561
y 4.5 5.9 7.0 7.8 7.2 6.8 4.5 2.7
Multiple regression
8.18. A variable z is to be estimated from variables x and y by means of a regression equation having the form z = a + bx + cy. Show that the least-squares regression equation is obtained by determining a, b, and c so that they satisfy (21), page 269.
Let the sample points be (x1, y1, z1), . . . , (xn, yn, zn). Then the values of z on the least-squares regression plane
corresponding to (x1, y1), . . . , (xn, yn) are, respectively,

    a + bx_1 + cy_1,   ...,   a + bx_n + cy_n

Therefore, the deviations from z1, . . . , zn are given by

    d_1 = a + bx_1 + cy_1 - z_1,   ...,   d_n = a + bx_n + cy_n - z_n

and the sum of the squares of the deviations is given by

    \sum d^2 = \sum (a + bx + cy - z)^2

Considering this as a function of a, b, c and setting the partial derivatives with respect to a, b, and c equal to
zero, the required normal equations (21) on page 269 are obtained.
8.19. Table 8-11 shows the weights z to the nearest pound, heights x to the nearest inch, and ages y to the nearest year, of 12 boys. (a) Find the least-squares regression equation of z on x and y. (b) Determine the estimated values of z from the given values of x and y. (c) Estimate the weight of a boy who is 9 years old and 54 inches tall.
Table 8-11
Weight (z) 64 71 53 67 55 58 77 57 56 51 76 68
Height (x) 57 59 49 62 51 50 55 48 52 42 61 57
Age (y) 8 10 6 11 8 7 10 9 10 6 12 9
(a) The linear regression equation of z on x and y can be written

    z = a + bx + cy

The normal equations (21), page 269, are given by

    \sum z = na + b\sum x + c\sum y
(1) \sum xz = a\sum x + b\sum x^2 + c\sum xy
    \sum yz = a\sum y + b\sum xy + c\sum y^2
The work involved in computing the sums can be arranged as in Table 8-12.
Table 8-12

    z     x     y     z^2     x^2    y^2    xz      yz     xy
    64    57     8    4096    3249    64    3648    512    456
    71    59    10    5041    3481   100    4189    710    590
    53    49     6    2809    2401    36    2597    318    294
    67    62    11    4489    3844   121    4154    737    682
    55    51     8    3025    2601    64    2805    440    408
    58    50     7    3364    2500    49    2900    406    350
    77    55    10    5929    3025   100    4235    770    550
    57    48     9    3249    2304    81    2736    513    432
    56    52    10    3136    2704   100    2912    560    520
    51    42     6    2601    1764    36    2142    306    252
    76    61    12    5776    3721   144    4636    912    732
    68    57     9    4624    3249    81    3876    612    513
    Σz = 753   Σx = 643   Σy = 106   Σz^2 = 48,139   Σx^2 = 34,843   Σy^2 = 976   Σxz = 40,830   Σyz = 6796   Σxy = 5779
Using this table, the normal equations (1) become

    12a + 643b + 106c = 753
(2) 643a + 34,843b + 5779c = 40,830
    106a + 5779b + 976c = 6796

Solving, a = 3.6512, b = 0.8546, c = 1.5063, and the required regression equation is

(3) z = 3.65 + 0.855x + 1.506y
(b) Using the regression equation (3), we obtain the estimated values of z, denoted by zest, by substituting the
corresponding values of x and y. The results are given in Table 8-13 together with the sample values of z.
Table 8-13
zest 64.414 69.136 54.564 73.206 59.286 56.925 65.717 58.229 63.153 48.582 73.857 65.920
z 64 71 53 67 55 58 77 57 56 51 76 68
(c) Putting x = 54 and y = 9 in (3), the estimated weight is zest = 63.356, or about 63 lb.
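The plane fit in (a)-(c) can be checked with a short sketch (illustrative, not from the text; it rebuilds the normal equations (1) from Table 8-11 and solves them by Gauss-Jordan elimination):

```python
# Problem 8.19 sketch: least-squares plane z = a + bx + cy from the
# normal equations, solved by Gauss-Jordan elimination.
z = [64, 71, 53, 67, 55, 58, 77, 57, 56, 51, 76, 68]
x = [57, 59, 49, 62, 51, 50, 55, 48, 52, 42, 61, 57]
y = [8, 10, 6, 11, 8, 7, 10, 9, 10, 6, 12, 9]
n = len(z)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Augmented matrix [coefficients | right-hand side] of the normal equations.
M = [[n,       sum(x),    sum(y),    sum(z)],
     [sum(x),  dot(x, x), dot(x, y), dot(x, z)],
     [sum(y),  dot(x, y), dot(y, y), dot(y, z)]]

for i in range(3):                      # Gauss-Jordan elimination
    M[i] = [v / M[i][i] for v in M[i]]
    for j in range(3):
        if j != i:
            f = M[j][i]
            M[j] = [vj - f * vi for vj, vi in zip(M[j], M[i])]

a, b, c = M[0][3], M[1][3], M[2][3]     # close to 3.6512, 0.8546, 1.5063
z_est = a + b * 54 + c * 9              # 9-year-old, 54 inches: about 63 lb
```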
Standard error of estimate
8.20. If the least-squares regression line of y on x is given by y = a + bx, prove that the standard error of estimate s_{y.x} is given by

    s_{y.x}^2 = \frac{\sum y^2 - a\sum y - b\sum xy}{n}

The values of y as estimated from the regression line are given by yest = a + bx. Then

    s_{y.x}^2 = \frac{\sum (y - y_{est})^2}{n} = \frac{\sum (y - a - bx)^2}{n}

            = \frac{\sum y(y - a - bx) - a\sum (y - a - bx) - b\sum x(y - a - bx)}{n}

But

    \sum (y - a - bx) = \sum y - an - b\sum x = 0
    \sum x(y - a - bx) = \sum xy - a\sum x - b\sum x^2 = 0

since from the normal equations

    \sum y = an + b\sum x,    \sum xy = a\sum x + b\sum x^2

Then

    s_{y.x}^2 = \frac{\sum y(y - a - bx)}{n} = \frac{\sum y^2 - a\sum y - b\sum xy}{n}
This result can be extended to nonlinear regression equations.
8.21. Prove that the result in Problem 8.20 can be written

    s_{y.x}^2 = \frac{\sum (y - ȳ)^2 - b\sum (x - x̄)(y - ȳ)}{n}

Method 1
Let x = x' + x̄, y = y' + ȳ. Then from Problem 8.20

    n s_{y.x}^2 = \sum y^2 - a\sum y - b\sum xy
              = \sum (y' + ȳ)^2 - a\sum (y' + ȳ) - b\sum (x' + x̄)(y' + ȳ)
              = \sum (y'^2 + 2y'ȳ + ȳ^2) - a(\sum y' + nȳ) - b\sum (x'y' + x̄y' + x'ȳ + x̄ȳ)
              = \sum y'^2 + 2ȳ\sum y' + nȳ^2 - anȳ - b\sum x'y' - bx̄\sum y' - bȳ\sum x' - bnx̄ȳ
              = \sum y'^2 + nȳ^2 - anȳ - b\sum x'y' - bnx̄ȳ
              = \sum y'^2 - b\sum x'y' + nȳ(ȳ - a - bx̄)
              = \sum y'^2 - b\sum x'y'
              = \sum (y - ȳ)^2 - b\sum (x - x̄)(y - ȳ)

where we have used the results Σx' = 0, Σy' = 0 and ȳ = a + bx̄ (which follows on dividing both sides of
the normal equation Σy = an + bΣx by n). This proves the required result.

Method 2
We know that the regression line can be written as y - ȳ = b(x - x̄), which corresponds to starting with
y = a + bx and then replacing a by zero, x by x - x̄, and y by y - ȳ. When these replacements are made in
Problem 8.20, the required result is obtained.
8.22. Compute the standard error of estimate, s_{y.x}, for the data of Problem 8.11.
From Problem 8.11(b) the regression line of y on x is y = 35.82 + 0.476x. In Table 8-14 are listed the actual
values of y (from Table 8-3) and the estimated values of y, denoted by yest, as obtained from the regression line.
For example, corresponding to x = 65, we have yest = 35.82 + 0.476(65) = 66.76.
Table 8-14
x 65 63 67 64 68 62 70 66 68 67 69 71
y 68 66 68 65 69 66 68 65 71 67 68 70
yest 66.76 65.81 67.71 66.28 68.19 65.33 69.14 67.24 68.19 67.71 68.66 69.62
    y - yest   1.24   0.19   0.29   -1.28   0.81   0.67   -1.14   -2.24   2.81   -0.71   -0.66   0.38
Also listed are the values y - yest, which are needed in computing s_{y.x}:

    s_{y.x}^2 = \frac{\sum (y - y_{est})^2}{n} = \frac{(1.24)^2 + (0.19)^2 + \cdots + (0.38)^2}{12} = 1.642

and s_{y.x} = \sqrt{1.642} = 1.28 inches.
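The shortcut of Problem 8.20 can be checked against the direct definition in a sketch (illustrative; the identity holds exactly only when a and b satisfy the normal equations, so the sketch recomputes them rather than using the rounded 35.82 and 0.476):

```python
# Problems 8.20/8.22 sketch: (Sum y^2 - a*Sum y - b*Sum xy)/n equals
# Sum (y - yest)^2 / n when a, b come from the normal equations.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)

Sx, Sy = sum(x), sum(y)
Sxy = sum(u * v for u, v in zip(x, y))
Sx2 = sum(u * u for u in x)
Sy2 = sum(v * v for v in y)

b = (n * Sxy - Sx * Sy) / (n * Sx2 - Sx ** 2)   # exact least-squares slope
a = (Sy - b * Sx) / n                            # exact intercept

direct = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / n
shortcut = (Sy2 - a * Sy - b * Sxy) / n          # Problem 8.20's formula
s_yx = direct ** 0.5                             # about 1.28 inches
```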
8.23. (a) Construct two lines parallel to the regression line of Problem 8.11 and having vertical distance s_{y.x} from
it. (b) Determine the percentage of data points falling between these two lines.
(a) The regression line y = 35.82 + 0.476x as obtained in Problem 8.11 is shown solid in Fig. 8-11. The two
parallel lines, each having vertical distance s_{y.x} = 1.28 (see Problem 8.22) from it, are shown dashed in
Fig. 8-11.
Fig. 8-11
(b) From the figure it is seen that of the 12 data points, 7 fall between the lines while 3 appear to lie on the
lines. Further examination using the last line in Table 8-14 reveals that 2 of these 3 points lie between the
lines. Then the required percentage is 9/12 = 75%.
Another method
From the last line in Table 8-14, y - yest lies between -1.28 and 1.28 (i.e., ±s_{y.x}) for 9 points (x, y). Then the
required percentage is 9/12 = 75%.
If the points are normally distributed about the regression line, theory predicts that about 68% of the points
lie between the lines. This would have been more nearly the case if the sample size were larger.
NOTE: A better estimate of the standard error of estimate of the population from which the sample heights
were taken is given by \hat{s}_{y.x} = \sqrt{n/(n-2)}\, s_{y.x} = \sqrt{12/10}\,(1.28) = 1.40 inches.
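The residual count in the "another method" can be automated (a sketch, not from the text; because it refits the line exactly rather than using rounded coefficients, the borderline point at x = 64 falls just outside the ±s_{y.x} band, giving the same count of 9):

```python
# Problem 8.23 sketch: count residuals within one standard error of estimate.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)

Sx, Sy = sum(x), sum(y)
Sxy = sum(u * v for u, v in zip(x, y))
Sx2 = sum(u * u for u in x)
b = (n * Sxy - Sx * Sy) / (n * Sx2 - Sx ** 2)
a = (Sy - b * Sx) / n

resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s = (sum(r * r for r in resid) / n) ** 0.5       # s_yx, about 1.28

within = sum(1 for r in resid if abs(r) <= s)    # points inside the band
frac = within / n                                # 9/12 = 0.75
```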
The linear correlation coefficient
8.24. Prove that \sum (y - ȳ)^2 = \sum (y - y_{est})^2 + \sum (y_{est} - ȳ)^2.
Squaring both sides of y - ȳ = (y - yest) + (yest - ȳ) and then summing, we have

    \sum (y - ȳ)^2 = \sum (y - y_{est})^2 + \sum (y_{est} - ȳ)^2 + 2\sum (y - y_{est})(y_{est} - ȳ)

The required result follows at once if we can show that the last sum is zero. In the case of linear regression this
is so, since

    \sum (y - y_{est})(y_{est} - ȳ) = \sum (y - a - bx)(a + bx - ȳ)
                                  = a\sum (y - a - bx) + b\sum x(y - a - bx) - ȳ\sum (y - a - bx)
                                  = 0

because of the normal equations Σ(y - a - bx) = 0, Σx(y - a - bx) = 0.
The result can similarly be shown valid for nonlinear regression using a least-squares curve given by
yest = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.
8.25. Compute (a) the explained variation, (b) the unexplained variation, (c) the total variation for the data of
Problem 8.11.
We have ȳ = 67.58 from Problem 8.12 (or from Table 8-4, since ȳ = 811/12 = 67.58). Using the values yest
from Table 8-14 we can construct Table 8-15.
Table 8-15
    yest - ȳ   -0.82   -1.77   0.13   -1.30   0.61   -2.25   1.56   -0.34   0.61   0.13   1.08   2.04
(a) Explained variation = \sum (y_{est} - ȳ)^2 = (-0.82)^2 + \cdots + (2.04)^2 = 19.22.
(b) Unexplained variation = \sum (y - y_{est})^2 = n s_{y.x}^2 = 19.70, from Problem 8.22.
(c) Total variation = \sum (y - ȳ)^2 = 19.22 + 19.70 = 38.92, from Problem 8.24.
The results in (b) and (c) can also be obtained by direct calculation of the sum of squares.
8.26. Find (a) the coefficient of determination, (b) the coefficient of correlation for the data of Problem 8.11. Use
the results of Problem 8.25.
(a) Coefficient of determination = r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{19.22}{38.92} = 0.4938.
(b) Coefficient of correlation = r = ±\sqrt{0.4938} = ±0.7027.
Since the variable yest increases as x increases, the correlation is positive, and we therefore write
r = 0.7027, or 0.70 to two significant figures.
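The variation split and r^2 can be verified numerically (an illustrative sketch; exact arithmetic gives 0.4937 rather than the rounded 0.4938):

```python
# Problems 8.25/8.26 sketch: total variation = explained + unexplained,
# and r^2 = explained / total, for the data of Problem 8.11.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)

Sx, Sy = sum(x), sum(y)
Sxy = sum(u * v for u, v in zip(x, y))
Sx2 = sum(u * u for u in x)
b = (n * Sxy - Sx * Sy) / (n * Sx2 - Sx ** 2)
a = (Sy - b * Sx) / n
ybar = Sy / n

yest = [a + b * xi for xi in x]
explained = sum((ye - ybar) ** 2 for ye in yest)              # about 19.21
unexplained = sum((yi - ye) ** 2 for yi, ye in zip(y, yest))  # about 19.70
total = sum((yi - ybar) ** 2 for yi in y)                     # about 38.92
r2 = explained / total                                        # about 0.494
```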
8.27. Starting from the general result (30), page 270, for the correlation coefficient, derive the result (34),
page 271 (the product-moment formula), in the case of linear regression.
The least-squares regression line of y on x can be written yest = a + bx or y'est = bx', where
b = Σx'y' / Σx'^2, x' = x - x̄, and y'est = yest - ȳ. Then, using y' = y - ȳ, we have

    r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\sum (y_{est} - ȳ)^2}{\sum (y - ȳ)^2} = \frac{\sum y_{est}'^2}{\sum y'^2}

        = \frac{\sum b^2 x'^2}{\sum y'^2} = \frac{b^2 \sum x'^2}{\sum y'^2} = \left(\frac{\sum x'y'}{\sum x'^2}\right)^2 \frac{\sum x'^2}{\sum y'^2} = \frac{\left(\sum x'y'\right)^2}{\sum x'^2 \sum y'^2}

and so

    r = \pm \frac{\sum x'y'}{\sqrt{\sum x'^2 \sum y'^2}}
However, since gxryr is positive when yest increases as x increases, but negative when yest decreases as x
increases, the expression for r automatically has the correct sign associated with it. Therefore, the required
result follows.
8.28. By using the product-moment formula, obtain the linear correlation coefficient for the data of Problem 8.11.
The work involved in the computation can be organized as in Table 8-16. Then

    r = \frac{\sum x'y'}{\sqrt{\left(\sum x'^2\right)\left(\sum y'^2\right)}} = \frac{40.34}{\sqrt{(84.68)(38.92)}} = 0.7027

agreeing with Problem 8.26(b).
Table 8-16

    x     y     x' = x - x̄   y' = y - ȳ   x'^2     x'y'     y'^2
    65    68       -1.7          0.4        2.89    -0.68     0.16
    63    66       -3.7         -1.6       13.69     5.92     2.56
    67    68        0.3          0.4        0.09     0.12     0.16
    64    65       -2.7         -2.6        7.29     7.02     6.76
    68    69        1.3          1.4        1.69     1.82     1.96
    62    66       -4.7         -1.6       22.09     7.52     2.56
    70    68        3.3          0.4       10.89     1.32     0.16
    66    65       -0.7         -2.6        0.49     1.82     6.76
    68    71        1.3          3.4        1.69     4.42    11.56
    67    67        0.3         -0.6        0.09    -0.18     0.36
    69    68        2.3          0.4        5.29     0.92     0.16
    71    70        4.3          2.4       18.49    10.32     5.76
    Σx = 800        Σy = 811                Σx'^2 =  Σx'y' =  Σy'^2 =
    x̄ = 800/12     ȳ = 811/12              84.68    40.34    38.92
      = 66.7         = 67.6
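A short sketch (illustrative, not from the text) reproduces the product-moment computation of Table 8-16; carrying full precision in the deviations changes the sums only in the second decimal:

```python
import math

# Problem 8.28 sketch: product-moment formula in deviation form,
# r = Sum x'y' / sqrt(Sum x'^2 * Sum y'^2), data from Table 8-16.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

xp = [xi - xbar for xi in x]                 # deviations x' = x - xbar
yp = [yi - ybar for yi in y]                 # deviations y' = y - ybar

Sxpyp = sum(u * v for u, v in zip(xp, yp))   # about 40.33
Sxp2 = sum(u * u for u in xp)                # about 84.67
Syp2 = sum(v * v for v in yp)                # about 38.92

r = Sxpyp / math.sqrt(Sxp2 * Syp2)           # about 0.7027
```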
8.29. Prove the result (17), page 268.
The regression line of y on x is

    y = a + bx    where    b = \frac{r s_y}{s_x}

Similarly, the regression line of x on y is

    x = c + dy    where    d = \frac{r s_x}{s_y}

Then

    bd = \left(\frac{r s_y}{s_x}\right)\left(\frac{r s_x}{s_y}\right) = r^2
8.30. Use the result of Problem 8.29 to find the linear correlation coefficient for the data of Problem 8.11.
From Problem 8.11(b) and 8.11(c), respectively,
    b = \frac{484}{1016} = 0.476,    d = \frac{484}{467} = 1.036

Then

    r^2 = bd = \left(\frac{484}{1016}\right)\left(\frac{484}{467}\right)    or    r = 0.7027
agreeing with Problems 8.26(b) and 8.28.
8.31. Show that the linear correlation coefficient is given by
    r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}

In Problem 8.27 it was shown that

(1) r = \frac{\sum x'y'}{\sqrt{\left(\sum x'^2\right)\left(\sum y'^2\right)}} = \frac{\sum (x - x̄)(y - ȳ)}{\sqrt{\left[\sum (x - x̄)^2\right]\left[\sum (y - ȳ)^2\right]}}

But

    \sum (x - x̄)(y - ȳ) = \sum (xy - x̄y - xȳ + x̄ȳ) = \sum xy - x̄\sum y - ȳ\sum x + nx̄ȳ
                        = \sum xy - nx̄ȳ - nȳx̄ + nx̄ȳ = \sum xy - nx̄ȳ
                        = \sum xy - \frac{(\sum x)(\sum y)}{n}

since x̄ = (Σx)/n and ȳ = (Σy)/n.
Similarly,

    \sum (x - x̄)^2 = \sum (x^2 - 2xx̄ + x̄^2) = \sum x^2 - 2x̄\sum x + nx̄^2
                   = \sum x^2 - \frac{2(\sum x)^2}{n} + \frac{(\sum x)^2}{n} = \sum x^2 - \frac{(\sum x)^2}{n}

and

    \sum (y - ȳ)^2 = \sum y^2 - \frac{(\sum y)^2}{n}

Then (1) becomes

    r = \frac{\sum xy - (\sum x)(\sum y)/n}{\sqrt{\left[\sum x^2 - (\sum x)^2/n\right]\left[\sum y^2 - (\sum y)^2/n\right]}} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
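The raw-sum form is convenient in code because no means are needed; a sketch (illustrative, using the data of Problem 8.11):

```python
import math

# Problem 8.31 sketch: r computed from raw sums in a single pass,
# with no deviations or means required.
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]
n = len(x)

Sx, Sy = sum(x), sum(y)
Sxy = sum(u * v for u, v in zip(x, y))
Sx2 = sum(u * u for u in x)
Sy2 = sum(v * v for v in y)

# Computational formula: all quantities are integer sums of the raw data.
r = (n * Sxy - Sx * Sy) / math.sqrt((n * Sx2 - Sx ** 2) * (n * Sy2 - Sy ** 2))
```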