LESSON 1.6 • Measuring Center 51
[Dotplot: Fat (g)]
Sort the data values from smallest to largest. Because there are n = 12 data values (an even number), the median is the average of the middle two values in the ordered list.
FOR PRACTICE TRY EXERCISE 3.
a set is the average of all n data
by n. That is,
es
ues
xn, we can rewrite the formula for
= $\frac{\sum x_i}{n}$
a is short for “add them all up.”
way of keeping the n data values
or any other special facts about
29/03/16 9:10 pm
52 CHAPTER 1 • Analyzing One-Variable Data

Actually, the notation $\bar{x}$ refers to the mean of a sample. Most of the time, the data we’ll encounter can be thought of as a sample from some larger population. When we need to refer to a population mean, we’ll use the symbol µ (Greek letter mu, pronounced “mew”).

EXAMPLE
How high is the score?
Finding the mean
PROBLEM: How many runs does the Lawrence High School baseball team usually score? Here are data on the number of runs scored by the team in all 21 games played during a recent season, along with a dotplot.
0 1 1 1 2 2 2 2 3 3 3 4 4 5 5 6 6 7 8 10 12
[Dotplot: Runs scored]
(a) Calculate the mean number of runs scored per game by the team during this season. Show your work.
(b) The dotplot suggests that the game in which the team scored 12 runs is a possible outlier. Calculate the mean number of runs scored per game by the team in the other 20 games this season. What do you notice?
SOLUTION:
(a) $\bar{x} = \frac{0 + 1 + 1 + \cdots + 10 + 12}{21} = \frac{87}{21} \approx 4.14$ runs per game
You can calculate the mean with the formula: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$
(b) $\bar{x} = \frac{0 + 1 + 1 + \cdots + 10}{20} = \frac{75}{20} \approx 3.75$ runs per game
The team’s 12-run game increased the mean number of runs scored per game by 0.39 runs.
FOR PRACTICE TRY EXERCISE 5.

The previous example illustrates an important weakness of the mean as a measure of center: The mean is not resistant.

DEFINITION Resistant
A measure of center (or variability) is resistant if it isn’t influenced by unusually large or unusually small values in a distribution.

The median is a resistant measure of center. In the preceding example, the median number of runs scored by the Lawrence High baseball team in its 21-game season is 3. If we remove the possible outlier game in which the team scored 12 runs, the median number of runs scored in the remaining 20 games is still 3.

Starnes_3e_CH01_002-093_v2.indd 52
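The arithmetic in the runs-scored example can be checked with a short script. What follows is only a sketch in Python, assuming the 21 run totals listed in the example; the names `mean`, `runs`, and `runs_without_outlier` are illustrative and do not come from the text.

```python
from statistics import median


def mean(values):
    """Arithmetic mean: add the values, then divide by how many there are."""
    return sum(values) / len(values)


# Runs scored by the team in each of its 21 games (data from the example).
runs = [0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 10, 12]

# The same season with the possible outlier (the 12-run game) removed.
runs_without_outlier = [r for r in runs if r != 12]

print(f"Mean, all 21 games:     {mean(runs):.2f}")
print(f"Mean, other 20 games:   {mean(runs_without_outlier):.2f}")
print(f"Median, all 21 games:   {median(runs)}")
print(f"Median, other 20 games: {median(runs_without_outlier)}")
```

Removing one game changes the mean but leaves the median where it was, which is what it means for the median to be resistant and the mean not.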
Activity
Mean as a “balance point”
In this activity, you will investigate an important property of the mean.
1. Stack 5 pennies above the 6-inch mark on a ruler. Place a pencil under the ruler to make a “seesaw” on a desk or table. Move the pencil until the ruler balances. What is the relationship between the location of the pencil and the mean of the five data values: 6, 6, 6, 6, 6?
2. Move one penny off the stack to the 8-inch mark on your ruler. Now move one other penny so that the ruler balances again without moving the pencil. Where did you put the other penny? What is the mean of the five data values represented by the pennies now?
3. Move one more penny off the stack to the 2-inch mark on your ruler. Now move both remaining pennies from the 6-inch mark so that the ruler still balances with the pencil in the same location. Is the mean of the data values still 6?
4. Discuss with your classmates: Why is the mean called the “balance point” of a distribution?
[Photo: Ann Heath]

The activity gives a physical interpretation of the mean as the balance point of a distribution. For the data on runs scored in each game by the Lawrence High baseball team in a recent season, the dotplot balances at $\bar{x}$ = 4.14 runs.
[Dotplot: Runs scored]

Comparing the Mean and the Median
Which measure, the mean or the median, should we report as the center of a distribution? That depends on both the shape of the distribution and whether there are any outliers.
Shape: Figure 1.8 shows the mean and median for dotplots with three different shapes. Notice how these two measures of center compare in each case. The mean is pulled in the direction of the long tail in a skewed distribution.
[Figure 1.8 panels: (a) Quiz score, skewed to the left, Mean < Median (Median = 19, Mean = 17.97); (b) Capacity (ft³), roughly symmetric, Mean ≈ Median (Median = 15.8, Mean = 15.78); (c) Runs scored, skewed to the right, Mean > Median (Median = 3, Mean = 4.14)]
FIGURE 1.8 Dotplots that show the relationship between the mean and median in distributions with different shapes: (a) Scores on an easy statistics quiz, (b) usable capacity in a sample of 36 side-by-side refrigerators, and (c) runs scored by the Lawrence High School baseball team in 21 games played.
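The “balance point” idea can also be checked numerically: the distances of the values below the mean add up to the same total as the distances of the values above it. Here is a minimal sketch in Python, assuming the 21 run totals from the baseball example; `pull_left` and `pull_right` are illustrative names, not terms from the text.

```python
# Runs scored in the 21 games of the baseball example.
runs = [0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 10, 12]

x_bar = sum(runs) / len(runs)

# Signed distance of each value from the mean (negative = left of the mean).
deviations = [x - x_bar for x in runs]

# Total distance on each side of the mean, like weights on a seesaw.
pull_left = sum(-d for d in deviations if d < 0)
pull_right = sum(d for d in deviations if d > 0)

print(f"mean = {x_bar:.2f}")
print(f"total distance below the mean: {pull_left:.2f}")
print(f"total distance above the mean: {pull_right:.2f}")
```

The two totals agree up to floating-point rounding, so the dotplot would balance on a pencil placed at the mean.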
Outliers: We noted earlier that the mean is sensitive to extreme values. The left dotplot below shows the EPA estimates of city gas mileage in miles per gallon (mpg) for a sample of 21 model year 2014 midsize cars.[43] If we remove the clear outlier (the Toyota Prius, with its 51 mpg in the city), the mean falls to $\bar{x}$ = 21.1 mpg, but the median stays at 21 mpg. See the right dotplot below. The median is a resistant measure of center, but the mean is not.
[Dotplots: City gas mileage (mpg). With outlier: Median = 21, $\bar{x}$ = 22.52. Without outlier: Median = 21, $\bar{x}$ = 21.1.]
You can compare how the mean and median behave by using the Mean and Median applet at the book’s website, highschool.bfwpub.com/spa3e.

Choosing a Measure of Center
• If a distribution of quantitative data is roughly symmetric and has no outliers, use the mean to measure center.
• If the distribution is strongly skewed or has outliers, use the median to measure center.

EXAMPLE
How bad is the traffic?
Comparing the mean and median
PROBLEM: At the beginning of the lesson, we presented data on the travel times in minutes of 20 randomly chosen New York workers. Here is a dotplot of the data.
[Dotplot: Travel time (min). Median = 22.5, Mean = 31.25.]
(a) Explain why the mean is so much larger than the median.
(b) Which measure of center better describes a typical travel time to work? Explain.
SOLUTION:
(a) The mean is pulled toward the long tail in this right-skewed distribution. Also, the possible outlier of 85 minutes inflates the mean but does not affect the median as much.
(b) The median of 22.5 minutes better summarizes the center of the distribution because the travel-time distribution is skewed to the right and has a possible outlier.
The mean of 31.25 minutes does not reflect a typical travel time: only 7 of the 20 New Yorkers in the sample reported travel times this long or longer.
FOR PRACTICE TRY EXERCISE 9.
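The numbers quoted in the travel-time example can be reproduced from the raw data. The sketch below is in Python and assumes the 20 New York travel times as they are listed in the range example of Lesson 1.7; the variable names are illustrative.

```python
from statistics import mean, median

# Travel times (min) for the 20 New York workers, as listed in Lesson 1.7.
travel_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
                15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

center_mean = mean(travel_times)
center_median = median(travel_times)

# How many workers reported a time at least as long as the mean?
at_or_above_mean = sum(1 for t in travel_times if t >= center_mean)

print(f"mean   = {center_mean} min")
print(f"median = {center_median} min")
print(f"{at_or_above_mean} of {len(travel_times)} times are at or above the mean")
```

Counting how many values sit at or above the mean is a quick way to see whether the mean describes a “typical” value in a skewed distribution.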
LESSON APP 1.6
Is the pace of life slower in smaller cities?
Does it take less time to get to work in smaller cities? Here are the travel times in minutes for 15 workers in North Carolina, chosen at random by the Census Bureau, along with a dotplot of the data.[44]
30 20 10 40 25 20 10 60 15 40 5 30 12 10 10
[Dotplot: Travel time to work (min)]
[Photo: © Philip Scalia/Alamy Stock Photo]
1. Find the median. Interpret this value in context.
2. Calculate the mean travel time. Show your work.
3. Which measure of center, the median or the mean, better describes a typical travel time to work for this sample of workers in North Carolina? Justify your answer.

Lesson 1.6
What Did You Learn?
Learning Target (Examples; Exercises)
Find and interpret the median of a distribution of quantitative data. (p. 50; 1–4)
Calculate the mean of a distribution of quantitative data. (p. 52; 5–8)
Compare the mean and median of a distribution, and choose the more appropriate measure of center in a given setting. (p. 54; 9–12)

Exercises Lesson 1.6
Mastering Concepts and Skills
1. [pg 50] Quiz grades Joey’s first 13 quiz grades in a marking period are listed here. Find and interpret the median.
82 93 77 79 90 82 85 85 95 73 79 83 89
2. Big boys The roster of the Dallas Cowboys professional football team in a recent season included 7 defensive linemen. Their weights (in pounds) were 321, 285, 300, 285, 286, 293, and 298. Find and interpret the median.
3. [pg 50] Large fries Ryan and Brent were curious about the amount of french fries they would get in a large order from their favorite fast-food restaurant, Burger King. They went to several different Burger King restaurants over a series of days and ordered a total of 14 large fries. The weight of each order (in grams) is shown here. Find and interpret the median.
165 163 160 159 166 152 166 168 173 171 168 167 170 170
4. Carrots The weights (in grams) of 12 carrots in a single bag from a local grocery store are listed here. Find and interpret the median.
44 56 48 41 66 55 42 33 51 44 61 65
5. [pg 52] Skipped quiz Refer to Exercise 1.
(a) Calculate Joey’s mean quiz grade. Show your work.
(b) Joey has an unexcused absence for the 14th quiz, and he receives a score of zero. Recalculate the mean and median. Explain why the mean and median are so different now.
6. Big outlier Refer to Exercise 2.
(a) Calculate the mean weight of the 7 defensive linemen. Show your work.
(b) The defensive lineman that weighed 321 pounds may be an outlier. How did this player affect the mean? Justify your answer with an appropriate calculation.
7. Mean fries Refer to Exercise 3.
(a) Calculate the mean weight for the 14 orders of large fries. Show your work.
(b) Ryan and Brent noticed a shortage of fries in the order that weighed 152 grams. How did this order affect the mean? Justify your answer with an appropriate calculation.
8. One more carrot Refer to Exercise 4.
(a) Calculate the mean weight of the carrots. Show your work.
(b) The 13th carrot in the bag weighed 93 grams. Recalculate the mean and median. Explain why the mean and median are so different now.
9. [pg 54] Birthrates in Africa One of the important factors in determining population growth rates is the birthrate per 1000 individuals in a population. The dotplot shows the birthrates per 1000 individuals for 54 African nations:
[Dotplot: Birthrate (per 1000 population)]
(a) Explain how the mean and median would compare.
(b) Which measure of center better describes a typical birthrate? Explain.
10. Electing the president To become president of the United States, a candidate does not have to receive a majority of the popular vote. The candidate does have to win a majority of the 538 electoral votes that are cast in the Electoral College. Here is a stemplot of the number of electoral votes in 2016 for each of the 50 states and the District of Columbia.
0 | 3333333344444
0 | 55566666677788999
1 | 00001111234
1 | 5668
2 | 00
2 | 99
3 |
3 | 8
4 |
4 |
5 |
5 | 5
KEY: 1 | 5 is a state with 15 electoral votes.
(a) Explain how the mean and median would compare.
(b) Which measure of center better describes a typical number of electoral votes? Explain.
11. Smart kids The histogram displays the IQ scores of 60 randomly selected fifth-grade students from one school. Which measure of center is the more appropriate choice in this setting? Explain.
[Histogram: IQ (horizontal axis), Number of students (vertical axis)]
12. Lightning The histogram displays data from a study of lightning storms in Colorado.[45] It shows the distribution of how long after midnight (in hours) until the first lightning flash for that day occurred. Which measure of center is the more appropriate choice in this setting? Explain.
[Histogram: Time after midnight until first lightning flash (h) (horizontal axis), Count of first lightning flashes (vertical axis)]
Applying the Concepts
13. Do adolescent girls eat fruit? We all know that fruit is good for us. Following is a histogram of the number of servings of fruit per day claimed by 74 seventeen-year-old girls in a study in Pennsylvania.[46] Find the mean and median. Show your method clearly.
[Histogram: Servings of fruit per day (horizontal axis), Number of subjects (vertical axis)]
14. Shakespeare The following histogram shows the distribution of lengths of words used in Shakespeare’s plays.[47] Find the mean and median. Show your method clearly.
[Histogram: Number of letters in word (horizontal axis), Percent of words (vertical axis)]
15. How much for that house? The mean and median selling prices of existing single-family homes sold in the United States in May 2014 were $260,700 and $213,600.[48]
(a) Which of these numbers is the mean and which is the median? Explain your reasoning.
(b) Write a sentence to describe how an unethical politician could use these statistics to argue that May 2014 home prices were too high.
16. How mean is this salary? Last year a small accounting firm paid each of its five clerks $44,000, two junior accountants $100,000 each, and the firm’s owner $540,000. Write a sentence to describe how an unethical recruiter could use statistics to mislead prospective employees.
17. Baseball salaries, means and medians Suppose that a Major League Baseball team’s mean yearly salary for its players is $2.3 million and that the team has 25 players on its active roster.
(a) What is the team’s total annual payroll?
(b) If you knew only the median salary, would you be able to answer this question? Why or why not?
18. Mean or median? You are planning a party for 30 guests and want to know how many cans of soda to buy. Earl, the soda elf, offers to tell you either the mean number of cans guests will drink or the median number of cans. Which measure of center should you ask for? Why?
Extending the Concepts
Another measure of center for a quantitative data set is the trimmed mean. To calculate the trimmed mean, order the data set from lowest to highest, remove the same number of data values from each end, and calculate the mean of the remaining values. For a data set with 10 values, for example, we can calculate the 10% trimmed mean by removing the maximum and minimum value. Why? Because that’s one value trimmed from each “end” of the data set out of 10 values, and 1/10 = 0.10 or 10%.
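The trimmed-mean recipe described above can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the function `trimmed_mean`, its parameter `k` (the number of values removed from each end), and the 10-value data set `sample`, which is invented and does not come from the text.

```python
def trimmed_mean(values, k):
    """Mean of the values left after removing the k smallest and k largest."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value to average")
    ordered = sorted(values)              # order from lowest to highest
    kept = ordered[k:len(ordered) - k]    # drop k values from each end
    return sum(kept) / len(kept)


# Hypothetical data set with 10 values and one unusually large value.
sample = [2, 4, 5, 5, 6, 7, 7, 8, 9, 40]

ordinary_mean = sum(sample) / len(sample)
ten_percent_trimmed = trimmed_mean(sample, 1)   # 1 of 10 values per end = 10%

print(f"mean             = {ordinary_mean}")
print(f"10% trimmed mean = {ten_percent_trimmed}")
```

Passing the count `k` rather than a percentage keeps the sketch free of rounding questions; with 10 values, `k = 1` corresponds to the 10% trimmed mean.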
19. Shoes How many pairs of shoes does a typical teenage boy own? To find out, a group of statistics students surveyed a random sample of 20 male students from their large high school. Then they recorded the number of pairs of shoes that each boy owned. Here are the data, along with a dotplot.
14 7 6 5 12 38 8 7 10 10 10 11 4 5 22 7 5 10 35 7
[Dotplot: Shoes]
(a) Calculate the mean of the distribution.
(b) Calculate the 10% trimmed mean.
(c) Why is the trimmed mean a better summary of the center of this distribution than the mean?

Recycle and Review
20. File sizes (1.3) How much disk space does your music use? Here are the file sizes (in megabytes) for 18 randomly selected files on Gabriel’s mp3 player.
2.4 2.7 1.6 1.3 6.2 1.3 5.6 1.1 2.2
1.9 2.1 4.4 4.7 3.0 1.9 2.5 7.5 5.0
(a) Make a dotplot to display the data.
(b) Explain what the dot above the number 5 on your dotplot represents.
(c) What percent of the files are larger than 2 megabytes?
(d) Use the dotplot to describe the shape of the distribution of file sizes.
21. He shoots, he scores! (1.5) Lebron James and Kevin Durant were two of the most prolific scorers in the National Basketball Association in 2013–2014. The following histograms display the distribution of points per game for all the regular season games in which each of them played. Compare the distributions.
[Histograms: Points per game (horizontal axis), Percent (vertical axis), one for Kevin Durant and one for Lebron James]

Lesson 1.7
Measuring Variability
Learning Targets
• Find the range of a distribution of quantitative data.
• Find and interpret the interquartile range.
• Calculate and interpret the standard deviation.

Being able to describe the shape and center of a distribution is a great start. However, two distributions can have the same shape and center, but still look quite different.
The parallel dotplots show the scores of two bowlers (Earl and Kelly) in their 100 most recent games. Both distributions are symmetric and single-peaked, with centers around 150. But the variability of these two distributions is quite different. It appears that Kelly is a much more consistent bowler than Earl. Her distribution of scores is much less variable than his.
LESSON 1.7 • Measuring Variability 59
[Parallel dotplots: Bowling scores for Earl and Kelly]
There are several ways to measure the variability of a distribution. The three most common are the range, interquartile range, and standard deviation.

The Range
The simplest measure of variability is the range.

DEFINITION Range
The range of a distribution is the distance between the minimum value and the maximum value. That is,
range = maximum − minimum

Note that the range of a data set is a single number. In everyday language, people sometimes say things like, “The data values range from 5 to 85.” Be sure to use the term range correctly, now that you know its statistical definition.

EXAMPLE
New Yorkers rushing to work?
Finding the range
PROBLEM: How long do people typically spend traveling to work? Here are the travel times in minutes of 20 randomly chosen New York workers, along with a dotplot.[49] Find the range of the distribution.
10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45
[Dotplot: Travel time to work (min)]
SOLUTION:
Range = Maximum − Minimum = 85 − 5 = 80 min
FOR PRACTICE TRY EXERCISE 1.
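The range calculation in the example is a one-liner in code. The following is a small Python sketch using the 20 travel times from the example; `data_range` is an illustrative name. It also shows what happens when the largest value is dropped, a comparison the lesson makes when it discusses whether the range is resistant.

```python
def data_range(values):
    """Range as a single number: maximum minus minimum."""
    return max(values) - min(values)


# Travel times (min) for the 20 New York workers in the example.
travel_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
                15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

# Same data with the largest value (85 min) left out.
without_max = sorted(travel_times)[:-1]

print(f"range, all 20 values:      {data_range(travel_times)} min")
print(f"range, without the 85 min: {data_range(without_max)} min")
```

Because the range uses only the two most extreme values, removing a single observation can change it a great deal.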
The range is not a resistant measure of variability. It depends on only the maximum and minimum values, which may be outliers. Look again at the New York travel-time data. Without the possible outlier at 85 min, the range of the distribution would decrease to 65 − 5 = 60 min.
The graph illustrates another problem with the range as a measure of variability. The parallel dotplots show the lengths (in millimeters) of a sample of 11 nails produced by each of two machines.[50] Both distributions are centered at 70 millimeters (mm) and have a range of 72 − 68 = 4 mm. But the lengths of the nails made by Machine B clearly vary more from the center of 70 mm than the nails made by Machine A.
[Parallel dotplots: Length (mm) for Machine A and Machine B]

The Interquartile Range
We can avoid the impact of extreme values on our measure of variability by focusing on the middle of the distribution. Here’s the idea. Order the data values from smallest to largest. Then find the quartiles, the values that divide the distribution into four groups of roughly equal size. The first quartile Q1 lies one-quarter of the way up the list. The second quartile is the median, which is halfway up the list. The third quartile Q3 lies three-quarters of the way up the list. The first and third quartiles mark out the middle half of the distribution.
For example, here are the amounts collected each hour by a charity at a local store: $19, $26, $25, $37, $31, $28, $22, $22, $29, $34, $39, and $31. The dotplot below displays the data. Because there are 12 data values, the quartiles divide the distribution into 4 groups of 3 values.
[Dotplot: Amount collected, with the first quartile Q1, the second quartile (median), and the third quartile Q3 marked]

DEFINITION Quartiles, first quartile Q1, third quartile Q3
The quartiles of a distribution divide the ordered data set into four groups having roughly the same number of values. To find the quartiles, arrange the data values from smallest to largest and find the median.
The first quartile Q1 is the median of the data values that are to the left of the median in the ordered list.
The third quartile Q3 is the median of the data values that are to the right of the median in the ordered list.

The interquartile range (IQR) measures the variability in the middle half of the distribution.
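The quartile rule in the definition above translates directly into code. Below is a minimal Python sketch that applies it to the 12 hourly charity amounts; the function name `quartiles` is an assumption for illustration. Note that this follows the lesson's "median of each half" rule, which can give different answers from the quantile methods built into some software. The last line computes the interquartile range as Q3 − Q1.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the 'median of each half' rule.

    Needs at least two values. With an odd count, the median itself is
    left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # left of the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # right of the median
    return median(lower), median(ordered), median(upper)


# Amounts (in dollars) collected each hour by the charity.
amounts = [19, 26, 25, 37, 31, 28, 22, 22, 29, 34, 39, 31]

q1, med, q3 = quartiles(amounts)
iqr = q3 - q1

print(f"Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {iqr}")
```

With 12 values, each half holds 6 values, so each quartile is the average of two neighboring amounts, and the three cut points split the data into 4 groups of 3.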
DEFINITION Interquartile range (IQR)
The interquartile range (IQR) is the distance between the first and third quartiles of a distribution. In symbols,
IQR = Q3 − Q1

EXAMPLE
A healthy range to choose from?
Finding and interpreting the IQR
PROBLEM: Here again are data on the amount of fat (in grams) in 9 different McDonald’s fish and chicken sandwiches:
19 16 22 33 27 9 20 14 19
Find the interquartile range. Interpret this value in context.
SOLUTION:
Sort the data values from smallest to largest and find the median.
9 14 16 19 19 20 22 27 33 (Median = 19)
Find the first quartile Q1, which is the median of the data values to the left of the median in the ordered list.
9 14 16 19 (Q1 = 15)
Find the third quartile Q3, which is the median of the data values to the right of the median in the ordered list.
20 22 27 33 (Q3 = 24.5)
The interquartile range is
IQR = Q3 − Q1 = 24.5 − 15 = 9.5 g
Interpretation: The range of the middle half of fat content values for these McDonald’s fish and chicken sandwiches is 9.5 grams.
FOR PRACTICE TRY EXERCISE 5.

The quartiles and the interquartile range are resistant because they are not affected by a few extreme values. For the fat content data, Q3 would still be 24.5 and the IQR would still be 9.5 if the maximum were 53 rather than 33.

The Standard Deviation
When we use the median to measure the center of a distribution, the interquartile range (IQR) is our corresponding measure of variability. If we summarize the center of a distribution with the mean, then we should use the standard deviation to describe the variation of data values around the mean.

DEFINITION Standard deviation
The standard deviation measures the typical distance of the values in a distribution from the mean. To find the standard deviation $s_x$ of a quantitative data set with n values:
1. Find the mean of the distribution.
2. Calculate the deviation of each value from the mean: deviation = value − mean.
3. Square each of the deviations.
4. Add all the squared deviations, divide by n − 1, and take the square root.
If the values in a data set are given by $x_1, x_2, \ldots, x_n$, we can rewrite the formula for calculating the standard deviation as

$$s_x = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Actually, the notation $s_x$ refers to the standard deviation of a sample. Most of the time, the data we’ll encounter can be thought of as a sample from some larger population. When we need to refer to a population standard deviation, we’ll use the symbol σ (Greek lowercase sigma). This standard deviation is calculated by dividing the sum of squared deviations by n instead of n − 1 before taking the square root.

EXAMPLE
How many pets?
Calculating and interpreting standard deviation
PROBLEM: Nine children were asked how many pets they had. Here are their responses, arranged from lowest to highest, along with a dotplot of the data.[51]
1 3 4 4 4 5 7 8 9
[Dotplot: Number of pets]
Calculate the standard deviation. Interpret this value in context.
SOLUTION:
1. Find the mean of the distribution.
$\bar{x} = \frac{\sum x_i}{n} = \frac{1 + 3 + 4 + 4 + 4 + 5 + 7 + 8 + 9}{9} = \frac{45}{9} = 5$ pets
2. Calculate the deviation of each value from the mean: deviation = value − mean.
3. Square each of the deviations.

Value $x_i$    Deviation from mean $x_i - \bar{x}$    Squared deviation $(x_i - \bar{x})^2$
1              1 − 5 = −4                             (−4)² = 16
3              3 − 5 = −2                             (−2)² = 4
4              4 − 5 = −1                             (−1)² = 1
4              4 − 5 = −1                             (−1)² = 1
4              4 − 5 = −1                             (−1)² = 1
5              5 − 5 = 0                              0² = 0
7              7 − 5 = 2                              2² = 4
8              8 − 5 = 3                              3² = 9
9              9 − 5 = 4                              4² = 16
                                                      Sum = 52

4. Add all the squared deviations, divide by n − 1, and take the square root to return to the original units (pets).
$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{52}{9 - 1}} \approx 2.55$ pets
Interpretation: The number of pets these children have typically varies by about 2.55 pets from the mean of 5 pets.
FOR PRACTICE TRY EXERCISE 9.
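The four steps used in the pets example can be followed line by line in code. This is a sketch in Python under the assumption that the nine responses are exactly those listed in the example; the variable names are illustrative.

```python
from math import sqrt

# Number of pets reported by the nine children in the example.
pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]
n = len(pets)

# Step 1: find the mean.
x_bar = sum(pets) / n

# Step 2: deviation of each value from the mean.
deviations = [x - x_bar for x in pets]

# Step 3: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: add them up, divide by n - 1, and take the square root.
s_x = sqrt(sum(squared_deviations) / (n - 1))

print(f"mean = {x_bar}")
print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"s_x = {s_x:.2f} pets")
```

Printing `sum(deviations)` as well would show a total of 0, since the negative and positive deviations from the mean cancel; squaring them in step 3 is what keeps that from happening.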
Think About It: Why is the standard deviation calculated in such a complex way? Add up the deviations from the mean in the previous example. You should get a sum of 0. Why? Because the mean is the balance point of the distribution. We square the deviations to avoid the positive and negative deviations balancing each other out and adding to 0. It might seem strange to “average” the squared deviations by dividing by n − 1. We’ll explain the reason for doing this in Chapter 6. It’s easier to understand why we take the square root: to return to the original units (pets).

More important than the details of calculating $s_x$ are the properties of the standard deviation as a measure of variability:
• $s_x$ is always greater than or equal to 0. $s_x$ = 0 only when there is no variability, that is, when all values in a distribution are the same.
• Larger values of $s_x$ indicate greater variation from the mean of a distribution. For instance, Earl’s bowling scores have a standard deviation of about 40, while Kelly’s scores have a standard deviation of about 20.
[Parallel dotplots: Bowling scores for Earl and Kelly]
• $s_x$ is not resistant. The use of squared deviations makes $s_x$ even more sensitive than $\bar{x}$ to extreme values in a distribution. For example, the standard deviation of the travel times for the 20 New York workers is 21.88 min. If we omit the maximum value of 85 min, the standard deviation drops to 18.34 min.
[Dotplot: Travel time to work (min)]
• $s_x$ measures variation about the mean. It should be used only when the mean is chosen as the measure of center.

Choosing Measures of Center and Variability
The median and IQR are usually better than the mean and standard deviation for describing a skewed distribution or a distribution with outliers. Use $\bar{x}$ and $s_x$ for roughly symmetric distributions that don’t have outliers.
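Two of the properties listed for the standard deviation are easy to confirm numerically. The Python sketch below uses `statistics.stdev`, which computes the sample standard deviation (dividing by n − 1), together with the 20 New York travel times from the range example; the variable names are illustrative.

```python
from statistics import stdev

# Travel times (min) for the 20 New York workers.
travel_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
                15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

# Same data with the maximum value (85 min) omitted.
without_max = sorted(travel_times)[:-1]

s_all = stdev(travel_times)
s_without_max = stdev(without_max)

# A data set with no variability at all (five identical values).
s_identical = stdev([6, 6, 6, 6, 6])

print(f"s_x, all 20 values:      {s_all:.2f} min")
print(f"s_x, without the 85 min: {s_without_max:.2f} min")
print(f"s_x, identical values:   {s_identical}")
```

Dropping a single extreme value lowers the standard deviation by several minutes, while a data set of identical values has a standard deviation of 0.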
LESSON APP 1.7
Have we found the beef?
Here are data on the amount of fat (in grams) in 12 different McDonald’s beef sandwiches, along with a dotplot. The mean fat content for these sandwiches is $\bar{x}$ = 22.833 grams.
27 11 22 21 40 8 17 15 29 31 27 26
[Dotplot: Fat (g)]
[Photo: John E. Kelly/Getty Images]
1. Find the range of the distribution.
2. Find the interquartile range. Interpret this value in context.
3. Calculate the standard deviation. Interpret this value in context.
4. The dotplot suggests that the Bacon Clubhouse Burger, with its 40 g of fat, is a possible outlier. Recalculate the range, interquartile range, and standard deviation for the other 11 sandwiches. Compare these values with the ones you obtained in Questions 1 through 3. Explain why each result makes sense.

TECH CORNER
Computing Numerical Summaries with Technology
You can use an applet or a graphing calculator to calculate measures of center and variability. That will free you up to concentrate on choosing the right numerical summaries and interpreting your results. For the New York travel-time data (page 59), use either of the following.

Applet
1. Go to highschool.bfwpub.com/spa3e and launch the One Quantitative Variable applet.
2. Enter Travel time to work (in minutes) as the Variable name.
3. Select 1 as the number of groups and Raw data as the input method.
4. Input the data. Be sure to separate the data values with commas or spaces as you type them.
5. Click Begin analysis to display the summary statistics.

TI-83/84
1. Type the values into list L1.
2. Calculate numerical summaries using one-variable statistics.
• Press STAT (CALC) and choose 1-Var Stats.
OS 2.55 or later: In the dialog box, press 2nd 1 (L1) and ENTER to specify L1 as the List. Leave FreqList blank. Arrow down to Calculate and press ENTER.
Older OS: Press 2nd 1 (L1) and ENTER.
• Press the down-arrow key to see the rest of the one-variable statistics.
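For readers working in a programming language rather than with the applet or calculator, a comparable summary can be put together by hand. The Python sketch below is only an illustration: the function `one_var_stats` and its dictionary keys are invented names, and it is not how the applet or the TI-83/84 is implemented. It uses the lesson's "median of each half" rule for the quartiles and the New York travel-time data.

```python
from statistics import mean, median, stdev


def one_var_stats(values):
    """Summary statistics for one quantitative variable (needs 2+ values)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # left of the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # right of the median
    return {
        "n": n,
        "mean": mean(ordered),
        "sx": stdev(ordered),      # sample standard deviation (divides by n - 1)
        "min": ordered[0],
        "Q1": median(lower),
        "median": median(ordered),
        "Q3": median(upper),
        "max": ordered[-1],
    }


# Travel times (min) for the 20 New York workers.
travel_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
                15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

summary = one_var_stats(travel_times)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

Python's built-in `statistics.quantiles` uses different quartile conventions, so the halves are split explicitly here to stay with the method taught in the lesson.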
Lesson 1.7
W h a t D i d Yo u L e a r n ?
Learning Target
Find the range of a distribution of quantitative data.
Find and interpret the interquartile range.
Calculate and interpret the standard deviation.
Exercises Lesson 1.7
Mastering Concepts and Skills
1. Teens and shoes How many pairs of shoes does
typical teenage boy own? To find out, a group o
pg 59 statistics students surveyed a random sample of 2
male students from their large high school. The
they recorded the number of pairs of shoes tha
each boy owned. Here are the data, along with
dotplot. Find the range of the distribution.
14 7 6 5 12 38 8 7 10 10 10 11 4 5 22 7 5 10 35 7
d d d d dd
dd d
dd d
ddddd ddd
0 5 10 15 20 25 30 35 40
Shoes
2. Traveling Tarheels! Here are the travel times i
minutes for 15 workers in North Carolina, chose
at random by the Census Bureau, along with a dot
plot of the data. Find the range of the distribution
30 20 10 40 25 20 10 60 15 40 5 30 12 10 10
d d d
d d
ddd
d dd d d d d
0 5 10 15 20 25 30 35 40 45 50 55 60
Travel time to work (min)
Starnes_3e_CH01_002-093_v2.indd 65
L E S S O N 1.7 • Measuring Variability 65
Examples Exercises
p. 59 1–4
p. 61 5–8
p. 62 9–12
3. Heavy Cowboys The roster of the Dallas Cow-
boys professional football team in a recent season
a included 7 defensive linemen. Their weights (in
of pounds) were 321, 285, 300, 285, 286, 293, and
20
en 298. Find the range of the distribution.
at 4. Pizza and calories Here are data on the number of
a calories per serving for 16 brands of frozen cheese
pizza.52 Find the range of the distribution.
340 340 310 320 310 360 350 330
260 380 340 320 310 360 350 330
5. Shoes and teens Refer to Exercise 1. Find the inter-
pg 61 quartile range. Interpret this value in context.
6. Tarheels Refer to Exercise 2. Find the interquartile
in range. Interpret this value in context.
en 7. Cowboys Refer to Exercise 3. Find and interpret
t- the interquartile range.
n. 8. Frozen pizza Refer to Exercise 4. Find and interpret
the interquartile range.
9. Well rested? The first four students to arrive for a
first-period statistics class were asked how much
sleep (to the nearest hour) they got last night. Their
responses were 7, 7, 8, and 10. Calculate the
standard deviation. Interpret this value in context.
10. The rate of metabolism A person’s metabolic rate
is the rate at which the body consumes energy.
Metabolic rate is important in studies of weight
gain, dieting, and exercise. Here are the metabolic
rates of 7 men who took part in a study of dieting.
(The units are calories per 24 hours. These are the
same calories used to describe the energy content of
foods.) Calculate the standard deviation. Interpret
this value in context.
1792 1666 1362 1614 1460 1867 1439
11. Phosphate in blood The level of various substances
in the blood influences our health. Here are
measurements of the level of phosphate in the blood of
a patient, in milligrams of phosphate per deciliter
of blood, made on 6 consecutive visits to a clinic.
Calculate and interpret the standard deviation.
5.6 5.2 4.6 4.9 5.7 6.4
12. Foot lengths Here are the foot lengths (in centimeters)
for a random sample of seven 14-year-olds
from the United Kingdom. Calculate and interpret
the standard deviation.
25 22 20 25 24 24 28
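The three measures of variability practiced in Exercises 1–12 can be checked with a short script. What follows is a minimal Python sketch, not part of the original text. The `scores` list is made-up illustration data, and the function names are hypothetical. The sketch assumes that quartiles are the medians of the lower and upper halves of the ordered data and that the standard deviation uses the n − 1 divisor.

```python
from math import sqrt
from statistics import median


def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


def interquartile_range(values):
    """IQR = Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]            # values to the left of the median
    upper = ordered[n // 2 + n % 2 :]    # values to the right of the median
    return median(upper) - median(lower)


def sample_sd(values):
    """Standard deviation with the n - 1 divisor (assumed convention)."""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


scores = [2, 4, 4, 5, 7, 8]  # hypothetical data, not from the exercises
print(data_range(scores))           # 6
print(interquartile_range(scores))  # 3
print(round(sample_sd(scores), 2))  # 2.19
```

Swapping in the data from any of the exercises above gives a way to check hand calculations, provided the same quartile and divisor conventions are intended.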
Applying the Concepts
13. Varying fuel efficiency The dotplot shows the
difference (Highway – City) in EPA mileage ratings
for each of 21 model year 2014 midsize cars.
[Dotplot; horizontal axis: Difference (Highway – City), –4 to 12]
(a) Find the interquartile range of the distribution.
Interpret this value in context.
(b) Calculate and interpret the standard deviation.
(c) Which is the more appropriate measure of variability
for this distribution: the interquartile range or
the standard deviation? Justify your answer.
14. Another serving of carrots The dotplot shows
the weights (to the nearest gram) of 12 carrots
in a single bag from a local grocery store.
[Dotplot; horizontal axis: Weights (g), 30 to 70]
(a) Find the interquartile range of the distribution.
Interpret this value in context.
(b) Calculate and interpret the standard deviation.
(c) Which is the more appropriate measure of variability
for this distribution: the interquartile range or the
standard deviation? Justify your answer.
15. Comparing SD Which of the following distributions
has a larger standard deviation? Justify your
answer.
[Histograms: Variable A and Variable B; vertical axis: Frequency, 0 to 25; horizontal axis: 1 to 8]
16. Comparing SD The parallel dotplots show the
lengths (in millimeters) of a sample of 11 nails
produced by each of two machines.53 As mentioned on
page 60, both distributions have a range of 4 mm.
Which distribution has the larger standard deviation?
Justify your answer.
[Parallel dotplots: Machine A and Machine B; horizontal axis: Length (mm), 68 to 72]
17. Properties of the standard deviation
(a) Juan says that, if the standard deviation of a list is
zero, then all the numbers on the list are the same.
Is Juan correct? Explain your answer.
(b) Letitia alleges that, if the means and standard
deviations of two different lists of numbers are the
same, then all of the numbers in the two lists are
the same. Is Letitia correct? Explain your answer.
18. SD contest This is a standard deviation contest.
You must choose four numbers from the whole
numbers 0 to 10, with repeats allowed.
(a) Choose four numbers that have the smallest possible
standard deviation.
(b) Choose four numbers that have the largest possible
standard deviation.
(c) Is more than one choice possible in either part (a)
or (b)? Explain.
Extending the Concepts
19. Estimating SD The dotplot shows the number of shots
per game taken by NHL player Sidney Crosby in his
81 regular season games in a recent season.54 Is the
standard deviation of this distribution closest to 2, 5,
or 10? Explain.
[Dotplot; horizontal axis: Number of shots, 0 to 10]
20. Will Joey pass? Joey’s first 14 quiz grades in a
marking period had a mean of 85 and a standard
deviation of 8.
(a) Suppose Joey makes an 85 on the next quiz. Would
the standard deviation of his 15 quiz scores be
greater than, equal to, or less than 8? Justify your
answer.
(b) Suppose instead that Joey has an unexcused absence
and makes a 0 on the next quiz. Would
the standard deviation of his 15 quiz scores be
greater than, equal to, or less than 8? Justify your
answer.
Recycle and Review
21. Hurricanes (1.5, 1.6) The histogram shows the
distribution of the number of Atlantic hurricanes in
every year from 1851 through 2012.55
[Histogram; vertical axis: Frequency, 0 to 30; horizontal axis: Number of hurricanes, 2 to 16]
(a) Describe the shape of the distribution.
(b) Which would be a better measure of the typical
number of hurricanes per year: the mean or the
median? Justify your answer.
22. Salty nuggets (1.3, 1.4) The sodium content, in
milligrams per 3-oz serving, for 22 brands of breaded
chicken nuggets and tenders are given here.56
340 360 310 370 300 310 210 230 240 480 330
240 450 180 270 240 420 330 560 440 350 210
(a) Make a stemplot of these data using split stems.
(b) Make a dotplot of these data.
(c) Describe any features of the distribution that are
better illustrated by one graph than by the other.
Lesson 1.8
Summarizing Quantitative Data: Boxplots and Outliers
Learning Targets
Use the 1.5 × IQR rule to identify outliers.
Make and interpret boxplots of quantitative data.
Compare distributions of quantitative data with boxplots.
Barry Bonds set the major league record by hitting 73 home runs in a single season
in 2001. On August 7, 2007, Bonds hit his 756th career home run, which broke
Hank Aaron’s longstanding record of 755. By the end of the 2007 season when Bonds
retired, he had increased the total to 762. The dotplot shows the number of home runs
that Bonds hit in each of his 21 complete seasons:
[Dotplot; horizontal axis: Home runs, 0 to 80]
Bonds’s 73 home run season stands out (in red) from the rest of the distribution.
Should this value be classified as an outlier?
Identifying Outliers
Besides serving as a measure of variability, the interquartile range (IQR) is used as a
ruler for identifying outliers.
How to Identify Outliers: the 1.5 × IQR Rule
Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or more
than 1.5 × IQR below the first quartile. That is,
Low Outliers < Q1 – 1.5 × IQR
High Outliers > Q3 + 1.5 × IQR
EXAMPLE
Home run king?
Identifying outliers
PROBLEM: Here are data on the number of home runs that Bonds hit in each of his 21
complete seasons. Identify any outliers in the distribution. Show your work.
16 25 24 19 33 25 34 46 37 33 42 40 37 34 49 73 46 45 45 26 28
SOLUTION:
16 19 24 25 25 26 28 33 33 34 34 37 37 40 42 45 45 46 46 49 73
Q1 = 25.5   Median = 34   Q3 = 45
Find the interquartile range (IQR). Use the method of Lesson 1.7.
IQR = Q3 – Q1 = 45 – 25.5 = 19.5
Calculate the upper and lower cutoff values for outliers.
Low Outliers < Q1 – 1.5 × IQR = 25.5 – 1.5 × 19.5 = –3.75
High Outliers > Q3 + 1.5 × IQR = 45 + 1.5 × 19.5 = 74.25
Because there are no data values less than –3.75 or greater
than 74.25, this distribution has no outliers.
Barry Bonds’s record-setting year with 73 home runs is
not quite large enough to be classified as an outlier by
the 1.5 × IQR rule.
FOR PRACTICE TRY EXERCISE 1.
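The 1.5 × IQR rule in the example above can be expressed in a few lines of code. The following is a minimal Python sketch, not part of the original text. The function names `quartiles` and `outlier_cutoffs` are hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data, which reproduces the Q1 = 25.5 and Q3 = 45 found in the example.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3), taking quartiles as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]            # values to the left of the median
    upper = ordered[n // 2 + n % 2 :]    # values to the right of the median
    return median(lower), median(ordered), median(upper)


def outlier_cutoffs(values):
    """Return the (low, high) cutoffs from the 1.5 x IQR rule."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Home runs in each of Bonds's 21 complete seasons (data from the example).
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42,
         40, 37, 34, 49, 73, 46, 45, 45, 26, 28]

low, high = outlier_cutoffs(bonds)
outliers = [x for x in bonds if x < low or x > high]
print(low, high, outliers)  # -3.75 74.25 []
```

The empty list agrees with the conclusion of the example: 73 falls just below the upper cutoff of 74.25. Software that defines quartiles differently may give slightly different cutoffs.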
It is important to identify outliers in a distribution for several reasons:
1. They might be inaccurate data values. Maybe someone recorded a value as 10.1
instead of 101. Perhaps a measuring device broke down. Or maybe someone
gave a silly response, like the student in a class survey who claimed to study
30,000 minutes per night!
2. They can indicate a remarkable occurrence. For example, in a graph of career
golf earnings, Tiger Woods is likely to be an outlier.
3. They can heavily influence the values of some summary statistics, like the mean,
range, and standard deviation.
Making and Interpreting Boxplots
You can use a dotplot, stemplot, or histogram to display the distribution of a quantitative
variable. Another graphical option for quantitative data is a boxplot (sometimes
called a box-and-whisker plot). A boxplot summarizes a distribution by displaying
the location of 5 important values within the distribution, known as its five-number
summary.
D E F I N I T I O N Five-number summary, Boxplot
The five-number summary of a distribution of quantitative data consists of the minimum,
the first quartile Q1, the median, the third quartile Q3, and the maximum.
A boxplot is a visual representation of the five-number summary.
How to Make a Boxplot
1. Find the five-number summary for the distribution.
2. Draw and label the axis. Draw a horizontal axis and put the name of the quantitative
variable underneath.
3. Scale the axis. Look at the smallest and largest values in the data set. Start the horizontal
axis at a number equal to or below the smallest value and place tick marks at equal
intervals until you equal or exceed the largest value.
4. Draw a box that spans from the first quartile (Q1) to the third quartile (Q3).
5. Mark the median with a vertical line segment that’s the same height as the box.
6. Identify outliers using the 1.5 × IQR rule.
7. Draw whiskers: lines that extend from the ends of the box to the smallest and largest
data values that are not outliers. Mark any outliers with a special symbol such as an
asterisk (*).
The top dotplot in the following figure shows Barry Bonds’s home run data.
We have marked the first quartile, the median, and the third quartile with blue
lines. The process of testing for outliers with the 1.5 × IQR rule is shown in
red. Because there are no outliers, we draw the whiskers to the maximum and
minimum data values, as shown in the finished boxplot at the bottom of the
figure.
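The numerical part of the boxplot recipe (steps 1, 6, and 7) can be sketched in code; the drawing steps are left out. This is a minimal Python sketch under stated assumptions, not the book's own procedure. The function name `boxplot_summary` and the dictionary keys are hypothetical, and quartiles are assumed to be medians of the lower and upper halves of the ordered data.

```python
from statistics import median


def boxplot_summary(values):
    """Compute the numbers needed to draw a boxplot by hand."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])            # median of the lower half
    q3 = median(ordered[n // 2 + n % 2 :])    # median of the upper half
    iqr = q3 - q1
    low_cut, high_cut = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme values that are NOT outliers.
    inside = [x for x in ordered if low_cut <= x <= high_cut]
    return {
        "five_number": (ordered[0], q1, median(ordered), q3, ordered[-1]),
        "whiskers": (inside[0], inside[-1]),
        "outliers": [x for x in ordered if x < low_cut or x > high_cut],
    }


# Barry Bonds's home run data from the earlier example.
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42,
         40, 37, 34, 49, 73, 46, 45, 45, 26, 28]

summary = boxplot_summary(bonds)
print(summary["five_number"])  # (16, 25.5, 34, 45.0, 73)
print(summary["whiskers"])     # (16, 73)
print(summary["outliers"])     # []
```

Because the Bonds data have no outliers, the whiskers run all the way to the minimum and maximum, as the text describes.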
[Figure: dotplot of the home run data (horizontal axis: Home runs, 0 to 70) with Q1 = 25.5, Med = 34, and Q3 = 45 marked, IQR = 19.5, and the lower and upper cutoffs for outliers drawn 1.5 × IQR = 29.25 beyond the quartiles. Below it, the finished boxplot with Min = 16, Q1 = 25.5, Med = 34, Q3 = 45, and Max = 73.]
We see from the boxplot that the distance from the minimum to the median is
much smaller than the distance from the median to the maximum. That is, the right
side of the boxplot is more stretched out than the left side. So Barry Bonds’s home run
distribution is skewed to the right.
EXAMPLE
How big are the large fries?
Making and interpreting a boxplot
PROBLEM: Ryan and Brent were curious about the amount of french fries they would get in a large order from
their favorite fast-food restaurant, Burger King. They went to several different Burger King locations over a series of
days and ordered a total of 14 large fries. The weight of each order (in grams) is shown here.
165 163 160 159 166 152 166 168 173 171 168 167 170 170
(a) Make a boxplot to display the data.
(b) According to a nutrition website, Burger King’s large fries weigh 160 grams, on average. Ryan and Brent suspect
that their local Burger King restaurants may be skimping on fries. Does the graph in part (a) support their
suspicion? Explain.
SOLUTION:
(a) 1. Find the five-number summary.
152 159 160 163 165 166 166 167 168 168 170 170 171 173
Min = 152   Q1 = 163   Med = 166.5   Q3 = 170   Max = 173
2. Draw and label the axis.
3. Scale the axis.
[Figure: horizontal axis labeled Weight of large fries (g), scaled from 150 to 175]
4. Draw a box that spans from the first quartile (Q1) to the
third quartile (Q3).
5. Mark the median with a vertical line segment that’s
the same height as the box.
[Figure: box and median line drawn above the axis; horizontal axis: Weight of large fries (g), 150 to 175]
6. Identify outliers.
IQR = Q3 – Q1 = 170 – 163 = 7
Low Outliers < Q1 – 1.5 × IQR = 163 – 1.5 × 7 = 152.5
High Outliers > Q3 + 1.5 × IQR = 170 + 1.5 × 7 = 180.5
The order of large fries that weighed 152 grams is an outlier.
7. Draw whiskers: lines that extend from the ends of
the box to the smallest and largest data values that are
not outliers. Mark any outliers with an asterisk.
[Figure: finished boxplot; horizontal axis: Weight of large fries (g), 150 to 175]
(b) No. From the boxplot, Q1 = 163, so at least 75% of the large fries that Ryan and Brent bought from local
Burger King restaurants weighed 163 grams or more. Only the outlier (152 grams) and one other order
(159 grams) of large fries weighed less than 160 grams.
FOR PRACTICE TRY EXERCISE 5.
Boxplots provide a quick summary of the center and variability of a distribution.
The median is displayed as a vertical line in the central box, the interquartile range is
the length of the box, and the range is the length of the entire plot, including outliers.
Caution: Boxplots do not display each individual value in a distribution. And boxplots don’t
show gaps, clusters, or peaks. For instance, the dotplot below displays the duration,
in minutes, of 220 eruptions of the Old Faithful geyser. The distribution of eruption
durations is clearly bimodal (two-peaked). But a boxplot of the data hides this important
information about the shape of the distribution.
[Figure: two plots of the eruption data, each with horizontal axis Duration (min), 1.5 to 5.0; the first is the dotplot]
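A small numerical illustration of this caution: two data sets can share exactly the same five-number summary, and therefore the same boxplot, while having very different shapes. The following is a minimal Python sketch, not part of the original text. Both data sets are invented for this illustration only, the function name is hypothetical, and quartiles are assumed to be medians of the lower and upper halves of the ordered data.

```python
from collections import Counter
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[n // 2 + n % 2 :]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


# Hypothetical data sets, invented for this illustration only.
one_peak = [1, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5]
two_peaks = [1, 2, 2, 2, 2, 3, 4, 4, 4, 4, 5]

print(five_number_summary(one_peak))   # (1, 2, 3, 4, 5)
print(five_number_summary(two_peaks))  # (1, 2, 3, 4, 5)

# Counting how often each value occurs reveals what the boxplot cannot show.
print(sorted(Counter(one_peak).items()))   # [(1, 1), (2, 2), (3, 5), (4, 2), (5, 1)]
print(sorted(Counter(two_peaks).items()))  # [(1, 1), (2, 4), (3, 1), (4, 4), (5, 1)]
```

The first list has a single peak at 3, while the second has peaks at 2 and 4 with a dip between them, yet the boxplots would be identical. A dotplot or histogram is needed to see the difference.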
Comparing Distributions with Boxplots
Boxplots are especially effective for comparing the distribution of a quantitative
variable in two or more groups.
EXAMPLE
Who is doing the texting?
Comparing distributions with boxplots
PROBLEM: For their final project, a group of statistics students wanted to compare the texting habits of males and
females. They asked a random sample of students from their school to record the number of text messages sent
and received over a two-day period. Here are their data.
Males    127 44 28 83 0 6 78 6 5 213 73 20 214 28 11
Females  112 203 102 54 379 305 179 24 127 65 41 27 298 6 130 0
Parallel boxplots of the data and numerical summaries are shown here. Compare the texting distributions for
males and females.
         x̄       sx      Min   Q1   Med   Q3    Max   IQR
Male     62.4    71.4    0     6    28    83    214   77
Female   128.3   116.0   0     34   107   191   379   157
[Parallel boxplots: Males (two outliers marked with asterisks) and Females; horizontal axis: Number of text messages in 2-day period, 0 to 400]
SOLUTION:
Remember to compare shape, center, variability, and outliers!
Shape: Both distributions are strongly right-skewed.
Center: The females in the sample typically texted much
more over a two-day period (median = 107) than the males
did (median = 28). In fact, the median for the females is
above the third quartile for the males. This indicates that
at least 75% of the males texted less than the “typical”
(median) female.
Variability: There is much more variation in texting among
the females than the males. The IQR for females (157) is about
twice the IQR for males (77).
Outliers: There are two outliers in the male distribution:
students who reported 213 and 214 texts in two days. The
female distribution has no outliers.
Due to the strong skewness and outliers, use the median
and IQR instead of the mean and standard deviation
when comparing center and variability.
FOR PRACTICE TRY EXERCISE 9.
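The numerical summaries used in this comparison can be recomputed from the raw data. The following is a minimal Python sketch, not part of the original text. The function name `summarize` and its dictionary keys are hypothetical. The sketch assumes quartiles are medians of the lower and upper halves of the ordered data and that the standard deviation uses the n − 1 divisor (which is what `statistics.stdev` computes).

```python
from statistics import mean, median, stdev


def summarize(values):
    """Center, variability, and outliers for one group."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])            # median of the lower half
    q3 = median(ordered[n // 2 + n % 2 :])    # median of the upper half
    iqr = q3 - q1
    low_cut, high_cut = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": mean(ordered),
        "sd": stdev(ordered),                 # sample SD, n - 1 divisor
        "median": median(ordered),
        "iqr": iqr,
        "outliers": [x for x in ordered if x < low_cut or x > high_cut],
    }


# Texting data from the example.
males = [127, 44, 28, 83, 0, 6, 78, 6, 5, 213, 73, 20, 214, 28, 11]
females = [112, 203, 102, 54, 379, 305, 179, 24,
           127, 65, 41, 27, 298, 6, 130, 0]

for label, data in (("Male", males), ("Female", females)):
    s = summarize(data)
    print(label, s["median"], s["iqr"], s["outliers"],
          f"mean={s['mean']:.2f}", f"sd={s['sd']:.2f}")
```

Under these assumptions the medians (28 and 107), the IQRs (77 and 157), and the two male outliers (213 and 214) match the example, and the means and standard deviations agree with the summary table up to rounding.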
LESSON APP 1.8
Which is best at reducing stress?
If you are a dog lover, having your dog with you may reduce your stress
level. Does having a friend with you reduce stress? To examine the effect of
pets and friends in stressful situations, researchers recruited 45 women who
said they were dog lovers. Fifteen women were assigned at random to each
of three groups: to do a stressful task (1) alone, (2) with a good friend
present, or (3) with their dogs present. The stressful task was to count backward
by 13s or 17s. The woman’s average heart rate during the task was one
measure of the effect of stress. The following table shows the data.57
Alone    62.6 70.9 73.3 75.5 77.8 80.4 84.5 84.7
         84.9 87.2 87.4 87.8 90.0 91.8 99.0
Friend   76.9 80.3 81.6 83.4 87.0 88.0 89.8 91.4
         92.5 97.0 98.2 99.7 100.9 101.1 102.2
Pet      58.7 64.2 65.4 68.9 69.2 69.2 69.5 70.2
         70.1 72.3 76.0 79.7 85.0 86.4 97.5
1. Identify any outliers in the three groups. Show
your work.
2. Make parallel boxplots to compare the heart
rates of the women in the three groups.
3. Based on the data, does it appear that the presence
of a pet or friend reduces heart rate during
a stressful task? Justify your answer.
TECH CORNER Making Boxplots with Technology
You can use an applet or a graphing calculator
to make a boxplot. Let’s use technology to make
parallel boxplots of the male and female texting
data.
Applet
1. Go to highschool.bfwpub.com/spa3e and launch
the One Quantitative Variable applet.
2. Enter Number of texts as the Variable name.
3. Select 2 as the number of groups and Raw data
as the input method.
4. Name Group 1 “Male” and Group 2 “Female.”
Then enter the data for each group. Be sure to
separate the data values with spaces or commas
as you type them.
5. Click Begin analysis. Parallel dotplots of the
data should appear. Change the graph type to
boxplot.
TI-83/84
1. Enter the texting data for males in list L1 and for
females in list L2.
2. Set up two statistics plots: Plot1 to show a boxplot
of the male data and Plot2 to show a boxplot of
the female data. The setup for Plot1 is shown.
When you define Plot2, be sure to change L1 to L2.
Note: The calculator offers two types of boxplots: one
that shows outliers and one that doesn’t. We’ll
always use the type that identifies outliers.
3. Press ZOOM and select ZoomStat to display the
parallel boxplots. Then press TRACE to view
the five-number summary.
Lesson 1.8
What Did You Learn?
Learning Target                                              Examples   Exercises
Use the 1.5 × IQR rule to identify outliers.                 p. 68      1–4
Make and interpret boxplots of quantitative data.            p. 70      5–8
Compare distributions of quantitative data with boxplots.    p. 72      9–12
Exercises Lesson 1.8
Mastering Concepts and Skills
1. Outlier Cowboys The roster of the Dallas Cowboys
professional football team in a recent season
included 7 defensive linemen. Their weights (in
pounds) were 321, 285, 300, 285, 286, 293, and
298. Identify any outliers in the distribution. Show
your work.
2. Musical megabytes How much disk space
does your music use? Here are the file sizes (in
megabytes) for 18 randomly selected files on
Gabriel’s mp3 player. Identify any outliers in the
distribution.
2.4 2.7 1.6 1.3 6.2 1.3 5.6 1.1 2.2
1.9 2.1 4.4 4.7 3.0 1.9 2.5 7.5 5.0
3. Pizza with outliers The dotplot shows the number
of calories per serving for 16 brands of frozen
cheese pizza.58 Identify any outliers in the distribution.
Show your work.
[Dotplot; horizontal axis: Calories, 260 to 400]
4. Electoral College outliers To become president of
the United States, a candidate does not have to
receive a majority of the popular vote. The candidate
does, however, have to win a majority of the 538
electoral votes that are cast in the Electoral College.
Here is a stemplot of the number of electoral votes
in 2016 for each of the 50 states and the District of
Columbia. Identify any outliers in the distribution.
Show your work.
0 | 3333333344444
0 | 55566666677788999
1 | 00001111234
1 | 5668
2 | 00
2 | 99
3 |
3 | 8
4 |
4 |
5 |
5 | 5
KEY: 1 | 5 is a state with 15 electoral votes.
5. No need to call According to a study by Nielsen
Mobile, “Teenagers ages 13 to 17 are by far the
most prolific texters, sending or receiving 1742
messages a month.” Mr. Williams, a high school
statistics teacher, was skeptical about this claim.
So he collected data from his first-period statistics
class on the number of text messages and calls
they had sent or received in the past 24 hours.
Here are the data:
0 7 1 29 25 8 5 1 25 98 9 0 26
8 118 72 0 92 52 14 3 3 44 5 42
(a) Make a boxplot to display the data.
(b) Explain how the graph in part (a) gives evidence to
contradict the claim in the article.
6. Acing the first test Here are the scores of Mrs. Liao’s
students on their first statistics test:
93 93 87.5 91 94.5 72 96 95 93.5 93.5 73 82 45 88 80 86
85.5 87.5 81 78 86 89 92 91 98 85 82.5 88 94.5 43
(a) Make a boxplot to display the data.
(b) How did the students do on Mrs. Liao’s first test? Use
the graph from part (a) to help justify your answer.
7. Boxed Cowboys Refer to Exercise 1.
(a) Make a boxplot to display the data.
(b) Which measure of variability, the IQR or the standard
deviation, would you report for these data? Use the
graph from part (a) to help justify your choice.
8. Variable memory Refer to Exercise 2.
(a) Make a boxplot to display the data.
(b) Which measure of variability, the IQR or the standard
deviation, would you report for these data?
Use the graph from part (a) to help justify your
choice.
9. Fat sandwiches, skinny sandwiches The following
boxplots summarize data on the amount of fat (in
grams) in 12 McDonald’s beef sandwiches and 9
McDonald’s chicken or fish sandwiches. Compare
the distributions of fat content for the two types of
sandwiches.
[Parallel boxplots: Beef and Chicken or fish; horizontal axis: Fat (g), 0 to 40; an asterisk marks an outlier]
10. Get to work! The following boxplots summarize
data on the travel times to work for 20
randomly chosen New Yorkers and 15 randomly
chosen North Carolinians. Compare the distributions
of travel time for the workers in these
two states.
[Parallel boxplots: NC and NY; horizontal axis: Travel time to work (min), 0 to 80; an asterisk marks an outlier]
11. Energetic refrigerators In its May 2010 edition,
Consumer Reports magazine rated different types
of refrigerators, including those with bottom freezers,
those with top freezers, and those with side
freezers. One of the variables they measured was
annual energy cost (in dollars). The following boxplots
show the energy cost distributions for each of
these types.
[Parallel boxplots: Top, Side, and Bottom; horizontal axis: Energy cost, 40 to 160; asterisks mark outliers]
(a) What percentage of bottom freezers cost more than
$60 per year to operate? What about side freezers
and top freezers?
(b) Compare the energy cost distributions for the three
types of refrigerators.
12. Income in New England The following boxplots
show the total income of 40 randomly chosen
households each from Connecticut, Maine, and
Massachusetts, based on U.S. Census data from the
American Community Survey for 2012.
[Parallel boxplots: Maine, Massachusetts, and Connecticut; horizontal axis: Annual household income ($1000), 0 to 500; asterisks mark outliers]
(a) Approximately what percentage of households
in the Maine sample had annual incomes below
$50,000? What about households in Massachusetts
and Connecticut?
(b) Compare the distributions of annual incomes in the
three states.
Applying the Concepts
13. Text or talk? In a September 28, 2008, article titled,
“Letting Our Fingers Do the Talking,” the New
York Times reported that Americans now send
more text messages than they make phone calls.
Mr. Williams was curious about whether this claim
was valid for high school students. So he collected
data from his first-period statistics class on the
number of text messages and calls they had sent or
received in the past 24 hours. A boxplot of the
difference (Texts – Calls) in the number of texts and
calls for each student is shown here. Do these data
support the claim in the article about texting versus
calling? Justify your answer.
[Boxplot; horizontal axis: Difference (Texts – Calls), –20 to 120; asterisks mark outliers]
14. Alligator bites The Florida Fish and Wildlife
Conservation Commission keeps track of unprovoked
attacks on people by alligators, defining “major”
attacks as those requiring hospital treatment or
(rarely) resulting in death and “minor” attacks as
those requiring, at most, first aid. A local tourist
bureau claims that most attacks are minor. A boxplot
of the difference (Major – Minor) in reported
number of attacks for each year from 1971 through
2013 is given here.59 Do these data support the
tourist bureau’s claim? Justify your answer.
[Boxplot; horizontal axis: Major attacks – Minor attacks, –5 to 10; asterisks mark outliers]
15. SSHA scores Higher scores on the Survey of Study
Habits and Attitudes (SSHA) indicate good study
habits and attitudes toward learning. Here are
scores for 18 first-year college women.
154 109 137 115 152 140 154 178 101
103 126 126 137 165 165 129 200 148
And the scores for 20 first-year college men:
108 140 114 91 180 115 126 92 169 146
109 132 75 88 113 151 70 115 187 104
(a) Make parallel boxplots to compare the distributions.