Chapter 17: expected value and standard error for the sum of the draws
from a box
Context
When we do this 10,000 times...
Expected value and standard error
Expected value
    Expected value for sum of the draws, method 1
    Expected value for sum of the draws, method 2
    Formula for expected value of sum of the draws
Standard error
    Standard error for the sum of the draws
    Computing the SE for the sum of the draws
    Example
    Example (cont’d)
    Example (cont’d)
    Short-cut
Normal approximation
    Use normal approximation
    Example
    Example (cont’d)
    Example (cont’d)
Classifying and counting
    Replace tickets by 0s and 1s
Context
- We’ll look at the sum of the draws from a box
- Example:
  - Count the number of heads in 100 coin tosses
  - Maybe one time the number is 54, the next time it is 48, the third time it is 47. The observed value varies!
  - Observed value = expected value + chance error
  - See computer simulation, where I repeated this 10,000 times
When we do this 10,000 times...
[Figure: histogram of the number of heads in 100 coin tosses, repeated 10,000 times. Horizontal axis: nr of heads (ticks at 30 to 60); vertical axis: density (0.00 to 0.08).]
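The coin-tossing experiment can be repeated on a computer. The sketch below is one possible way to do it, not the simulation used for the slides: the seed and the helper name `heads_in_tosses` are assumptions made for illustration.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, chosen only so the run is repeatable


def heads_in_tosses(n_tosses):
    """Count heads in n_tosses fair coin tosses (1 = heads, 0 = tails)."""
    return sum(random.randint(0, 1) for _ in range(n_tosses))


# Repeat the experiment "100 coin tosses" 10,000 times.
counts = [heads_in_tosses(100) for _ in range(10_000)]

print("first three observed values:", counts[:3])
print("average of the observed values:", statistics.mean(counts))
print("SD of the observed values:", statistics.pstdev(counts))
```

The observed values differ from run to run, but their average should land close to 50.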
Expected value and standard error
- Note that the number of heads is a random variable, with a distribution!
- What are the center and spread of this distribution?
  - The center is called the expected value
  - The spread is called the standard error. The standard error gives the likely size of the chance error.
- We can use a similar model to analyze election polls, and will look into that later.
Expected value
Expected value for sum of the draws, method 1
- We look at the sum of 100 draws from a box with the tickets 0, 1, 1, 6
- Observed value = expected value + chance error
- What is the expected value of the sum of the draws?
- Method 1:
  - How many 0’s do we expect in our draws? About 25, since 1 ticket in 4 is a 0.
  - How many 1’s do we expect in our draws? About 50, since 2 tickets in 4 are 1’s.
  - How many 6’s do we expect in our draws? About 25, since 1 ticket in 4 is a 6.
  - So what do we expect for the sum of the draws? About
    (25 × 0) + (50 × 1) + (25 × 6) = 0 + 50 + 150 = 200
Expected value for sum of the draws, method 2
- Method 2:
  - The average of the box is: $\frac{0 + 1 + 1 + 6}{4} = \frac{8}{4} = 2$
  - So after each draw, we expect the sum of the draws to increase by about 2
  - So the sum of the draws is expected to be 100 × 2 = 200
  - General formula for the expected value for the sum of the draws, made at random with replacement:
    (number of draws) × (average of the box)
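Both methods can be checked with a few lines of Python. This is a minimal sketch for the box 0, 1, 1, 6 and 100 draws; the variable names are illustrative, and exact fractions are used only to avoid rounding.

```python
from collections import Counter
from fractions import Fraction

box = [0, 1, 1, 6]
n_draws = 100

# Method 1: expected number of times each ticket value comes up,
# multiplied by the value on the ticket.
expected_counts = {
    value: Fraction(n_draws * copies, len(box))
    for value, copies in Counter(box).items()
}
method_1 = sum(value * count for value, count in expected_counts.items())

# Method 2: (number of draws) x (average of the box).
box_average = Fraction(sum(box), len(box))
method_2 = n_draws * box_average

print("expected counts per ticket value:", expected_counts)
print("method 1:", method_1)
print("method 2:", method_2)
```

The two methods always agree, because method 1 is just method 2 with the terms grouped by ticket value.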
Formula for expected value of sum of the draws
- General formula for the expected value for the sum of the draws, made at random with replacement:
  (number of draws) × (average of the box)
- Does the formula make sense?
  - What happens if the number of draws is doubled? Then the expected value of the sum of the draws doubles.
  - What happens if the average of the box is doubled? Then the expected value of the sum of the draws doubles.
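The two sanity checks can also be tried by simulation. The sketch below is an illustration under assumptions: it reuses the box 0, 1, 1, 6 from the earlier slides, doubles the average by doubling every ticket, and the seed, the number of repetitions, and the helper names are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # arbitrary seed, only for repeatability


def sum_of_draws(box, n_draws):
    """Sum of n_draws tickets drawn at random with replacement."""
    return sum(random.choices(box, k=n_draws))


def average_sum(box, n_draws, repetitions=5_000):
    """Average of the observed sums over many repetitions."""
    return statistics.mean(sum_of_draws(box, n_draws) for _ in range(repetitions))


box = [0, 1, 1, 6]                            # average of the box is 2
doubled_box = [2 * ticket for ticket in box]  # average of the box is 4

avg_100 = average_sum(box, 100)               # formula: 100 x 2 = 200
avg_200 = average_sum(box, 200)               # formula: 200 x 2 = 400
avg_doubled = average_sum(doubled_box, 100)   # formula: 100 x 4 = 400

print(avg_100, avg_200, avg_doubled)
```

The simulated averages will not match the formula exactly, since each one still carries a small chance error.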
Standard error
Standard error for the sum of the draws
- We look at the sum of draws from a box
- Observed value = expected value + chance error
- How big is the chance error? The chance error is likely to be similar in size to the standard error (SE) for the sum of the draws
- If the SE for the sum of the draws is large, then we have large chance errors, and the observed values are widely spread around the expected value
- If the SE for the sum of the draws is small, then we have small chance errors, and the observed values are tightly clustered around the expected value
- Observed values are rarely more than 2 or 3 SEs away from the expected value.
Computing the SE for the sum of the draws
- SE for the sum of the draws = $\sqrt{\text{number of draws}} \times (\text{SD of the box})$
- This is called the square root law, because it involves the square root of the number of draws
- Does the formula make sense?
  - What happens if the number of draws is doubled? Then the SE of the sum of the draws is multiplied by a factor $\sqrt{2}$. This matches what we learned about the law of large numbers: the chance error grows, but only slowly.
  - What happens if we double the SD of the box? Then the SE of the sum of the draws doubles.
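The two checks on the square root law can be run numerically. In this sketch the function name `se_of_sum` and the inputs (100 draws, a box SD of 2) are assumptions picked for illustration only.

```python
import math


def se_of_sum(n_draws, box_sd):
    """Square root law: sqrt(number of draws) x (SD of the box)."""
    return math.sqrt(n_draws) * box_sd


base = se_of_sum(100, 2.0)        # starting point
more_draws = se_of_sum(200, 2.0)  # number of draws doubled
bigger_sd = se_of_sum(100, 4.0)   # SD of the box doubled

print("factor when the draws are doubled:", more_draws / base)
print("factor when the SD is doubled:", bigger_sd / base)
```

Doubling the number of draws multiplies the SE by about 1.41, while doubling the SD of the box multiplies it by exactly 2.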
Example
- We look at the sum of 25 draws from a box with tickets 0, 2, 3, 4, 6
- Fill in the blank. The sum of the draws is around ...(a), give or take ...(b) or so.
- (a) should be the expected value of the sum of the draws:
  (number of draws) × (average of the box) = $25 \times \frac{0 + 2 + 3 + 4 + 6}{5} = 25 \times 3 = 75$
- (b) should be the SE for the sum of the draws.
  This is given by the square root law: $\sqrt{\text{number of draws}} \times (\text{SD of the box})$
Example (cont’d)
- We need to compute the SE for the sum of the draws:
  $\sqrt{\text{number of draws}} \times (\text{SD of the box})$
- What is the SD of the box 0, 2, 3, 4, 6?
  - Step 1: compute the average of the box: 3 (see part a)
  - Step 2: compute the deviations from the average:
    -3, -1, 0, 1, 3
  - Step 3: compute the r.m.s. size of the deviations:
    $\sqrt{\frac{(-3)^2 + (-1)^2 + 0^2 + 1^2 + 3^2}{5}} = \sqrt{\frac{20}{5}} = \sqrt{4} = 2$
  - So the SD of the box is 2
- The SE for the sum of the draws is:
  $\sqrt{25} \times 2 = 5 \times 2 = 10$.
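The three steps for the SD of the box, followed by the square root law, translate directly into code. This is a minimal sketch with illustrative variable names; note that the SD divides by the number of tickets in the box, not by one less.

```python
import math

box = [0, 2, 3, 4, 6]
n_draws = 25

# Step 1: average of the box.
average = sum(box) / len(box)

# Step 2: deviations from the average.
deviations = [ticket - average for ticket in box]

# Step 3: r.m.s. size of the deviations (root of the mean of the squares).
box_sd = math.sqrt(sum(d ** 2 for d in deviations) / len(box))

expected_value = n_draws * average
standard_error = math.sqrt(n_draws) * box_sd

print("average:", average)
print("deviations:", deviations)
print("SD of the box:", box_sd)
print("expected value:", expected_value)
print("SE for the sum:", standard_error)
```

Python's `statistics.pstdev(box)` computes the same SD in one call, which is a convenient cross-check.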
Example (cont’d)
- We look at the sum of 25 draws from a box with tickets 0, 2, 3, 4, 6
- Fill in the blank. The sum of the draws is around ...(a), give or take ...(b) or so.
- (a) should be the expected value of the sum of the draws: 75
- (b) should be the SE for the sum of the draws: 10
- So the sum of the draws is around 75, give or take 10 or so.
Short-cut
- Suppose the box only contains two kinds of tickets: some tickets with a big number and some tickets with a small number. Then there is a shortcut to compute the SD of the box!
- SD of the box =
  $(\text{big number} - \text{small number}) \times \sqrt{(\text{fraction of big numbers}) \times (\text{fraction of small numbers})}$
- Example: box with tickets 7, 7, 7, -2, -2
  - Big number = 7. Fraction of big numbers = 3/5.
  - Small number = -2. Fraction of small numbers = 2/5.
  - SD of the box = $(7 - (-2)) \times \sqrt{(3/5) \times (2/5)} = 9 \times \sqrt{(3/5) \times (2/5)}$
  - Use a calculator to compute this: the SD is about 4.41
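The shortcut can be checked against the usual r.m.s. computation. In the sketch below, `two_value_sd` is a hypothetical helper name, and the comparison uses `statistics.pstdev`, which divides by the number of tickets.

```python
import math
import statistics


def two_value_sd(big, small, fraction_big):
    """Shortcut SD for a box that holds only two kinds of tickets."""
    fraction_small = 1 - fraction_big
    return (big - small) * math.sqrt(fraction_big * fraction_small)


box = [7, 7, 7, -2, -2]

shortcut = two_value_sd(big=7, small=-2, fraction_big=3 / 5)
direct = statistics.pstdev(box)  # r.m.s. deviation from the average

print("shortcut:", round(shortcut, 2))
print("direct:  ", round(direct, 2))
```

Both routes give the same SD, so the shortcut only saves arithmetic; it does not change the answer.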
Normal approximation
Use normal approximation
- If the number of draws is large, we can use the normal approximation to estimate chances.
- We should use a new average and new SD:
  - New average = expected value for the sum of the draws
  - New SD = SE for the sum of the draws
  - So the new standard units tell us how many SEs a number is away from the expected value
Example
- Consider the sum of 25 draws from the box with tickets 0, 2, 3, 4, 6.
- See computer simulation, where I repeated this 1000 times
Example (cont’d)
[Figure: histogram of the sum of the draws, when repeated 1000 times. Horizontal axis: sum of the draws (ticks at 40 to 110); vertical axis: density (0.00 to 0.04).]
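A simulation along these lines can be sketched in Python. It is not the code behind the slides: the seed is an arbitrary assumption, and the last line checks the earlier rule of thumb by counting how many sums fall within 2 SEs (that is, 20) of the expected value 75.

```python
import random
import statistics

random.seed(2)  # arbitrary seed, only for repeatability

box = [0, 2, 3, 4, 6]

# Sum of 25 draws with replacement, repeated 1000 times.
sums = [sum(random.choices(box, k=25)) for _ in range(1000)]

simulated_average = statistics.mean(sums)
simulated_sd = statistics.pstdev(sums)
within_2_se = sum(55 <= s <= 95 for s in sums) / len(sums)

print("average of the sums:", simulated_average)
print("SD of the sums:", simulated_sd)
print("fraction within 2 SEs of 75:", within_2_se)
```

The average and SD of the simulated sums should come out near the expected value 75 and the SE 10.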
Example (cont’d)
- About what percentage of observed values should be between 50 and 100?
- We use the normal approximation:
  - New average: expected value for the sum of the draws = 75
  - New SD: SE for the sum of the draws = 10
- Note that these numbers match the graph on the previous slide.
- Then use the normal approximation as before. See overhead
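The overhead calculation is not reproduced here, but the normal approximation can be sketched as follows: convert 50 and 100 to standard units using the expected value and SE, then take the area under the normal curve between them. The sketch makes no continuity correction, which is an assumption on my part.

```python
from statistics import NormalDist

expected_value = 75   # new average
standard_error = 10   # new SD

# Convert the endpoints to standard units: how many SEs from the expected value.
z_low = (50 - expected_value) / standard_error
z_high = (100 - expected_value) / standard_error

# Area under the standard normal curve between the two endpoints.
standard_normal = NormalDist()
chance = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print("standard units:", z_low, "to", z_high)
print(f"approximate percentage: {100 * chance:.1f}%")
```

The endpoints sit 2.5 SEs on either side of the expected value, so roughly 99% of observed values should fall between 50 and 100.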
Classifying and counting
Replace tickets by 0s and 1s
- See overhead for example
- Suppose you draw from a box, and want to count the number of times a certain ticket (or tickets) comes up
- Then:
  - put a 0 on the tickets that you don’t want to count
  - put a 1 on the tickets that you do want to count
- Using the new box:
  - The count is like the sum of the draws from the new box
  - We can compute the expected value and SE as before
  - We can also use the normal curve to approximate probabilities as before
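As an illustration (not the overhead example, which is not reproduced here), the coin-tossing count from the Context slide can be handled this way: label heads with 1 and tails with 0, then treat the number of heads as the sum of the draws from the new box. The names `original_box` and `wanted` are hypothetical.

```python
import math
import statistics

# Hypothetical illustration: count the heads in 100 tosses of a fair coin.
original_box = ["heads", "tails"]
wanted = {"heads"}  # the tickets we want to count
n_draws = 100

# Put a 1 on the tickets we count and a 0 on the others.
new_box = [1 if ticket in wanted else 0 for ticket in original_box]

# The count is the sum of the draws from the new box.
expected_count = n_draws * statistics.mean(new_box)
se_count = math.sqrt(n_draws) * statistics.pstdev(new_box)

print("new box:", new_box)
print("expected number of heads:", expected_count)
print("SE for the number of heads:", se_count)
```

So the number of heads in 100 tosses is around 50, give or take 5 or so.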