Topic 5 47
CORRELATION AND REGRESSION
LINEAR REGRESSION
❖ Regression analysis fits the best line to the observed data and allows us to make
predictions about one variable from the values of the other.
❖ One variable (the independent variable) is assumed to predict the other (the
dependent variable), so the results are not the same if we swap the variables.
❖ The values of the independent variable may be selected.
❖ There are other assumptions and requirements of a regression analysis. (The
relationship is approximately linear; the residuals have to be normally distributed etc.)
❖ Regression analysis is best carried out under the guidance of a statistician.
LINEAR REGRESSION EQUATION: LEAST-SQUARES METHOD
Linear regression equation:
$y = a + bx$
$a = \dfrac{\sum y}{n} - b\,\dfrac{\sum x}{n}$
$b = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$
Where:
a = the y-intercept
b = the slope
n = number of data points
x = independent variable
y = dependent variable
Example:
MAF Company has collected data on its annual sales and its annual advertising expenses.
The results are given in the table below. Predict the sales if advertising expenses are RM20,000.
Advertising expenses, x (RM’000) 3 5 8 9 11 14 15
Sales, y (RM million) 9 13 18 21 23 29 33
Solution:
Advertising expenses, x   Sales, y    x²          xy
3                         9           9           27
5                         13          25          65
8                         18          64          144
9                         21          81          189
11                        23          121         253
14                        29          196         406
15                        33          225         495
∑x = 65                   ∑y = 146    ∑x² = 721   ∑xy = 1579
$b = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$
$b = \dfrac{7(1579) - (65)(146)}{7(721) - (65)^2}$
$b = \dfrac{11053 - 9490}{5047 - 4225}$
$b = \dfrac{1563}{822}$
b = 1.90
$a = \dfrac{\sum y}{n} - b\,\dfrac{\sum x}{n}$
$a = \dfrac{146}{7} - 1.9\left(\dfrac{65}{7}\right)$
a = 20.86 - 1.9 (9.29)
a = 20.86 - 17.651
a = 3.2
Linear equation, y = 3.2 + 1.9x
Predict the sales if advertising expenses are RM20,000. Since x is measured in RM’000,
substitute x = 20.
y = 3.2 + 1.9x
y = 3.2 + 1.9 (20)
y = 3.2 + 38
y = 41.2
** Sales are estimated at RM41.2 million if the advertising expenses
are RM20,000.
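The least-squares calculation above can be checked with a short script. This is a minimal sketch, assuming only the data from the example; the function name `least_squares` and the variable names are illustrative and not part of the module.

```python
# Least-squares fit for the advertising/sales data from the example.
x = [3, 5, 8, 9, 11, 14, 15]      # advertising expenses (RM'000)
y = [9, 13, 18, 21, 23, 29, 33]   # sales (RM million)


def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x using the sums formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi * xi for xi in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = least_squares(x, y)
predicted_sales = a + b * 20      # x = 20 corresponds to RM20,000
print(round(a, 2), round(b, 2), round(predicted_sales, 1))
```

Because the script keeps full precision for b instead of rounding it to 1.9 first, the intermediate values differ slightly from the hand calculation, but the rounded results agree.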
Learning Outcomes:
At the end of this topic, you should be able to
1. Ascertain sample spaces and probability
2. Apply addition rules for probability
3. Apply multiplication rules for probability
Topic 6
ELEMENTARY OF PROBABILITY CONCEPTS
Probability can be defined as the chance of an event occurring. Probabilities are used
not only in games but also in insurance, investments, weather forecasting
and various other areas.
The concepts of probability include probability experiments, sample spaces, the addition and
multiplication rules, and the probabilities of complementary events.
PROBABILITY CONCEPT
A probability experiment is a chance process that leads to well-defined results called
outcomes.
An outcome is the result of a single trial of a probability experiment.
A sample space is the set of all possible outcomes of a probability experiment.
An event consists of a set of outcomes of a probability experiment.
TREE DIAGRAM
Tree diagrams are useful for organizing and visualizing the different possible outcomes. A tree
diagram is a simple way of representing a sequence of events. Tree diagrams record all possible
outcomes in a clear and uncomplicated manner.
The first event is represented by a dot. From the dot, branches are drawn to represent all
possible outcomes of the event. The probability of each outcome is written on its branch.
Example 1:
If it rains on a given day, the probability that it rains the next day is 1/3. If it does not rain
on a given day, the probability that it rains the next day is 1/6. The probability that it will rain
tomorrow is 1/5. What is the probability that it will rain the day after tomorrow? Draw a tree
diagram of all the possibilities to determine the answer.
Probability that it will rain the day after tomorrow:
P (Rain after tomorrow) = P (R,R) + P (R’,R)
= (1/5)(1/3) + (4/5)(1/6)
= 1/15 + 2/15 = 3/15
= 0.2
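The tree-diagram calculation can be reproduced with exact fractions: multiply along each branch, then add the branches that end in rain. This is a small sketch using the probabilities from Example 1; the variable names are illustrative.

```python
from fractions import Fraction

# Branch probabilities taken from Example 1.
p_rain_tomorrow = Fraction(1, 5)
p_rain_after_rain = Fraction(1, 3)   # P(rain next day | rain)
p_rain_after_dry = Fraction(1, 6)    # P(rain next day | no rain)

# Multiply along each branch of the tree.
p_rain_rain = p_rain_tomorrow * p_rain_after_rain        # branch (R, R)
p_dry_rain = (1 - p_rain_tomorrow) * p_rain_after_dry    # branch (R', R)

# Add the branches that end with rain on the day after tomorrow.
p_rain_day_after = p_rain_rain + p_dry_rain
print(p_rain_rain, p_dry_rain, p_rain_day_after)
```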
Exercise 1:
A student feels that there is an 80% chance of getting an A in Mathematics but only a 50%
chance of getting an A in Statistics. However, the student feels that taking the Statistics
course is twice as important as taking the Mathematics course. Draw a tree
diagram and use it to show the probability that the student gets an A in both courses.
Exercise 2:
A bottle contains 12 identical pills, of which 4 are Panadol. The other 8 pills are sugar placebos
with no medical value. Remy selects a pill at random and swallows it. If the pill that Remy
takes is a Panadol, the probability of relief from headache is 0.9. If the sugar placebo is taken,
the probability of relief is 0.2. Find the probability that Remy is relieved of the headache.
Draw a tree diagram to show the probability.
Exercise 3:
A bag contains 3 black balls and 5 white balls. Paul picks a ball at random from the bag and
returns it to the bag. He mixes the balls in the bag and then picks another ball at
random from the bag.
a) Construct a probability tree of the problem.
b) Calculate the probability that Paul picks:
i) Two black balls
ii) A black ball in his second draw
Exercise 4:
Bag A contains 10 marbles of which 2 are red and 8 are black. Bag B contains 12 marbles of
which 4 are red and 8 are black. A marble is drawn at random from each bag.
a) Draw a probability tree diagram to show all the outcomes of the experiment.
b) Find the probability that:
i. both are red.
ii. both are black.
iii. one black and one red.
iv. at least one red.
Exercise 5:
A set of 12 painted spots contains four blue, two white and five red spots.
One spot is taken and its colour noted, it is returned to the set, and then a second spot is
taken. Use a tree diagram to find the probabilities of the possible outcomes.
VENN DIAGRAM: Union, intersection and subset.
A Venn diagram is a graphical tool used to sort a sample space into meaningful groups, which
may assist in calculating the probability of a specific outcome.
Suppose A and B are sets.
1. The union of two events A and B, denoted by A ∪ B and read “A or B”, is the event
consisting of all elements that are either in A, in B, or in both.
2. A is a subset of B if all elements of A belong to B. This is denoted by A ⊂ B.
3. The intersection of sets A and B is the set containing only the elements that are found in
both set A and set B. We write A ∩ B for this set.
Example 1:
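In a group of 20 students, 10 like volleyball (V), 6 like basketball (B) and 2 like both. Draw a
Venn diagram.
Of the 10 students who like volleyball, 10 - 2 = 8 like volleyball only; of the 6 who like
basketball, 6 - 2 = 4 like basketball only.
Find the probability of:
a. P (like V)
= 10/20 = 1/2
b. P (like V or B)
= P(V only) + P(B only) + P(V ∩ B)
= 8/20 + 4/20 + 2/20
= 14/20 = 7/10
c. P (like V and B)
= P(V ∩ B)
= 2/20 = 1/10
d. P (don’t like both V and B)
= 1 - P(V ∩ B)
= 1 - 2/20
= 18/20 = 9/10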
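The regions of the Venn diagram can be computed from the three counts given in the example. The sketch below, with illustrative variable names, also checks that adding the three regions gives the same result as P(V) + P(B) - P(V ∩ B).

```python
from fractions import Fraction

total = 20       # students in the group
like_v = 10      # like volleyball
like_b = 6       # like basketball
like_both = 2    # like both sports

# Regions of the Venn diagram.
only_v = like_v - like_both
only_b = like_b - like_both
neither = total - (only_v + only_b + like_both)

# P(V or B): add the three regions inside the circles.
p_v_or_b = Fraction(only_v + only_b + like_both, total)

# The same probability via P(V) + P(B) - P(V and B).
p_v_or_b_alt = (Fraction(like_v, total) + Fraction(like_b, total)
                - Fraction(like_both, total))

# Complement of liking both sports.
p_not_both = 1 - Fraction(like_both, total)
print(only_v, only_b, neither, p_v_or_b, p_not_both)
```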
Exercise 1:
In a class of 50 students, 30 take Mathematics, 23 take Statistics and 12 take both. Draw a
Venn diagram. What is the probability that a randomly selected student:
a. Takes Statistics
b. Takes both Mathematics and Statistics
c. Does not take Mathematics
d. Takes Statistics but not Mathematics
e. Takes Mathematics, given that the student takes Statistics
f. Takes Mathematics or Statistics
g. Takes neither Mathematics nor Statistics
h. Takes Statistics or Mathematics but not both
PROBABILITY OF EVENTS USING TWO-WAY TABLE
A two-way table of counts (or frequencies) organizes data about two categorical variables
taken from the same individuals or subjects. Suppose we conduct a survey and ask each
person two questions; we then have two pieces of data from each person. Whenever we
have two pieces of data from each person, we can organize the data into a two-way table.
Two-Way Table Terminology
• Joint Frequency – each entry in the body of the table
• Marginal Frequency – the sum of the joint frequencies in a row or column (the row
and column totals)
          Netball   Volleyball   Basketball   Total
Female    15        20           13           48
Male      16        13           23           52
Total     31        33           36           100
The entries in the Female and Male rows under the three sports are joint frequencies; the
entries in the Total row and Total column are marginal frequencies.
Basic Probability Notation
P(A) = probability of event A
• Marginal Frequency of event A/Total Frequency
• Example: P(Female) = 48/100
P(A∩B) = probability of event A and event B (happening at the same time)
• Joint Frequency of events A and B/Total Frequency
• Example: P(Female∩Netball) = 15/100
P(A∪B) = probability of event A or event B (or both)
• Marginal Relative Frequency of event A + Marginal Relative Frequency of event
B – Joint Relative Frequency of A and B
• Example: P(Female∪Netball) = (48/100)+(31/100)-(15/100) = 64/100
P(A|B) = probability of event A given that event B has already occurred
• Joint Frequency of events A and B/Marginal Frequency of event B
• Example: P(Female|Netball) = 15/31
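The four kinds of probability above can all be read off the sports table programmatically. This is a minimal sketch, assuming the table is stored as a nested dictionary; the helper names `row_total` and `column_total` are illustrative.

```python
from fractions import Fraction

# Joint frequencies from the sports table (rows: gender, columns: sport).
table = {
    "Female": {"Netball": 15, "Volleyball": 20, "Basketball": 13},
    "Male": {"Netball": 16, "Volleyball": 13, "Basketball": 23},
}

grand_total = sum(sum(row.values()) for row in table.values())


def row_total(gender):
    """Marginal frequency of a row."""
    return sum(table[gender].values())


def column_total(sport):
    """Marginal frequency of a column."""
    return sum(row[sport] for row in table.values())


p_female = Fraction(row_total("Female"), grand_total)
p_netball = Fraction(column_total("Netball"), grand_total)
p_female_and_netball = Fraction(table["Female"]["Netball"], grand_total)
p_female_or_netball = p_female + p_netball - p_female_and_netball
p_female_given_netball = Fraction(table["Female"]["Netball"],
                                  column_total("Netball"))
print(p_female, p_female_and_netball, p_female_or_netball,
      p_female_given_netball)
```

`Fraction` reduces results automatically, so 48/100 is printed as 12/25 and 64/100 as 16/25.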
Example 1:
A high school has a population of 1000 students. Use the information given below to fill in
each cell of a two-way table that represents all the students in the high school.
a. The probability that a randomly selected student at this high school has asthma is
0.193.
b. The probability that a randomly selected student has at least one household member
who smokes is 0.421.
c. In addition to the previously given probabilities, the probability that a randomly
selected student has at least one household member who smokes and has asthma is
0.120.
Based on the completed two-way table, estimate the following probabilities.
a. A randomly selected student has asthma. What is the probability this student has at
least one household member who smokes? Ans: (120/193)
b. A randomly selected student does not have asthma. What is the probability this
student has at least one household member who smokes? Ans: (301/807)
c. A randomly selected student has at least one household member who smokes.
What is the probability this student has asthma? Ans: (120/421)
Example 2:
The school students in a town were surveyed and classified according to gender and
response to the question “how do you usually get to school”? The data is
summarized in the two-way table below.
          Bus   Car   TOTAL
Male      30    120   150
Female    25    170   195
TOTAL     55    290   345
a. Find the probability that the student chose to go to school by bus.
P(Bus) = 55/345 ≈ 0.159
b. Find the probability that the student was female.
P(Female) = 195/345 ≈ 0.565
c. Find the probability that the student was male, given that the student chose to go to
school by car.
P(M │C) = P(M and C) / P(C)
= (120/345) ÷ (290/345)
= 120/290 ≈ 0.414
d. Find the probability that the student chose to go to school by car, given that the student
was male.
P(C │M) = P(C and M) / P(M)
= (120/345) ÷ (150/345)
= 120/150 = 0.8
Exercise 1:
The school board of Waldo, a rural town in the Midwest, is considering building a new high
school primarily funded by local taxes. They decided to interview eligible voters to
determine if the school board should build a new high school facility to replace the current
high school building. There is only one high school in the town. Every registered voter in
Waldo was interviewed. The data from these interviews are summarized below.
Yes No No Answer Total
Male 119 6 241
Female
Total 230 12 515
a) Complete the following two-way frequency table.
b) If a randomly selected eligible voter is female, what is the probability she will vote to
build a new high school?
c) If a randomly selected eligible voter is male, what is the probability he will vote to
build a new high school?
Exercise 2:
In a class of 19 students, 8 of whom are boys, the students are asked whether or not they
take a school lunch. 4 girls take a school lunch and 5 boys do not take a school lunch. Draw up
a two-way table to illustrate the results. A student is chosen at random; calculate the
following probabilities.
         Lunch   No lunch   TOTAL
Boy
Girl
TOTAL                       19
a. A boy and takes a school lunch.
b. A girl and does not take a school lunch.
c. Given that a boy is chosen, he takes a lunch.
d. Given a girl is chosen, she does not take a lunch.
Exercise 3:
A recent trial tested the effectiveness of a drug in curing a disease on a sample of 10
men and 15 women. The following results were recorded: 3 women still had the disease
after taking the drug, and 2 men still had the disease after taking the
drug. Draw up a two-way table to illustrate these results. A person from the sample is chosen
at random; calculate the following probabilities.
         Disease   No disease   TOTAL
Men      2         8            10
Women    3         12           15
TOTAL    5         20           25
a. A male and cured by the drug.
b. A female and cured by the drug.
c. Given that a male was chosen, they were cured by the drug.
PROBABILITY OF EVENTS USING CLASSICAL FORMULA
Classical probability measures the likelihood of an event under the assumption that every
outcome of the experiment is equally likely to happen. For example, if we flip a fair coin,
there is an equal chance of it landing on "heads" or "tails". The same goes for a fair die,
which is equally likely to land on any of its 6 numbers: 1, 2, 3, 4, 5, or 6.
Notes:
S is the sample space, which is the set of all possible outcomes of the experiment.
o n(S) is the number of possible outcomes.
A is an event that can happen.
o n(A) is the number of outcomes in which the event happens.
P(A) is the probability of event A. P(A) = n(A) / n(S)
o P(A) = 0 means that an event will never happen.
o Ex: when a single die is rolled, the probability of getting a 9 is 0.
o P(A) = 1 means that an event is certain to happen.
o Ex: when a single die is rolled, the probability of getting a number less than 7 is 1.
The probability of an event always lies between 0 and 1, inclusive.
Example 1:
Find the probability of getting exactly two heads in three tosses of a fair coin.
S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}
n(S) = 8
A (two heads) = {HHT, HTH, THH}
n(A) = 3
P(two heads) = n(A) / n(S) = 3/8
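The classical formula P(A) = n(A) / n(S) can be applied by listing the sample space and counting. The sketch below enumerates the three coin tosses with `itertools.product`; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: every sequence of H/T for three tosses of a fair coin.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Event A: exactly two heads.
event_a = [outcome for outcome in sample_space if outcome.count("H") == 2]

# Classical probability: n(A) / n(S), valid because outcomes are equally likely.
p_two_heads = Fraction(len(event_a), len(sample_space))
print(len(sample_space), sorted(event_a), p_two_heads)
```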
ASCERTAIN THE COMPLEMENTARY EVENTS
Two events are said to be complementary when one event occurs if and only if the other
does not. The probabilities of two complementary events add up to 1.
For example, rolling a 5 or greater and rolling a 4 or less on a die are complementary events,
because a roll is 5 or greater if and only if it is not 4 or less. The probability of rolling a 5 or
greater is 2/6 = 1/3 , and the probability of rolling a 4 or less is 4/6 = 2/3 . Thus, the total of
their probabilities is 1/3 + 2/3 = 3/3 = 1.
An event is usually denoted by A and its complement is often denoted by A’. For example, if
our event A is “it rains today,” then the complement, A’, is the event “it doesn’t rain today”.
Event A and its complement A’ are mutually exclusive because the two events cannot occur
at the same time.
Example 1:
If the probability of an event is 3/8, what is the probability of its complement?
The probability of the complement is
= 1 – 3/8
= 8/8 – 3/8
= 5/8
Example 2:
If the probability that a person lives in an industrialized country of the world is 1/5, find the
probability that a person does not live in an industrialized country.
P(A’) = 1 – 1/5
= 4/5
Exercise 1:
Ben Sofia Roy Elle Zack Mia
a. Find P(Ben is not chosen)
b. Find P(not a girl)
c. Find P(the 1st letter is not M)
Exercise 2:
0.2 0.5 0.3
a. Find P(not an apple)
b. Find P(not a fruit)
c. Find P(the 1st letter is not O)
ADDITION RULE
Mutually exclusive events vs non-mutually exclusive events.
Two events are mutually exclusive if they cannot occur at the same time. Another word that
means mutually exclusive is disjoint. If two events are mutually exclusive, then the
probability of either occurring is the sum of the probabilities of each occurring.
In events which are not mutually exclusive, there is some overlap. When P(A) and P(B)
are added, the probability of the intersection (and) is added twice. To compensate for
that double addition, the intersection needs to be subtracted.
Addition Rule 1
Addition Rule 1: When two events, A and B, are mutually exclusive, the probability that A or
B will occur is the sum of the probability of each event.
P(A or B) = P(A) + P(B)
Example 1:
A box contains 3 glazed doughnuts, 4 jelly doughnuts and 5 chocolate doughnuts. If a person
selects one doughnut at random, find the probability that it is either a glazed doughnut or a
chocolate doughnut.
Solution:
The box contains 3 glazed doughnuts and 5 chocolate doughnuts out of a total of 12
doughnuts, and the two events are mutually exclusive, so
P (glazed or chocolate) = P (glazed) + P (chocolate)
= 3/12 + 5/12
= 8/12
= 2/3
Example 2:
A day of the week is selected at random. Find the probability that it is a weekend day.
Solution:
P (Saturday or Sunday) = P (Saturday) + P (Sunday)
= 1/7 + 1/7
= 2/7
Addition Rule 2
Addition Rule 2: If A and B are not mutually exclusive, then
P(A or B) = P(A) + P(B) – P(A and B)
Example 1:
In a hospital unit there are 8 nurses and 5 physicians; 7 nurses and 3 physicians are
female. If a staff member is selected at random, find the probability that the subject is a
nurse or a male.
Solution:
The sample space is shown here.
Staff        Female   Male   Total
Nurses       7        1      8
Physicians   3        2      5
Total        10       3      13
The probability is,
P (nurse or male) = P (nurse) + P (male) – P (male nurse)
= 8/13 + 3/13 – 1/13
= 10/13
Example 2:
One hundred students were surveyed about their preference between rabbits and cats. The
following two-way table displays data for the sample of students who responded to the
survey. Find the probability that a randomly selected student prefers rabbits or is female.
Preferences       Male   Female   Total
Prefers rabbits   36     20       56
Prefers cats      10     26       36
No preference     2      6        8
Total             48     52       100
Solution:
P (rabbits or female) = P (rabbits) + P (female) – P (female that prefer rabbits)
= 56/100 + 52/100 – 20/100
= 88/100
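Both addition rules can be expressed with one counting function: when the events are mutually exclusive, the overlap is simply zero. The following is a small sketch applied to the doughnut, hospital and rabbit examples; the function name `p_a_or_b` is illustrative.

```python
from fractions import Fraction


def p_a_or_b(n_a, n_b, n_both, n_total):
    """P(A or B) from counts; pass n_both=0 for mutually exclusive events."""
    return Fraction(n_a + n_b - n_both, n_total)


# Rule 1 (mutually exclusive): glazed (3) or chocolate (5) out of 12 doughnuts.
p_doughnut = p_a_or_b(3, 5, 0, 12)

# Rule 2: nurse (8) or male (3), with 1 male nurse, out of 13 staff.
p_nurse_or_male = p_a_or_b(8, 3, 1, 13)

# Rule 2: prefers rabbits (56) or female (52), with 20 in both, out of 100.
p_rabbits_or_female = p_a_or_b(56, 52, 20, 100)

print(p_doughnut, p_nurse_or_male, p_rabbits_or_female)
```

Subtracting the overlap once corrects for counting the intersection twice, which is why the sign in front of P(A and B) must be a minus.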
Exercise 1:
Of 200 members in a club, 98 are women. 34 members of the club wear glasses and 20 of
them are women. If a member of the club is chosen at random, what is the probability
that the person wears glasses or is a woman (or both)?
Exercise 2:
A school has 20 male teachers and 40 female teachers. Half of the male teachers and
half of female teachers are graduates from local universities. Find the probability that
a teacher chosen at random is a male teacher or a local graduate.
MULTIPLICATION RULE
Independent events vs dependent events
If the events are independent, one happening doesn't impact the probability of the other. In
that case, P (B │A) = P (B).
Some other examples of independent events are:
• Landing on heads after tossing a coin AND rolling a 5 on a single 6-sided die.
• Choosing a marble from a jar AND landing on heads after tossing a coin.
• Choosing a 3 from a deck of cards, replacing it, AND then choosing an ace as the second
card.
• Rolling a 4 on a single 6-sided die, AND then rolling a 1 on a second roll of the die.
• Taking an Uber ride and getting a free meal at your favorite restaurant
• Growing the perfect tomato and owning a cat
Two events are dependent if the outcome or occurrence of the first affects the outcome or
occurrence of the second so that the probability is changed.
Some other examples of dependent events are:
• Getting into a traffic accident is dependent upon driving or riding in a vehicle.
• If you park your vehicle illegally, you’re more likely to get a parking ticket.
• You must buy a lottery ticket to have a chance at winning; your odds of winning are
increased if you buy more than one ticket.
• Committing a serious crime – such as breaking into someone’s home – increases your
odds of getting caught and going to jail.
• Robbing a bank and going to jail.
• Not paying your power bill on time and having your power cut off.
Multiplication Rule 1
Multiplication Rule 1: When two events, A and B, are independent, the probability of both
occurring is:
P(A and B) = P(A) · P(B)
Example 1:
A dresser drawer contains one pair of socks with each of the following colors: blue, brown,
red, white and black. Each pair is folded together in a matching set. You reach into the sock
drawer and choose a pair of socks without looking. You replace this pair and then choose
another pair of socks. What is the probability that you will choose the red pair of socks both
times?
Solution:
P (red) = 1/5
P (red and red)
= P (red) ⦁ P (red)
= 1/5 ⦁ 1/5
= 1/25
Example 2:
A jar contains 3 red, 5 green, 2 blue and 6 yellow marbles. A marble is chosen at random
from the jar. After replacing it, a second marble is chosen. What is the probability of
choosing a green and then a yellow marble?
Solution:
P (green and yellow) = P (green) ⦁ P (yellow)
= 5/16 ⦁ 6/16
= 30/256
= 15/128
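Because each item is replaced before the next draw, the draws are independent and the probabilities simply multiply. This is a minimal sketch covering both examples; the function name `p_sequence_with_replacement` is illustrative.

```python
from fractions import Fraction


def p_sequence_with_replacement(counts, sequence):
    """Probability of drawing the given sequence when each item is replaced."""
    total = sum(counts.values())
    probability = Fraction(1)
    for item in sequence:
        # The total never changes because the item is put back each time.
        probability *= Fraction(counts[item], total)
    return probability


socks = {"blue": 1, "brown": 1, "red": 1, "white": 1, "black": 1}
jar = {"red": 3, "green": 5, "blue": 2, "yellow": 6}

p_red_red = p_sequence_with_replacement(socks, ["red", "red"])
p_green_yellow = p_sequence_with_replacement(jar, ["green", "yellow"])
print(p_red_red, p_green_yellow)
```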
Multiplication Rule 2
Multiplication Rule 2: When two events, A and B, are dependent, the probability of both
occurring is:
P(A and B) = P(A) · P(B|A)
Example 1:
Mr. Azad needs two students to help him with a science demonstration for his class of 18
girls and 12 boys. He randomly chooses one student who comes to the front of the room. He
then chooses a second student from those still seated. What is the probability that both
students chosen are girls?
Solution:
P(Girl 1 and Girl 2) = P(Girl 1) ⦁ P(Girl 2|Girl 1)
= 18/30 ⦁ 17/29
= 306/870
= 51/145
Example 2:
In a shipment of 20 computers, 3 are defective. Three computers are randomly selected and
tested. What is the probability that all three are defective if the first and second ones are not
replaced after being tested?
Solution:
Let D1, D2 and D3 be the events that the first, second and third computers selected are defective.
P(3 defective) = P (D1) ⦁ P (D2│D1) ⦁ P (D3│D1 and D2)
= 3/20 ⦁ 2/19 ⦁ 1/18
= 6/6840
= 1/1140
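When items are not replaced, both the number of favourable items and the total shrink by one after every draw. The sketch below applies this to both examples; the function name `p_all_without_replacement` is illustrative.

```python
from fractions import Fraction


def p_all_without_replacement(favourable, total, draws):
    """Probability that every one of `draws` selections is a favourable item."""
    probability = Fraction(1)
    for already_drawn in range(draws):
        # Each factor is a conditional probability given the earlier draws.
        probability *= Fraction(favourable - already_drawn,
                                total - already_drawn)
    return probability


p_two_girls = p_all_without_replacement(18, 30, 2)       # 18 girls in a class of 30
p_three_defective = p_all_without_replacement(3, 20, 3)  # 3 defective out of 20
print(p_two_girls, p_three_defective)
```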
Learning Outcomes:
At the end of this topic, you should be able to
1. Explain the concept of Estimation Theory
2. Initiate hypothesis testing
Topic 7
ESTIMATION AND HYPOTHESIS TESTING
ESTIMATION THEORY
Estimation theory is a branch of statistics that deals with estimating the values of parameters
based on measured empirical data that has a random component. The parameters describe
an underlying physical setting in such a way that their value affects the distribution of the
measured data.
Statistical estimation is the procedure of using a sample statistic to estimate a population
parameter.
For example, a sample mean of 7.65 is an estimate of the population mean.
Statistical estimation can be divided into two categories:
1.Point estimation
•A single statistic is used to provide an estimate of the population parameter.
2.Interval estimation
•A range of values within which the researcher can say, with some confidence, that the
population parameter falls.
•The range is called a confidence interval.
Formula for the confidence interval of the mean for a specific α:
$\bar{x} - z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right)$
For a 90% confidence interval, zα/2 = 1.65; for a 95% confidence interval, zα/2 = 1.96; and for a
99% confidence interval, zα/2 = 2.58.
The term zα/2 (σ/√n) is called the maximum error of estimate. For a specific value, say, α = 0.05,
95% of the sample means will fall within this error value on either side of the population mean.
Example 1: The president of a large university wishes to estimate the average age of the
students presently enrolled. From past studies, the standard deviation is known to be 2 years. A
sample of 50 students is selected, and the mean is found to be 23.2 years. Find the 95%
confidence interval of the population mean.
Solution: Since the 95% confidence interval is desired, zα/2 = 1.96. Hence, substituting in the
formula:
23.2 – 1.96 (2/√50) < μ < 23.2 + 1.96 (2/√50)
23.2 – 0.6 < μ < 23.2 + 0.6
22.6 < μ < 23.8
Hence, the president can say, with 95% confidence, that the average age of the students is
between 22.6 and 23.8 years, based on 50 students.
Example 2: A survey of 30 adults found that the mean age of a person’s primary vehicle is 5.6
years. Assuming the standard deviation of the population is 0.8 year, find the 99% confidence
interval of the population mean.
Solution:
5.6 – 2.58 (0.8/√30) < μ < 5.6 + 2.58 (0.8/√30)
5.6 – 0.38 < μ < 5.6 + 0.38
5.22 < μ < 5.98
Hence, one can be 99% confident that the mean age of all primary vehicles is between 5.22
years and 5.98 years, based on 30 vehicles.
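The two interval calculations follow the same pattern: sample mean plus or minus the maximum error of estimate. This is a minimal sketch that reuses the z values quoted in the text and takes the sample mean of 23.2 years from the university example's solution; the names `Z_CRITICAL` and `z_confidence_interval` are illustrative.

```python
import math

# Critical values z_(alpha/2) as quoted in the text for each confidence level (%).
Z_CRITICAL = {90: 1.65, 95: 1.96, 99: 2.58}


def z_confidence_interval(sample_mean, sigma, n, level):
    """Return (lower, upper) bounds for the population mean, sigma known."""
    max_error = Z_CRITICAL[level] * sigma / math.sqrt(n)
    return sample_mean - max_error, sample_mean + max_error


age_low, age_high = z_confidence_interval(23.2, 2, 50, 95)
car_low, car_high = z_confidence_interval(5.6, 0.8, 30, 99)
print(round(age_low, 1), round(age_high, 1))
print(round(car_low, 2), round(car_high, 2))
```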
HYPOTHESIS TESTING
A hypothesis can be defined as a logically conjectured relationship between two or more
variables, expressed in the form of a testable statement. Hypothesis testing is also known as
significance testing. By testing the hypothesis and confirming the conjectured relationship, it
is expected that solutions can be found to correct the problem encountered.
TYPES OF HYPOTHESIS
There are two types of hypothesis for each situation, which are:
1.Null hypothesis:
symbolized by H0. It states that there is no difference between a parameter and a specific
value, or that there is no difference between two parameters.
2.Alternative hypothesis:
symbolized by H1 / HA. It states that there is a difference between a parameter and
a specific value, or that there is a difference between two parameters.
To state hypotheses correctly, researchers must translate the conjecture or claim from words into
mathematical symbols. The basic symbols used are as follows:
Equal to =
Not equal to ≠
Greater than >
Less than <
Greater than or equal to ≥
Less than or equal to ≤
The null and alternative hypotheses are stated together, and the null hypothesis contains the
equals sign, as shown (where k represents a specified number).
Two-tailed test Right-tailed test Left-tailed test
H0 : μ = k H0 : μ ≤ k H0 : μ ≥ k
H1 : μ ≠ k H1 : μ > k H1 : μ < k
[Figure: rejection regions for a) left-tailed, b) right-tailed and c) two-tailed tests]
Example 1:
A biologist was interested in determining whether sunflower seedlings treated with an
extract from Vinca minor roots resulted in a lower average height of sunflower seedlings than
the standard height of 15.7 cm. The biologist treated a random sample of n = 33 seedlings
with the extract and subsequently obtained the following heights:
The biologist's hypotheses are:
H0 : μ = 15.7
H1 : μ < 15.7
Example 2:
A medical researcher is interested in finding out whether a new medication will have any
undesirable side effects. The researcher is particularly concerned with the pulse rate of the
patients who take the medication. The mean pulse rate for the population under study is 82
beats per minute. Will the pulse rate increase, decrease, or remain unchanged after a patient
takes the medication?
H0 : μ = 82
H1 : μ ≠ 82
Example 3:
A chemist invents an additive to increase the life of an automobile battery. If the mean
lifetime of the automobile battery is 36 months, the hypotheses are:
H0 : μ ≤ 36
H1 : μ > 36
Example 4:
A contractor wishes to lower heating bills by using a special type of insulation in houses. If the
average monthly heating bill is RM78, the hypotheses about heating costs with the
use of insulation are:
H0 : μ ≥ 78
H1 : μ < 78
ELEMENTS FOR TESTING A HYPOTHESIS
1.Choose the population characteristic of interest (e.g. μ, the population mean)
2.Choose a significance level for the test (e.g. α = .05, meaning there is a 5% chance of
incorrectly rejecting a true null hypothesis, which is called a Type I error)
3.State the Null Hypothesis (H0)
We begin by stating the null hypothesis (H0). For example, we state that the sample
comes from a population with a mean of 20.
H0: μ = 20
4.State the Alternative Hypothesis (H1)
We choose an alternative hypothesis (H1). For example, we state that the sample
comes from a population whose mean is not equal to 20.
H1: μ ≠ 20
5.Choose a test statistic
A sample statistic (often a formula) that is used to decide whether to reject H0.
6.Choose a Rejection Region
We choose a rejection region such that the probability of rejecting the null hypothesis
incorrectly is equal to α.
7.Calculate the test statistic
We then calculate the test statistic and check whether it falls inside or outside the
rejection region.
8.Conclusion
We state our conclusions in the context of the question we are trying to answer.
HYPOTHESIS TESTING TO MEAN (small sample)
The general rule of thumb for when to use a t score is when your sample:
•Has a sample size below 30,
•Has an unknown population standard deviation.
Example 1
The price of a popular tennis racket at a national chain store is $179. Affan bought five of the
same racket at an online auction site for the following prices:
155 179 175 175 161
Assuming that the auction prices of rackets are normally distributed, determine whether there
is sufficient evidence in the sample, at the 5% level of significance, to conclude that the
average price of the racket is less than $179 if purchased at an online auction.
Solution:
Step 1:
The assertion for which evidence must be provided is that the average online price μ is less
than the average price in retail stores, so the hypothesis test is
H0 : μ=179
Ha : μ<179 @ α=0.05
Step 2:
The sample is small and the population standard deviation is unknown. Thus the test statistic
is
$T = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
and has the Student t-distribution with n – 1 = 5 – 1 = 4 degrees of freedom.
Step 3:
From the data we compute x̅ = 169 and s = 10.39. Inserting these values into the formula for
the test statistic gives
$T = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{169 - 179}{10.39/\sqrt{5}} = -2.152$
Step 4:
Since the symbol in Ha is “<”, this is a left-tailed test,
so there is a single critical value, −tα = −t0.05 [df=4].
Reading from the row labeled df = 4 in the t-table, its value is −2.132. The rejection
region is (−∞,−2.132].
Step 5:
The test statistic falls in the rejection region. The decision is to reject H0. In the context of
the problem, the conclusion is:
The data provide sufficient evidence, at the 5% level of significance, to conclude that the
average price of such rackets purchased at online auctions is less than $179.
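The five steps of the racket example can be traced numerically. This is a minimal sketch using only the standard library; the critical value −2.132 is copied from the text (df = 4, α = 0.05) rather than computed, and the variable names are illustrative.

```python
import math
import statistics

prices = [155, 179, 175, 175, 161]   # auction prices from the example
mu_0 = 179                           # retail price under H0
t_critical = -2.132                  # from the t-table: df = 4, alpha = 0.05

n = len(prices)
x_bar = statistics.mean(prices)
s = statistics.stdev(prices)         # sample standard deviation (divides by n - 1)

# Test statistic T = (x_bar - mu_0) / (s / sqrt(n)).
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Left-tailed test: reject H0 when T falls at or below the critical value.
reject_h0 = t_stat <= t_critical
print(x_bar, round(s, 2), round(t_stat, 3), reject_h0)
```

`statistics.stdev` is used rather than `statistics.pstdev` because the t statistic needs the sample standard deviation, not the population one.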
Example 2
Is the temperature required to damage a computer on average less than 110 degrees?
Because of the cost of testing, only twenty computers were tested to see what minimum
temperature will damage the computer. The damaging temperature averaged 109 degrees
with a standard deviation of 3 degrees. Assume that the distribution of all computers'
damaging temperatures is approximately normal. (Use α = .05.)
We test the hypothesis
H0: μ = 110
H1: μ < 110
We compute the t statistic:
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{109 - 110}{3/\sqrt{20}} = -1.49$
This is a one-tailed (left-tailed) test, so we go to the t-table with 19 degrees of freedom to
find that the critical value is
tc = -1.73
Since
-1.49 > -1.73
we see that the test statistic does not fall in the critical region. We fail to reject the null
hypothesis and conclude that there is insufficient evidence to suggest that the temperature
required to damage a computer is on average less than 110 degrees.
HYPOTHESIS TESTING TO MEAN (large sample)
You must know the standard deviation of the population and your sample size should be above
30 in order for you to be able to use the z-score. Otherwise, use the t-score.
Example 1:
It is hoped that a newly developed pain reliever will more quickly produce perceptible reduction
in pain to patients after minor surgeries than a standard pain reliever. The standard pain
reliever is known to bring relief in an average of 3.5 minutes with standard deviation 2.1
minutes. To test whether the new pain reliever works more quickly than the standard one, 50
patients with minor surgeries were given the new pain reliever and their times to relief were
recorded. The experiment yielded sample mean x̄ = 3.1 minutes and sample standard deviation
s = 1.5 minutes. Is there sufficient evidence in the sample to indicate, at the 5% level of
significance, that the newly developed pain reliever does deliver perceptible relief more
quickly?
Solution:
Step 1
The natural assumption is that the new drug is no better than the old one, but must be proved
to be better. Thus if μ denotes the average time until all patients who are given the new drug
experience pain relief, the hypothesis test is
H0: μ = 3.5
Ha: μ < 3.5 @ α=0.05
Step 2
The sample is large, but the population standard deviation is unknown (the 2.1 minutes pertains
to the old drug, not the new one). Thus the test statistic is
$Z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Step 3
Inserting the data into the formula for the test statistic gives
$Z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{3.1 - 3.5}{1.5/\sqrt{50}} = -1.886$
Step 4
Since the symbol in Ha is “<”, this is a left-tailed test, so there is a single critical value, −zα =
−z0.05, which we read off from the last line of the t-table as −1.645. The rejection region is
(−∞, −1.645].
Step 5
The test statistic falls in the rejection region. The decision is to reject H0. In the context of
the problem our conclusion is:
The data provide sufficient evidence, at the 5% level of significance, to conclude that the
average time until patients experience perceptible relief from pain using the new pain
reliever is smaller than the average time for the standard pain reliever.
Example 2:
A cosmetics company fills its best-selling 8-ounce jars of facial cream by an automatic
dispensing machine. The machine is set to dispense a mean of 8.1 ounces per jar.
Uncontrollable factors in the process can shift the mean away from 8.1 and cause either
underfill or overfill, both of which are undesirable. In such a case the dispensing machine is
stopped and recalibrated. Regardless of the mean amount dispensed, the standard deviation
of the amount dispensed always has value 0.22 ounce. A quality control engineer routinely
selects 30 jars from the assembly line to check the amounts filled. On one occasion, the
sample mean is x̄ = 8.2 ounces and the sample standard deviation is s = 0.25 ounce.
Determine if there is sufficient evidence in the sample to indicate, at the 1% level of
significance, that the machine should be recalibrated.
Solution:
Step 1
The natural assumption is that the machine is working properly. Thus if μ denotes the mean
amount of facial cream being dispensed, the hypothesis test is
H0: μ = 8.1
Ha: μ ≠ 8.1 @ α = 0.01
Step 2
The sample is large and the population standard deviation is known. Thus the test statistic is
$Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
and has the standard normal distribution.
Step 3
Inserting the data into the formula for the test statistic gives
$Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{8.2 - 8.1}{0.22/\sqrt{30}} = 2.490$
Step 4
Since the symbol in Ha is “≠”, this is a two-tailed test, so there are two critical values, ±zα∕2
= ±z0.005, which we read off from the last line of the t-table as ±2.576. The rejection region is
(−∞, −2.576] ∪ [2.576, ∞).
Step 5
The test statistic does not fall in the rejection region. The decision is not to reject H0. In the
context of the problem our conclusion is:
The data do not provide sufficient evidence, at the 1% level of significance, to conclude that
the average amount of product dispensed is different from 8.1 ounces. We conclude that the
machine does not need to be recalibrated.
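The two large-sample examples use the same statistic and differ only in which standard deviation is used and in the shape of the rejection region. This is a minimal sketch with the critical values copied from the text; the function name `z_statistic` is illustrative.

```python
import math


def z_statistic(x_bar, mu_0, sd, n):
    """Z = (x_bar - mu_0) / (sd / sqrt(n)) for a large-sample test of the mean."""
    return (x_bar - mu_0) / (sd / math.sqrt(n))


# Example 1: pain reliever, left-tailed test at alpha = 0.05.
# The sample standard deviation s = 1.5 is used because sigma is unknown.
z_pain = z_statistic(3.1, 3.5, 1.5, 50)
reject_pain = z_pain <= -1.645

# Example 2: facial cream jars, two-tailed test at alpha = 0.01.
# The known population standard deviation sigma = 0.22 is used.
z_jar = z_statistic(8.2, 8.1, 0.22, 30)
reject_jar = abs(z_jar) >= 2.576

print(round(z_pain, 3), reject_pain)
print(round(z_jar, 2), reject_jar)
```

In the jar example the sample standard deviation s = 0.25 is given in the question but not used, because the population value is known.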
t-table