CHAPTER 1:
INTRODUCTION TO
STATISTICS
Nurul Najwa Jatarona
STA1113
CHAPTER INTRODUCTION TO STATISTICS
1
▪ The Role of Statistics
▪ Types of Statistics
• Descriptive Statistics
• Inferential Statistics
▪ Statistical Methods in Research
▪ The Circulation of Research Process
▪ Levels of Measurement
1.0 The Role of Statistics
Statistics plays a vital role in many fields of human activity. Individuals and organizations
use statistics to understand data and make informed decisions throughout the natural
and social sciences, medicine, business, and other areas. For instance, in the
administration of a government, statistics are used in developing public policies and
social programs. In the banking sector, bankers use statistical approaches based on
probability to estimate the numbers of depositors and their claims for a certain day.
1.1 Types of Statistics
Statistics is divided into two types:
Descriptive and Inferential Statistics.
Descriptive Statistics
Descriptive statistics are methods used to
describe data that have been collected. These
include the classification of data, the drawing
of histograms that correspond to the frequency
distributions that result after the data are
classified, the representation of data by other
sorts of graphs, such as line graphs, bar
graphs, and pie charts, and the computation of
means, standard deviations, and ranges.
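The summary measures named above can be illustrated with a short Python sketch. The scores are made-up values chosen for illustration, not data from any survey, and only the standard library is used.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical examination scores (illustrative values only).
scores = [55, 60, 60, 65, 70, 70, 70, 75, 80, 85]

# Frequency distribution: how many times each score occurs.
frequency = Counter(scores)

# Summary measures.
average = mean(scores)                   # arithmetic mean
spread = stdev(scores)                   # sample standard deviation (n - 1 divisor)
value_range = max(scores) - min(scores)  # range = largest value - smallest value

print("Frequency:", dict(sorted(frequency.items())))
print("Mean:", average)
print("Standard deviation:", round(spread, 2))
print("Range:", value_range)
```

The frequency counts are the numbers a histogram or bar graph of these scores would display.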
Example 1
A survey was conducted on a random sample of students taking mathematics courses at
UiTM Perak (Affective Characteristics and Academic Performance in Mathematics, N
Ahmad, N Mohammad, S K Sayed Nordin, Z Mansor, 2010). One of the objectives of the
survey was to compare the performance of students in Calculus II. The line graph in Figure
1 revealed that performance for the November 2009 examination was better than the
performance for the April 2010 examination for Programs A, B and C.
Fig. 1 Passing rate for MAT199 in the November 2009 and April 2010 examinations
Inferential Statistics
Inferential statistics refers to the use of samples to reach conclusions about the population
from which those samples have been drawn.
Example 2
A researcher takes all UiTM Perak students as a sample and calculates the average
number of books the students read in a year. That statistic is then used as an estimate of
the average number of books read in a year for all university students.
Example 3
Auditors usually take a sample of a company’s records. They then use the information
from the sample to infer the characteristics of the entire population of records to conclude
whether the company has been following accounting standards.
Example 4
The survey by Ahmad et al. (2010) also investigated the incomes of students’ parents. A
statistical procedure called the independent samples t-test was carried out to determine
whether there is a significant difference in fathers’ incomes between CS110 students and
CS111 students. The SPSS output is reproduced below:
The output shows that the mean difference in fathers’ incomes between the two groups is
RM334.28, but the difference is not statistically significant. At the 5% significance level
(95% confidence), we therefore cannot conclude that fathers’ incomes differ between all
CS110 and CS111 students.
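The logic of an independent samples t-test can be sketched in Python as follows. This is a minimal illustration, not the SPSS computation from the survey: the income values are hypothetical, the function name `pooled_t_statistic` is an assumption, the sketch assumes equal variances in the two groups, and the critical value 2.306 is the standard two-tailed t-table value for 8 degrees of freedom at the 5% level.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical monthly incomes in RM (illustrative only, NOT the survey data).
group_a = [2000, 2500, 3000, 3500, 4000]
group_b = [2200, 2600, 2900, 3300, 3800]

def pooled_t_statistic(x, y):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / std_error, df

t_stat, df = pooled_t_statistic(group_a, group_b)

# Two-tailed critical value for df = 8 at the 5% level, from a standard t table.
T_CRITICAL = 2.306
significant = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {df}, significant at 5%: {significant}")
```

When the absolute value of t is smaller than the critical value, the difference in sample means is not significant at the 5% level, which mirrors the conclusion reported for the survey.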
1.2 Statistical Methods in Research
Some well-known statistical tests and procedures are:
➢ Analysis of variance (ANOVA)
➢ Chi-square test
➢ Correlation
➢ Mann-Whitney U
➢ Regression analysis
➢ Spearman’s rank correlation coefficient
➢ Student’s t-test
➢ Time series analysis
1.3 The Circulation of Research Process
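1. Asking the Question
2. Identify the Important Factors
3. Formulating a Hypothesis
4. Collecting Relevant Information
5. Testing the Hypothesis
6. Working with the Hypothesis
7. Reconsider the Theory
8. Asking a New Question
The process is circular: the new question in step 8 leads back to step 1.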
© 2009 Pearson Prentice Hall, Salkind.
1.4 Steps in Research Process
RESEARCH PROCESS
1. INQUIRY: identify the research topic, define the research problem, set the research
question, and determine the research method.
2. COLLECTION: collect research material.
3. ANALYSIS: analyze and interpret research material.
4. PRESENTATION: write the research report.
1.5 Levels of Measurement
Data may also be described in accordance with the level of measurement attained. The
four levels of measurement are – from weakest to strongest level – nominal, ordinal,
interval and ratio scales.
Hierarchy of scales, from strongest to weakest: RATIO, INTERVAL, ORDINAL, NOMINAL.
Nominal scales
The observed data are merely classified into various distinct categories in which
no ordering is implied.
Ordinal scales
The observed data are classified into various distinct categories in which an ordering
is implied.
Interval scales
An interval scale is an ordered scale in which the difference between the
measurements is a meaningful quantity. However, there is no true zero.
Ratio scales
Ratio-scaled data have all the characteristics of interval-scaled data and, in addition,
a true zero point, so that ratios of measurements are meaningful.
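The difference between interval and ratio scales can be checked numerically. The sketch below uses made-up temperature and weight values; the function name `celsius_to_fahrenheit` is chosen here for illustration. It shows that ratios of temperature readings change with the unit, because the zero is arbitrary, while ratios of differences, and ratios of weights, do not.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

low_c, mid_c, high_c = 10.0, 20.0, 30.0
low_f, mid_f, high_f = (celsius_to_fahrenheit(t) for t in (low_c, mid_c, high_c))

# Ratios of readings change with the unit, so "twice as hot" is not meaningful.
ratio_c = mid_c / low_c
ratio_f = mid_f / low_f

# Ratios of differences agree in both units, so differences are meaningful.
diff_ratio_c = (high_c - mid_c) / (mid_c - low_c)
diff_ratio_f = (high_f - mid_f) / (mid_f - low_f)

# Weight is a ratio scale: zero means "no weight", so ratios survive a unit change.
GRAMS_PER_KG = 1000
weight_ratio_kg = 4.0 / 2.0
weight_ratio_g = (4.0 * GRAMS_PER_KG) / (2.0 * GRAMS_PER_KG)

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
print(weight_ratio_kg, weight_ratio_g)
```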
Example 5
What level of measurement do these statements represent?
1. Types of flowers in a park (Rose, Lily, Hibiscus) - Nominal
2. Car brands that have entered the Malaysian market (Honda, Volkswagen, Toyota,
Nissan) - Nominal
3. Test score in a mid-year examination in a school - Ordinal
4. Colour of a cat’s eyes (Black, Green, Blue) - Nominal
5. Age of students at UiTM Perak - Ratio
6. Height of students in a class - Ratio
7. Monthly amount spent at Mydin Hypermarket - Ratio
8. Amount of time spent queuing in a bank - Ratio
9. Temperature in a lab - Interval
10. Monthly earnings of a technician - Ratio
1.6 Population and Sample
A population is the collection of all possible observations of a specified characteristic of
interest.
A sample is a subset of a population that is selected for investigation. It is a
sub-collection of elements drawn from the population.
Why “Sample” the Population? Why not study the whole population?
• It may be physically impossible to check all items in the population.
• The cost of studying all the items in a population may be very high.
• The sample results are usually adequate.
• Contacting the whole population would often be time-consuming.
• Certain tests are destructive in nature (e.g., study of light bulb life).
Statisticians advocate Probability (Random) Sampling
❖ A probability sample is a sample selected in such a way that each item or person in
the population being studied has a known likelihood of being included in the sample.
❖ If we use judgment sampling, we have no way to assess the accuracy of our estimates,
since we cannot quantify the quality of the judgments. Probability sampling enables
us to construct probabilistic error bounds.
❖ The aim of random sampling is to get a sample which is representative of the
population. This helps make inferences from the sample to the population
valid.
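The points above can be illustrated with a minimal Python sketch of a simple random sample and an approximate error bound. The population of ages is simulated and purely hypothetical. The 1.96 multiplier assumes the sample mean is approximately normally distributed, and the finite population correction is ignored for simplicity.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Hypothetical population: ages of 1,000 people (illustrative values only).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 60) for _ in range(1000)]

# Simple random sample without replacement: every person has the same
# known chance (n / N) of being included.
n = 50
sample = rng.sample(population, n)

# Sample mean and its estimated standard error.
sample_mean = mean(sample)
std_error = stdev(sample) / sqrt(n)

# Approximate 95% error bound using the normal multiplier 1.96.
margin = 1.96 * std_error
print(f"Estimate: {sample_mean:.2f} +/- {margin:.2f}")
print(f"Population mean: {mean(population):.2f}")
```

An error bound of this kind is available only because the selection probabilities are known; a judgment sample offers no comparable calculation.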
1.7 Sampling Methods
Probability Sampling Methods
• Simple Random Sampling
• Systematic Sampling
• Stratified Sampling
• Cluster Sampling
• Multi-Stage Sampling
Non-Probability Sampling Methods
• Convenience Sampling
• Snowball Sampling
• Quota Sampling
• Judgemental Sampling
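Two of the probability sampling methods, systematic and stratified sampling, can be sketched in Python as follows. The sampling frame, the programme labels, and the helper names `systematic_sample` and `stratified_sample` are all hypothetical and serve only as an illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame of 100 student IDs with a programme label.
frame = [{"id": i, "programme": "A" if i % 4 == 0 else "B"} for i in range(1, 101)]

def systematic_sample(items, k):
    """Pick a random start among the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return items[start::k]

def stratified_sample(items, key, per_stratum):
    """Draw a simple random sample of fixed size from each stratum."""
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

every_tenth = systematic_sample(frame, 10)
by_programme = stratified_sample(frame, "programme", 5)

print([s["id"] for s in every_tenth])
print({name: [s["id"] for s in members] for name, members in by_programme.items()})
```

Systematic sampling needs only one random number (the starting point), while stratified sampling guarantees that every stratum is represented in the sample.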
1.8 Statistic versus Parameter
POPULATION (parameters µ, σ, σ²)
   ↓ Select a sample from the population
SAMPLE (statistics x̄, s, s²)
   ↓ Calculate the sample statistics (x̄, s, s²) to estimate the population parameters (µ, σ, σ²)
A parameter is a characteristic of a population. It is a measurement that describes the
population.
A statistic is any function of the observations in a random sample. It is a measurement that
describes the sample and is also known as an estimator.
Symbols and Measurements
Measurement          Parameter   Statistic (Estimator)
Size                 N           n
Mean                 µ           x̄
Variance             σ²          s²
Standard Deviation   σ           s
Proportion           p           p̂
Copyright © 2020 Six-Sigma-Material.com.
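The distinction between parameters and statistics can be made concrete with a small Python sketch. The population values are hypothetical, and the variable names simply mirror the symbols used in this section. Population variance and standard deviation use the divisor N, while the sample versions use n - 1.

```python
import random
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical population of N = 8 values (illustrative only).
population = [4, 8, 6, 5, 3, 7, 9, 6]
N = len(population)

# Parameters describe the whole population (divisor N).
mu = mean(population)
sigma_sq = pvariance(population)
sigma = pstdev(population)

# A random sample of size n gives statistics that estimate the parameters.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 4
sample = rng.sample(population, n)
x_bar = mean(sample)
s_sq = variance(sample)  # divisor n - 1
s = stdev(sample)

# Proportion of values above 5: parameter p and its estimate p-hat.
p = sum(v > 5 for v in population) / N
p_hat = sum(v > 5 for v in sample) / n

print(f"N={N}, mu={mu}, sigma^2={sigma_sq}, sigma={sigma:.3f}, p={p}")
print(f"n={n}, x_bar={x_bar}, s^2={s_sq:.3f}, s={s:.3f}, p_hat={p_hat}")
```

The parameters stay fixed for a given population, whereas the statistics vary from one random sample to the next.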
1.9 Relationship between Variable Descriptions
Variable
• Qualitative (descriptive)
  - Ordinal (rank-ordered)
  - Nominal (difference only)
• Quantitative (numeric)
  - Ratio (true zero: the quantity doesn't exist at some point)
  - Interval (equal interval)
  - Ordinal (rank-ordered)
TUTORIAL
1. Determine whether each of the following statements involves descriptive or inferential
statistics.
a) A manager plotted the daily sales made by the sales executive, and the plot shows
that sales are increasing.
b) A quality control executive finds that 10% of tires are defective after inspecting
a sample of 300 tires.
c) The average age of the students in a statistics class is 20 years.
d) There is a relationship between smoking cigarettes and getting lung cancer.
2. What level of measurement do these statements represent?
a) Types of flowers in a garden
b) Test score in a final examination
c) Monthly earning of a manager of a fast food restaurant
d) Type of houses (terrace, semi-D, double-storey terrace)
3. Categorize the following statements according to their level of measurement:
a) Weights of parcels (in grams)
b) Academic qualifications (Diploma, Degree, Masters, PhD)
c) House ownership (Yes, No)
d) Type of property owned (House, Estate, Office)
e) Age at first marriage (in years)
f) Measurement of longitude
g) Time required to run a mile (in minutes and seconds)
h) Type of account (savings, current, fixed)
i) Arbitrary labels are usually used to represent the categories. For example, the
labels ‘0’ for female, ‘1’ for male are normally used for the variable Gender
j) Age group (infant, teenager, adult)
k) Hotel ratings (5 star, 4 star)
l) Temperature in Celsius or Fahrenheit
m) The year you were born.
n) Product satisfaction (Satisfied, Fairly satisfied, Neutral, Fairly unsatisfied,
Unsatisfied)
o) The categories are mutually exclusive and bear no relationship to one another.
p) Money in the bank (in RM)
4. Identify the following as nominal level, ordinal level, interval level, or ratio level data.
a) Flavors of frozen yogurt ________________
b) Amount of money in savings accounts________________
c) Students classified by their reading ability: Above average, Below average, Normal
________________
d) Letter grades on an English essay ________________
e) Religions ________________
f) Commuting times to work ____________
g) Ages (in years) of art students ________________
h) Ice cream flavor preference ________________
i) Years of important historical events ________________
j) Instructors classified as: Easy, Difficult or Impossible ________________
5. As part of a test preparation course, students are asked to take a practice version of
the Graduate Record Examination (GRE). This is a standardized test. Scores can
range from 200 to 800 with a population mean of 500 and a population standard
deviation of 100. Choose the appropriate scale of measurement.
6. Children in elementary school are evaluated and classified as non-readers (0),
beginning readers (1), grade level readers (2), or advanced readers (3). The
classification is done in order to place them in reading groups. Choose the appropriate
scale of measurement.
7. During a clinical interview, survivors of a tornado are asked to state “no” or “yes” to
whether they have experienced specific symptoms of Post-Traumatic Stress Disorder
(PTSD) in the past week. The number “0” is assigned to “no” and the number “1” is
assigned to “yes”. Choose the appropriate scale of measurement.
8. Emory University wants to know which dormitories the students prefer. The
administration counts the number of applications for each dorm. Administrators
assign a rank to each dorm based on the number of applications received. Choose
the appropriate scale of measurement.
9. Determine the level of measurement. (Nominal, Ordinal, Interval, Ratio)
a) Cars described as compact, midsize, and full-size.
b) Colors of M&M candies.
c) Weights of M&M candies.
d) Types of markers (washable, permanent, etc.)
e) Time it takes to sing the National Anthem.
f) Total annual income for statistics students.
g) Body temperatures of bears at the North Pole.
h) Teachers being rated as superior, above average, average, below average, or poor.
10. Identify each as qualitative or quantitative. If it is quantitative, state whether it is
discrete or continuous.
a) The hair color of the math teachers at PHS
b) The number of people who prefer Pepsi over Coke
c) The weight of your sister’s car (in pounds)
d) The number of criminal indictments against Michael Vick
e) The length of his jail sentence
f) Your telephone area code
g) How fast you were going when you were pulled over for speeding down Main
Street (in MPH)
h) The way you felt when you were pulled over for speeding down Main Street
11. Label each as descriptive or inferential.
a) The highest batting average for one season (so far) is 0.438. (Achieved by Hugh
Duffy in 1894.)
b) Last year’s homicide rate was 9.2 persons per 100,000 US residents.
c) Sixty-two percent of last year’s high school graduates are now enrolled in a college
or university.
d) Results of the US census conducted every 10 years
e) A recent study found that eating garlic has a 15% chance of lowering blood
pressure.
f) It is predicted that the average number of automobiles each household owns will
increase next year.
g) The chance that a person will be robbed in Greenville, NC is 15%.
h) Last year’s total attendance at Long Run High School’s football games was 8,235.
12. Write the correct term for the definition provided below.
a) A characteristic that can assume different values
b) Collecting, organizing, analyzing, and drawing conclusions based on data
c) The values a variable can assume
d) The entire group to be studied
e) A subgroup of the entire group to be studied (the smaller group out of the big group)
13. Classify each sample as random, systematic, cluster, convenience, or stratified.
a) In a large school district, all teachers from two buildings are interviewed to
determine whether they believe the students have less homework to do now than
in previous years.
b) Every seventh customer entering a shopping mall is asked to select his/her
favorite store.
c) Nursing supervisors are selected using random numbers in order to determine
annual salaries.
d) Every 100th hamburger manufactured is checked to determine its fat content.
e) Mail carriers of a large city are divided into four groups according to gender and
according to whether they walk or ride on their routes. Then 10 are selected from
each group and interviewed to determine whether they have been bitten by a dog
in the last year.
14. A researcher asks hospitalized individuals about their comfort in a new type of hospital gown.
This is an example of what type of data?
A. ratio
B. independent
C. quantitative
D. qualitative
15. A nurse practitioner measures how many times per minute a heart beats when an individual
is at rest versus when running. She is measuring the heartbeat at what level of measurement?
A. interval/ratio
B. nominal
C. independent
D. ordinal
16. The research nurse is coding adults according to size. A person with a below-average body
mass index (BMI) is coded as 1, average is 2, and above average is 3. What level of
measurement is this?
A. nominal
B. ratio
C. ordinal
D. interval