Note: 1 of 3 Mathematics 2 SM025
Ch. 6 Data Description
Topic: 6. DATA DESCRIPTION
Sub-Topic: 6.1 Introduction to Data
Learning Outcomes: At the end of this lesson, students should be able to
(a) identify the discrete and continuous data.
*Include: Quantitative and qualitative variables.
(b) identify ungrouped and grouped data.
(c) construct and interpret stem-and-leaf diagrams.
*Interpretation on the shape of the distribution, bell shaped, skewed to the right or skewed to the left.
Statistics in our life
In the field of science, statistical techniques are used to analyze data obtained from
experiments.
In manufacturing, quality control is achieved with the aid of statistics.
In the area of business, marketing surveys are carried out to determine the compatibility of
a product with economic and social demand.
In the formation of the national policy, data from the census is used in economic and social
planning.
In the field of education, statistical techniques are used to analyze the progress of students
in an examination.
Definition
STATISTICS is a science that deals with collecting, organizing, summarizing, presenting
and analyzing data.
DESCRIPTIVE STATISTICS consists of techniques involving collecting, tabulating,
presenting and summarizing information in clear and effective ways in order to describe the
set of data.
A POPULATION PARAMETER is a summary measure of a population (such as the
population mean, variance, etc.).
A SAMPLE is a set of measurements that constitute part or all of a population, i.e., a
sample is a subset of a population.
A VARIABLE is any measured characteristic or attribute that differs for different
subjects.
For example, if the weights of 30 subjects were measured, then weight would be a variable.
Quantitative variables are measured on an ordinal, interval, or ratio scale.
Qualitative variables are measured on a nominal scale.
EXAMPLE 1
A survey was carried out on 30 women and 50 men in Kedah to find out whether they support
the action taken by the government to revoke the licenses of small petrol kiosks near the Thai
border. State
(a) the population.
(b) the sample.
(c) the variable.
(d) the type of variable, qualitative or quantitative.
EXAMPLE 2
Categorise the following information into qualitative or quantitative data.
(a) Height of boys.
(b) Type of footwear.
(c) Age of lecturers.
(d) Colour of cars.
(e) Make of motorcycles.
(f) Number of A’s scored in a public examination.
EXAMPLE 3
Based on the following statements, determine whether the data are discrete or continuous.
(a) The time taken to travel from Ipoh to Kuala Lumpur.
(b) The number of pens sold by a stationery shop.
(c) The diameter of ten spheres.
(d) The number of customers in a cinema in one day.
(e) The weight of newborn babies in a hospital.
Introduction of Ungrouped and Grouped Data
UNGROUPED DATA
Ungrouped data are listed as a sequence or in the form of a frequency table, but without the use of intervals.
(a) Sequence:
12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41
(b) Frequency table:
Number of children    0   1   2   3   4
Number of families    4   6   7   2   1
GROUPED DATA
Grouped data are categorized into mutually exclusive intervals and can be presented in a frequency distribution table, histogram, frequency polygon or ogive.
Weight (kg)    30–40   40–50   50–60   60–70
Frequency        2       8       6       5
Ungrouped Data
Stem-and-Leaf Diagrams
A stem-and-leaf diagram retains every data value in full detail.
It is also known as a stem plot.
The shape of a stem-and-leaf diagram is similar to that of a histogram of the same data.
Each value in the data is divided into two parts: the stem and the leaf.
If all the data are two-digit numbers, the first digit is the stem and the second digit is the leaf.
For example, 58 is written as 5 | 8, where 5 is the stem and 8 is the leaf.
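The splitting rule described here can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` is hypothetical, it assumes non-negative whole numbers below 100, and it reuses the ungrouped sequence listed earlier in these notes as sample data.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: sorted leaves} for non-negative whole numbers below 100."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 58 -> stem 5, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

# Sample data: the ungrouped sequence given earlier in the notes.
data = [12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41]
plot = stem_and_leaf(data)

print("Stem | Leaf")
# Walk through every stem in the range so that empty stems still get a row
# and the shape of the distribution is not distorted.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>4} | {leaves}")
print("Key: 1 | 2 means 12")
```

For data recorded to one decimal place, the same idea applies with the whole-number part as the stem and the tenths digit as the leaf.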
EXAMPLE 4
The lengths of 16 leaves of a certain tree, correct to the nearest 0.1 cm, are given below.
4.4 5.9 6.0 5.7 6.2 5.3 8.0 5.0 6.3 5.7 6.6 5.5 4.3 5.9 4.8 8.1.
Construct a stem and leaf diagram to represent these figures.
Topic: 6. DATA DESCRIPTION
Sub-Topic: 6.2 Measures of Location
6.3 Measures of Dispersion
Learning Outcomes: At the end of this lesson, students should be able to
6.2 (a) find and interpret the mean, mode, median, quartiles and percentiles for ungrouped data.
(b) construct and interpret box-and-whisker plots for ungrouped data.
*Include: Lower fence, upper fence and outliers and data distribution interpretation.
6.3 (a) find and interpret variance and standard deviation for ungrouped data.
(c) find and interpret the Pearson’s Coefficient of Skewness.
*When coefficient is very close to 0 (negative or positive), the distribution of data is almost symmetrical.
Measures in Statistics
Measure of Location: to determine the central value of a set of data (mean, median and mode).
Measure of Dispersion: to determine the dispersion of a set of data (range, interquartile range, standard deviation and variance).
MEASURE OF LOCATION    DEFINITION/FORMULA
Mean
\[ \bar{x} = \frac{\text{Sum of all data}}{\text{Number of data}} = \frac{\sum x}{n} \quad \text{or} \quad \bar{x} = \frac{\sum fx}{\sum f} \]
Add (or subtract) a constant to each score: the mean changes by adding (subtracting) that constant.
Multiply (or divide) each score by a constant: the mean is multiplied (divided) by that constant.
Mode
The mode of a set of data is the value that occurs most frequently.
Median/Quartile/Percentile
\[ P_k = \begin{cases} \dfrac{x_s + x_{s+1}}{2}, & \text{if } s \in \mathbb{Z} \\[2mm] x_{[s]}, & \text{if } s \notin \mathbb{Z} \end{cases} \]
where $s = \dfrac{nk}{100}$ and $[s]$ is the least integer greater than $s$.
Interquartile Range
\[ \text{IQR} = Q_3 - Q_1 \]
Semi Interquartile Range
\[ \text{SIQR} = \frac{1}{2}\left(Q_3 - Q_1\right) \]
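The percentile rule for ungrouped data can be sketched in Python as follows. The function name `percentile` is hypothetical, the sketch assumes the data are first arranged in ascending order and that k is a whole number with 0 < k < 100, and it uses the facts that Q1, the median and Q3 are the 25th, 50th and 75th percentiles. The sample data are the ungrouped sequence listed earlier in these notes; the four-value list is made up to show the integer case.

```python
def percentile(data, k):
    """k-th percentile using s = n*k/100 (k a whole number, 0 < k < 100)."""
    x = sorted(data)                   # positions are counted in ascending order
    n = len(x)
    s, remainder = divmod(n * k, 100)  # integer arithmetic avoids rounding issues
    if remainder == 0:
        # s is an integer: average the s-th and (s+1)-th values (1-based positions)
        return (x[s - 1] + x[s]) / 2
    # s is not an integer: take the value at the next whole position,
    # which is index s when counting from 0
    return x[s]

data = [12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41]   # n = 11
q1 = percentile(data, 25)       # s = 2.75 -> 3rd value
median = percentile(data, 50)   # s = 5.5  -> 6th value
q3 = percentile(data, 75)       # s = 8.25 -> 9th value
iqr = q3 - q1
siqr = iqr / 2

# Integer case with made-up data: n = 4, k = 50 gives s = 2,
# so the median is the average of the 2nd and 3rd values.
even_median = percentile([2, 4, 6, 8], 50)
```

Other textbooks and software use slightly different percentile conventions, so results from library routines may not match this rule exactly.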
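MEASURE OF DISPERSION    DEFINITION/FORMULA
Variance
\[ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n-1} \quad \text{or} \quad s^2 = \frac{\sum f_i x_i^2 - \frac{1}{n}\left(\sum f_i x_i\right)^2}{n-1} \]
where, in the frequency form, $n = \sum f_i$.
Standard Deviation
\[ s = \sqrt{s^2} = \sqrt{\text{variance}} \]
Add (or subtract) a constant to each score: the standard deviation will NOT CHANGE.
Multiply (or divide) each score by a constant: the standard deviation is multiplied (divided) by the absolute value of that constant.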
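As a rough check of the variance formula and of the two rules about adding and multiplying by a constant, the following Python sketch may help. The function name `sample_variance` is hypothetical, the constants 5 and 3 are arbitrary, and the data are the ungrouped sequence listed earlier in these notes.

```python
import math
import statistics

def sample_variance(data):
    """s^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41]
s = math.sqrt(sample_variance(data))

shifted = [x + 5 for x in data]   # add a constant to every score
scaled = [3 * x for x in data]    # multiply every score by a constant

s_shifted = math.sqrt(sample_variance(shifted))   # expected: unchanged
s_scaled = math.sqrt(sample_variance(scaled))     # expected: 3 times s

print(round(s, 3), round(s_shifted, 3), round(s_scaled, 3))
```

The standard library function `statistics.variance` also uses the divisor n - 1, so it can serve as an independent check of the hand formula.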
Box and Whisker Plots
A box-and-whisker plot is another graphical representation of data.
It is constructed from the lowest value, the lower quartile $Q_1$, the median $Q_2$, the upper quartile $Q_3$ and the highest value.
It can be drawn horizontally or vertically.
Lower boundary/fence:
\[ Q_1 - 1.5\left(Q_3 - Q_1\right) = Q_1 - 1.5\,\text{IQR} \]
Upper boundary/fence:
\[ Q_3 + 1.5\left(Q_3 - Q_1\right) = Q_3 + 1.5\,\text{IQR} \]
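A minimal Python sketch of the fence calculation is given here. It assumes the usual convention that any value below the lower fence or above the upper fence is treated as an outlier; the function names `fences` and `outliers` are hypothetical, and the quartiles and the values being screened are illustrative numbers only.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(data, q1, q3):
    """Values lying strictly outside the fences, in their original order."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]

# Illustrative quartiles: Q1 = 21 and Q3 = 40, so IQR = 19.
lower, upper = fences(21, 40)

# Made-up values to screen against those fences.
flagged = outliers([12, 41, 70, -10, 68], 21, 40)
print(lower, upper, flagged)
```

On the plot, each whisker then extends only to the most extreme data value that still lies inside the fences, and outliers are marked individually.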
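The Pearson's Coefficient of Skewness
\[ S_k = \frac{3\left(\text{mean} - \text{median}\right)}{\text{standard deviation}} \quad \text{or} \quad S_k = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} \]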
The skewness by Pearson's Coefficient
Skewness                 Pearson's Coefficient
Skewed to the LEFT       $S_k < -0.1$
Almost Symmetrical       $-0.1 \le S_k < 0$
Symmetrical              $S_k = 0$
Almost Symmetrical       $0 < S_k \le 0.1$
Skewed to the RIGHT      $S_k > 0.1$
Interpretation on the shape of the distribution
Skewness               Skewed to the LEFT        Symmetrical               Skewed to the RIGHT
Graphs                 [distribution curve]      [distribution curve]      [distribution curve]
Measure of Location    Mean < Median < Mode      Mean = Median = Mode      Mode < Median < Mean
Box-Plot               [box plot]                [box plot]                [box plot]
Central Tendency       Q2 − Q1 > Q3 − Q2         Q2 − Q1 = Q3 − Q2         Q3 − Q2 > Q2 − Q1
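The two forms of Pearson's coefficient of skewness can be sketched in Python using the standard `statistics` module. The function names `pearson_skewness` and `describe` are hypothetical, the three data sets are made up, and the classification assumes that a coefficient within 0.1 of zero (boundary values included) counts as almost symmetrical.

```python
import statistics

def pearson_skewness(data, use_mode=False):
    """Sk = 3(mean - median)/sd, or (mean - mode)/sd when use_mode is True."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)          # sample standard deviation, divisor n - 1
    if use_mode:
        # statistics.mode returns the first mode found if there are several
        return (mean - statistics.mode(data)) / sd
    return 3 * (mean - statistics.median(data)) / sd

def describe(sk, tol=0.1):
    """Classify a coefficient; treating |Sk| <= tol as almost symmetrical is an assumption."""
    if sk == 0:
        return "symmetrical"
    if abs(sk) <= tol:
        return "almost symmetrical"
    return "skewed to the right" if sk > 0 else "skewed to the left"

# Made-up data sets for illustration only.
symmetric = [1, 2, 3, 4, 5]
right = [1, 2, 2, 3, 10]
left = [1, 8, 9, 9, 10]

for sample in (symmetric, right, left):
    sk = pearson_skewness(sample)
    print(sample, round(sk, 3), describe(sk))
```

The two forms of the coefficient generally give different numerical values for the same data, so the form used should always be stated alongside the result.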
EXAMPLE 5
The stem and leaf diagram shows the number of flies caught in an insect trap for 27 days.
Stem | Leaf
0 | 1 1 2
1 | 2 3 5 5 6
2 | 2 2 3 5 8 8
3 | 4 4 4 4 5 7 7 9
4 | 2 6 7 7 8
Key: 1 | 2 means 12
(a) Find
(i) mean, mode and median.
(ii) Q1, Q3 and semi interquartile range.
(iii) 81st percentile.
(iv) variance and standard deviation.
(b) Illustrate the above data by constructing a box and whisker plot. Hence, describe the
skewness of the distribution.
Lecture Note: 2 of 3 Mathematics 2 SM025
Ch. 6 Data Description Session 2021/2022
EXAMPLE 6
The table shows the distribution of grades of students for a certain subject in an examination.
Grade                 1   2   3   4   5   6   7   8   9
Number of Students    7  13   9   7   7   2   1   1   1
(a) Find
(i) mean, mode and median.
(ii) first quartile, third quartile and P12.
(iii) standard deviation.
(b) Construct the box and whisker plot. Hence, state the shape of distribution.
EXAMPLE 7
Given the two data sets listed below:
Data I: 8, 18, 9, 10, 12, 16, 13, 15, 16, 13, 13
Data II: 11, 13, 13, 1, 2, 23, 13, 14, 15, 18, 20
Find the mean and standard deviation for the above data and interpret the values obtained.
Data I Data II
EXAMPLE 8
The following is the systolic blood pressure, in mm Hg, of 10 patients in a hospital.
146 135 151 155 158 146 149 124 162 173
(a) Find the mean and mode. Describe the shape of the distribution.
(b) Find the standard deviation of the systolic blood pressure of the 10 patients. Hence, find the
Pearson’s coefficient of skewness. Comment on the distribution.
(c) Find the number of patients whose systolic blood pressure lies more than one standard
deviation above or below the mean.
Lecture Note: 3 of 3 Mathematics 2 SM025
Ch. 6 Data Description
Topic: 6. DATA DESCRIPTION
Sub-Topic: 6.2 Measures of Location
6.3 Measures of Dispersion
Learning Outcomes: At the end of this lesson, students should be able to
6.2 (c) find and interpret the mean, mode, median, quartiles and percentiles for grouped data.
6.3 (b) find and interpret variance and standard deviation for grouped data.
(c) find and interpret the Pearson’s coefficient of skewness.
Grouped Data
Term                      Definition
Frequency distribution    A table in which the values of a variable are grouped into classes.
Relative frequency        Determined by taking the total relative frequency as 1 or 100%:
                          \[ \text{Relative frequency} = \frac{f}{\sum f} \]
Class interval            Bounded by the lower and upper limits of the class.
Class boundary            The midpoint of the upper limit of one class and the lower limit of the next class.
Class width, C            Upper boundary – lower boundary
Lower limit               The smallest value of the class interval.
Upper limit               The largest value of the class interval.
Class mark/Midpoint, x_i  \[ x_i = \frac{\text{lower limit} + \text{upper limit}}{2} \]
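These definitions can be tried out on a set of class limits with a short Python sketch. It uses the class limits 5–9, 10–14, ..., 25–29 (the consultation-time classes that appear in Example 9); the function name `class_summary` is hypothetical, and the sketch assumes that every class is separated from the next by the same gap, so that the first lower boundary and the last upper boundary can be found in the same way as the others.

```python
def class_summary(limits):
    """For each (lower limit, upper limit) pair, return boundaries, width and midpoint."""
    # Gap between one upper limit and the next lower limit (1 for 5-9, 10-14, ...).
    gap = limits[1][0] - limits[0][1]
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - gap / 2   # halfway to the previous class
        upper_boundary = upper + gap / 2   # halfway to the next class
        rows.append({
            "limits": (lower, upper),
            "lower_boundary": lower_boundary,
            "upper_boundary": upper_boundary,
            "width": upper_boundary - lower_boundary,
            "midpoint": (lower + upper) / 2,
        })
    return rows

limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
summary = class_summary(limits)
for row in summary:
    print(row)
```

Note that the class width is measured between boundaries, so the class 5–9 has width 5 rather than 4.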
MEASURE OF LOCATION    DEFINITION/FORMULA
Mean
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_k x_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} \]
$x_i$ = midpoint of the class, $f_i$ = frequency.
Mode
\[ \text{Mode} = L_B + \left(\frac{d_1}{d_1 + d_2}\right) C \]
$L_B$ = Lower class boundary of the mode class.
$d_1$ = The difference between the mode class frequency and the previous class frequency.
$d_2$ = The difference between the mode class frequency and the frequency of the class after the mode class.
$C$ = Class width.
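The grouped mean and mode formulas can be sketched in Python as shown here. The function names are hypothetical. The sample figures take the consultation-time classes 5–9, 10–14, 15–19, 20–24, 25–29 from Example 9 and assume that the frequencies 5, 8, 9, 3, 5 belong to those classes in that order. The sketch also assumes that a missing neighbouring class (when the mode class is the first or last class) is treated as having frequency 0.

```python
def grouped_mean(midpoints, freqs):
    """Mean = sum(f_i * x_i) / sum(f_i)."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

def grouped_mode(lower_boundaries, freqs, width):
    """Mode = L_B + d1 / (d1 + d2) * C for the class with the highest frequency."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])   # first class with the largest frequency
    before = freqs[i - 1] if i > 0 else 0                # assumption: 0 if no previous class
    after = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumption: 0 if no next class
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    return lower_boundaries[i] + d1 / (d1 + d2) * width

midpoints = [7, 12, 17, 22, 27]
lower_boundaries = [4.5, 9.5, 14.5, 19.5, 24.5]
freqs = [5, 8, 9, 3, 5]     # assumed pairing of frequencies with classes
width = 5

mean = grouped_mean(midpoints, freqs)                   # 485 / 30
mode = grouped_mode(lower_boundaries, freqs, width)     # 14.5 + (1 / 7) * 5
print(round(mean, 2), round(mode, 2))
```

Because every value in a class is represented by the class midpoint, these results are estimates of the mean and mode of the underlying raw data.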
Median/Quartile/Percentile
\[ P_k = L_k + \left(\frac{\frac{nk}{100} - F_{k-1}}{f_k}\right) C \]
$L_k$ = Lower class boundary of percentile class.
$n$ = Number of data or the sum of frequency.
$F_{k-1}$ = Cumulative frequency before percentile class.
$C$ = Class width.
$f_k$ = Frequency of percentile class.
Interquartile Range
\[ \text{IQR} = Q_3 - Q_1 \]
Semi Interquartile Range
\[ \text{SIQR} = \frac{1}{2}\left(Q_3 - Q_1\right) \]
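A possible Python sketch of the grouped percentile formula follows. The function name `grouped_percentile` is hypothetical. The sample figures take the height classes of Example 10 with lower boundaries 150, 155, ..., 175, class width 5 and frequencies 5, 3, 8, 6, 4, 4; this pairing is an assumption, although it agrees with the cumulative frequencies 5, 8, 16, 22, 26, 30 given in that table. The percentile class is taken to be the first class whose cumulative frequency reaches nk/100.

```python
def grouped_percentile(lower_boundaries, freqs, width, k):
    """P_k = L_k + ((n*k/100 - F_before) / f_k) * C."""
    n = sum(freqs)
    target = n * k / 100
    cumulative = 0                      # cumulative frequency before the current class
    for lb, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lb + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("k must lie between 0 and 100")

lower_boundaries = [150, 155, 160, 165, 170, 175]
freqs = [5, 3, 8, 6, 4, 4]              # assumed pairing of frequencies with classes
width = 5

median = grouped_percentile(lower_boundaries, freqs, width, 50)   # 160 + (15 - 8) / 8 * 5
q1 = grouped_percentile(lower_boundaries, freqs, width, 25)       # 155 + (7.5 - 5) / 3 * 5
q3 = grouped_percentile(lower_boundaries, freqs, width, 75)       # 170 + (22.5 - 22) / 4 * 5
p10 = grouped_percentile(lower_boundaries, freqs, width, 10)      # 150 + (3 - 0) / 5 * 5
print(median, round(q1, 2), q3, p10, round(q3 - q1, 2))
```

The formula interpolates linearly inside the percentile class, which amounts to assuming the observations are spread evenly across that class.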
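MEASURE OF DISPERSION    DEFINITION/FORMULA
Variance
\[ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n-1} \quad \text{or} \quad s^2 = \frac{\sum f_i x_i^2 - \frac{1}{n}\left(\sum f_i x_i\right)^2}{n-1} \]
Standard deviation
\[ s = \sqrt{s^2} = \sqrt{\text{variance}} \]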
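For grouped data the frequency form of the variance is applied with the class midpoints as the x values and n equal to the sum of the frequencies. The Python sketch below illustrates this; the function name `grouped_variance` is hypothetical, and the sample figures assume the height classes of Example 10 (midpoints 152.5, 157.5, ..., 177.5) with frequencies 5, 3, 8, 6, 4, 4 in that order.

```python
import math

def grouped_variance(midpoints, freqs):
    """s^2 = (sum(f*x^2) - (sum(f*x))^2 / n) / (n - 1), with n = sum(f)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

midpoints = [152.5, 157.5, 162.5, 167.5, 172.5, 177.5]
freqs = [5, 3, 8, 6, 4, 4]              # assumed pairing of frequencies with classes

variance = grouped_variance(midpoints, freqs)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 2))
```

One way to check the result is to list each midpoint as many times as its frequency and apply the ungrouped formula to that list; the two answers should agree.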
EXAMPLE 9
The following frequency table shows the consultation time (rounded to the nearest minute)
needed by a doctor for each patient in a day.
Consultation Time (minutes)    Number of Patients
5 – 9                          5
10 – 14                        8
15 – 19                        9
20 – 24                        3
25 – 29                        5
(a) Calculate the mean, mode and median.
(b) Calculate Q1, Q3, P20 and P90.
(c) Using the answer in (a), determine the skewness of the data distribution.
(d) Find the standard deviation. Hence, find the coefficient of skewness.
EXAMPLE 10
The following table shows the height distribution for a group of students.
Height, h (cm)     Frequency    Cumulative Frequency
150 ≤ h < 155      5            5
155 ≤ h < 160      3            8
160 ≤ h < 165      8            16
165 ≤ h < 170      6            22
170 ≤ h < 175      4            26
175 ≤ h < 180      4            30
Find
(a) mean and median.
(b) first quartile, third quartile and interquartile range.
(c) 10th and 70th percentile.
(d) variance and standard deviation.
(e) Pearson’s coefficient of skewness and hence, describe the distribution.