PREFACE
A Guide to Assessment for Teachers: Basic Statistics was written to make statistics
more accessible, especially to educators. This book covers the basic concepts of
statistics related to classroom assessment. Each chapter starts with the theoretical
background, which is followed by elaboration and examples. The book focuses
on collecting and arranging scores, calculating the mean, mode and median of the
scores, and representing data in the form of various graphs.
It also covers producing standard scores and the difficulty and discrimination indices.
CONTENT
1. Basic Statistics in Classroom
2. Collecting & Arranging Scores
3. Mean, Mode, and Median
4. Standard Deviation and Graph Representation
5. Standard Score
6. Difficulty and Discrimination Index
7. Reference
1. Basic Statistics in Classroom
Statistics is the science concerned with developing and studying methods for
collecting, analysing, interpreting, and presenting empirical data. It is an
incredibly useful subject and has become an important part of life. Statistics has long
been known as a crucial tool in conducting research and presenting data. It also plays an
important role in education today, where it is present not only as a unit of
mathematics but is involved in all areas and fields.
Statistics helps teachers to obtain and analyse the data that are
needed to evaluate their students and the assessments used. Measurement and
evaluation are an essential part of the teaching and learning process. During this
process, teachers carry out suitable assessments of the students, obtain scores,
and then interpret these scores to decide which actions
need to be taken. Thus, by using statistics, teachers can study the scores
objectively, which makes the teaching and learning process more efficient.
Statistics is therefore important in the classroom, and it has several roles that
help the teaching and learning process. The significance of statistics in the classroom is
as follows:
1. Statistics helps the teacher to provide the most exact type of description.
When a teacher wants to gather information about a particular pupil’s
performance in the classroom, the teacher administers an assessment or a test to the pupil.
From the result, the teacher is able to describe the pupil’s performance or
trait. Using statistics helps the teacher to give an accurate description of the
data.
2. The teacher will be more definite and exact in his or her procedures and
thinking.
Lacking technical knowledge makes teachers vague
and uncertain in describing a pupil’s performance. Using statistics enables
the teacher to describe the pupil’s performance using the proper language and
symbols. Statistics provides the evidence needed by the teacher to
strengthen his or her descriptions, and this makes the interpretation more
exact and definite.
3. Statistics enables the teacher to summarize the results.
Data obtained by the teacher are arranged in an orderly
manner when statistics is used. This helps the teacher to make the data
precise and meaningful and to present them in an interpretable manner.
4. Teachers are able to draw general conclusions.
Applying statistics in analysing an assessment or test helps the
teacher to draw conclusions about either
the pupils’ performance or the nature of the assessment itself. The teacher can
conclude whether the assessment used is valid and suitable for evaluating
the pupils. The questions within the assessment can be analysed
according to their suitability and their relation to the topic being tested.
Statistical procedures also help the teacher to decide how much faith should be
placed in any conclusion and how far the generalization should be
extended.
5. Statistics helps the teacher to predict pupils’ future performance.
Statistics enables the teacher to predict how much of a thing will happen
under conditions that the teacher knows and has measured. For example, the
teacher can predict a pupil’s possible score in the final examination based on
his entrance test score. However, other factors, such as the
amount of revision done by the pupil, will affect the result and can
make the prediction inaccurate. Statistical methods help the teacher to decide how much
margin of error can be allowed in making the predictions.
6. The teacher is able to analyse some of the causal factors affecting the pupil’s
performance.
It is common for numerous causal factors to produce a
behavioural outcome. A particular pupil’s poor performance in a
specific subject is affected by a variety of reasons. For example, a pupil
may get a high score in a Mathematics test that consists mostly of
formulas because of his passion for and interest in memorizing formulas, while
getting a poor score in a reading comprehension test because of his
lack of interest in reading long explanatory sentences. So, using appropriate
statistical methods, the teacher can keep these extraneous variables constant and
observe the cause of the pupil’s failure in a specific subject. The result
can then be applied to alter the questions within the test so that it helps the
teacher to get the wanted outcome of the pupil’s performance.
2. Collecting & Arranging Scores
After scores have been collected using various instruments, the data are arranged
using a frequency distribution. But what is a frequency distribution?
A frequency distribution is a representation, in either graphical or tabular format, that
displays the number of observations within a given interval. It is one of the ways to
organize data.
Frequency = the number of times a score occurs, for a single score or a group of scores
(grouped/ungrouped).
Distribution = the arrangement of raw scores from a test
There are 7 things to know in relation to frequency distribution:
1 Form a frequency distribution table
2 Line graph
3 Cumulative frequency distribution table
4 Cumulative frequency curve (Ogive)
5 Histogram
6 Frequency polygon
7 Frequency curves
Here are the examples:
1. Form a frequency distribution (fd) table
An fd table is a chart summarizing the values and their frequencies.
The marks obtained by the students in the last exam:
Ungrouped data:
56, 10, 21, 75, 33, 48, 60, 68, 21, 70, 68, 82, 21, 44, 56, 21, 68, 75, 75, 91
Table 1: The test scores of 20 students
Marks obtained in the test    Tally    Number of students (frequency)
10                            /        1
21                            ////     4
33                            /        1
44                            /        1
48                            /        1
56                            //       2
60                            /        1
68                            ///      3
70                            /        1
75                            ///      3
82                            /        1
91                            /        1
Total                                  20
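The tally in Table 1 can also be produced programmatically. The following is a minimal Python sketch, assuming the 20 ungrouped marks listed above; the variable names `marks` and `frequency` are illustrative and not part of the original guide.

```python
from collections import Counter

# The 20 ungrouped marks from the example above.
marks = [56, 10, 21, 75, 33, 48, 60, 68, 21, 70,
         68, 82, 21, 44, 56, 21, 68, 75, 75, 91]

# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

# Print the table in ascending order of marks, with a simple tally column.
for mark in sorted(frequency):
    count = frequency[mark]
    print(f"{mark:>5}  {'/' * count:<6}  {count}")

print("Total", sum(frequency.values()))
```

Sorting the keys before printing mirrors the ascending order used in the table; `Counter` itself does not sort the marks.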
2. Line graph
Line graphs are useful in that they show data variables and trends very clearly
and can help to make predictions about the results of data not yet recorded.
Figure 1: Line graph of frequency vs marks obtained by 20 students
3. Cumulative frequency distribution (fd) table
We can build an fd table using class intervals to tally the frequency of the data
that belong to each class interval.
Steps to make a cumulative frequency distribution table are as follows:
Step 1: Use the continuous variables to set up a frequency distribution
table using a suitable class length.
Step 2: Find the frequency for each class interval.
Step 3: Locate the endpoint for each class interval (upper limit or lower
limit).
Step 4: Calculate the cumulative frequency by adding each frequency to the
running total of the frequencies above it in the frequency column.
Step 5: Record all results in the table.
Table 2: Cumulative frequency distribution (fd) table
Marks (x) Class Int.   Marks (x) Class Limit   Frequency (f)   Cumulative frequency (cf)   cf (%)
0-20                   0-20.5                  1               1                           5
21-40                  20.5-40.5               5               1+5 = 6                     30
41-60                  40.5-60.5               5               6+5 = 11                    55
61-80                  60.5-80.5               7               11+7 = 18                   90
81-100                 80.5-100.5              2               18+2 = 20                   100
Total                                          20
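The grouping and running total in Table 2 can be sketched in Python as follows. This is an illustration only: it assumes the same 20 marks and the five class intervals shown in the table, and the names `class_intervals` and `rows` are hypothetical.

```python
# The 20 ungrouped marks from the earlier example.
marks = [56, 10, 21, 75, 33, 48, 60, 68, 21, 70,
         68, 82, 21, 44, 56, 21, 68, 75, 75, 91]

# Class intervals from Table 2 as inclusive (lower, upper) pairs.
class_intervals = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

rows = []
cumulative = 0
for lower, upper in class_intervals:
    # Frequency: how many marks fall inside this class interval.
    f = sum(1 for m in marks if lower <= m <= upper)
    # Cumulative frequency: running total of the frequencies so far.
    cumulative += f
    cf_percent = 100 * cumulative / len(marks)
    rows.append((f"{lower}-{upper}", f, cumulative, cf_percent))

for label, f, cf, pct in rows:
    print(f"{label:<8} f={f:<3} cf={cf:<3} cf%={pct:.0f}")
```

The cumulative frequency and cf (%) columns are the values plotted on the vertical axis of an ogive.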
4. Cumulative frequency curve (Ogive)
Cumulative frequency distribution of grouped data can be represented on a
graph. Such a representative graph is called a cumulative frequency curve or
an ogive.
Figure 2: Ogive Graph of cumulative frequency vs marks obtained
5. Histogram
Steps to make a histogram are as follows:
Step 1: Use Marks (x) Class Int. and Frequency (f) from the Cumulative
frequency distribution (fd) table.
Step 2: Insert a column chart.
Step 3: Plot the data of Marks (x) Class Int. at x-axis and frequency at y-axis.
Step 4: The column chart will be created automatically. If there is a gap between
the column bars, remove it: double-click a column bar and
set the gap width to zero percent.
[Histogram: Frequency (f) on the y-axis against Class Interval (0-20, 21-40, 41-60, 61-80, 81-100) on the x-axis]
Figure 3: Histogram of Frequency Distribution against Class Interval
6. Frequency Polygon
Steps to make a frequency polygon are as follows:
Step 1: Calculate the midpoints. Use the formula below
Midpoint = (Upper limit + Lower limit) / 2
Step 2: Add two extra midpoints, one before the first class and one after the
last class (the rows with midpoints 0 and 110.5 in Table 3), and set their
frequency to 0. These points are used only to give a closed shape to the
polygon.
Table 3: Corresponding table
Marks (x) Class Int.   Midpoints   Frequency (f)
(closing point)        0           0
0-20                   10          1
21-40                  30.5        5
41-60                  50.5        5
61-80                  70.5        7
81-100                 90.5        2
(closing point)        110.5       0
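The midpoint calculation and the two closing points can be sketched in Python. This is a minimal illustration that takes the class intervals and frequencies from Table 2 and the closing midpoints 0 and 110.5 from Table 3; the names `polygon_x`, `polygon_y` and `points` are assumptions made for the example.

```python
# Class intervals and frequencies taken from the grouped table.
class_intervals = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]
frequencies = [1, 5, 5, 7, 2]

# Midpoint = (upper limit + lower limit) / 2 for each class interval.
midpoints = [(lower + upper) / 2 for lower, upper in class_intervals]

# Add the two closing points with frequency 0, using the values from Table 3.
polygon_x = [0] + midpoints + [110.5]
polygon_y = [0] + frequencies + [0]

# Each (midpoint, frequency) pair is one vertex of the frequency polygon.
points = list(zip(polygon_x, polygon_y))
print(points)
```

Plotting `polygon_x` against `polygon_y` as a line chart gives the frequency polygon described in Steps 3 and 4.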
Step 3: Insert a line chart.
Step 4: Plot the data of midpoints at x-axis and frequency at y-axis.
[Line chart: Frequency (f) on the y-axis against Midpoints (0, 10, 30.5, 50.5, 70.5, 90.5, 110.5) on the x-axis]
Figure 4: Frequency Polygon - Frequency against Midpoint
7. Frequency Curve
Step 1: Create a histogram by plotting midpoints and frequency at x-axis and
y-axis respectively.
[Histogram: Frequency (f) on the y-axis against Midpoints (10, 30.5, 50.5, 70.5, 90.5) on the x-axis]
Figure 5: Histogram of Frequency against Midpoints
Step 2: Insert a polynomial trendline from layout section.
[Histogram of Frequency (f) against Midpoints with a polynomial trendline, Poly. (Frequency (f)), overlaid]
Figure 6: Overlay Graph - Frequency Curve and Histogram
Step 3: Hide the histogram
[Frequency curve: the polynomial trendline, Poly. (Frequency (f)), against Midpoints with the histogram hidden]
Figure 7: Frequency Curve
3. Mean, Mode, and Median
In statistics, there is a concept known as the measure of central tendency. A measure
of central tendency describes the location of the centre of a distribution. The three
measures of central tendency are the mode, the median, and the mean. These measures are
essential for judging the degree of difficulty and the suitability of a test and for
comparing achievement.
Mode
The mode, symbolized Mo, is the most frequent score. The number of scores
on either side of the mode does not have to be equal. The distance of each
score from the mode shows how far the data are from the centre. The mode is the
least stable of the three measures of central tendency. This means that it will probably
vary most from one sample to the next.
Median
The median, symbolized Mdn, is the middle score. It cuts the distribution in half,
so that there are the same number of scores above the median as there are below
it. The median is the 50th percentile. To find the median, the data should be
arranged either from the least to the greatest value or from the greatest to the least.
The median separates the higher half of a data sample from the lower half.
Formula for an odd number of observations: Median = {(n+1)/2}th term.
Formula for an even number of observations: Median = [(n/2)th term + {(n/2)+1}th term]/2.
Mean
The mean, symbolized M, is the average of the given numbers: a calculated
central value of a set of numbers. The formula
for the mean is: Mean = (Sum of all the observations/Total number of observations).
Example:
The English scores of 10 students are as follows.
70, 73, 58, 65, 69, 61, 70, 65, 63, 65
Table 4: Frequency table
Score 58 61 63 65 69 70 73
Frequency 1 1 1 3 1 2 1
1. Find the mode of the data: Mode = the score with the highest frequency = 65.
2. Find the median of the data:
Arrange in ascending order: 58, 61, 63, 65, 65, 65, 69, 70, 70, 73
Even number of observations: Median = [(n/2)th term + {(n/2)+1}th term]/2
= [5th term + 6th term] / 2 = (65 + 65)/2 = 65
3. Find the mean of the data:
Mean = (Sum of all the observations/Total number of observations)
Mean = 659/10 = 65.9
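The same three measures can be checked with Python's standard `statistics` module. This is a small sketch using the ten English scores from the example; the variable names are illustrative.

```python
import statistics

# The ten English scores from the example above.
scores = [70, 73, 58, 65, 69, 61, 70, 65, 63, 65]

mode = statistics.mode(scores)      # the most frequent score
median = statistics.median(scores)  # averages the 5th and 6th sorted scores
mean = statistics.mean(scores)      # sum of the scores / number of scores

print("Mode:", mode)
print("Median:", median)
print("Mean:", mean)
```

Because there is an even number of scores, `statistics.median` averages the two middle values, which is the same rule as the even-number formula given earlier.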
By using the mean, median and mode, teachers can make decisions in assessment. We
can use these central tendency measures to describe students’ achievement and the
assessment. First, teachers need to construct a frequency curve based on the data from
the assessment. There are three types of curves that could be identified.
Normal Distribution Curve
If the graph is normally distributed, it means that the mean = mode =
median. This type of curve suggests that the assessment is fair and balanced. About 68% of the
students’ scores lie within one standard deviation of the mean. The teacher can
use z-scores to further analyse the data.
Positively Skewed Curve
If the graph is positively skewed, it means that the mode < median <
mean. A positively skewed graph means that many students achieved lower grades.
This suggests that the students’ achievement is poor or that the test items are difficult.
Negatively Skewed Curve
If the graph is negatively skewed, it means that the mean < median <
mode. A negatively skewed graph means that most students scored higher than the mean
score. This suggests that the students performed well or that the test is too easy.
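The three orderings above can be turned into a rough rule of thumb. The sketch below is an assumption-laden illustration rather than a formal test of skewness: the function name `describe_shape` is hypothetical, and real score data will rarely give a mean, median and mode that are exactly equal, so a teacher would normally also look at the frequency curve.

```python
def describe_shape(mean, median, mode):
    """Classify a distribution by the ordering of its mean, median and mode."""
    if mean == median == mode:
        return "normal (symmetrical)"
    if mode < median < mean:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    # Any other ordering does not match one of the three patterns above.
    return "no clear pattern"


# Illustrative values only (not taken from the guide).
print(describe_shape(mean=60, median=60, mode=60))
print(describe_shape(mean=55, median=50, mode=45))
print(describe_shape(mean=45, median=50, mode=55))
```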
4. Standard Deviation and Graph Representation
Standard Deviation
A standard deviation (SD) is a quantity derived from the distribution of scores from a
normative sample. The standard deviation is a measure of the average distance (or
deviation) of the scores from the mean.
Standard deviation is used in norm-referenced tests, for example to identify language impairment.
How to calculate standard deviation?
The formula for standard deviation is:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
σ - standard deviation
∑ - "sum of"
x - value in the data set
μ - mean of the data set
N - the number of data points in the population
To find the standard deviation of a set of values:
a) Find the mean of the data
b) Find the difference (deviation) between each of the scores and the mean
c) Square each deviation
d) Sum up the squares
e) Divide this sum by the number of values N to find the mean of the squares (the
variance). For a sample rather than a whole population, divide by N - 1 instead.
f) Find the square root of the variance (the standard deviation)
Example:
Find the variance and the standard deviation of the following scores of an
Exam.
40, 45, 35, 48, 40, 38
Solution:
1. Find the mean of the data
mean = (40 + 45 + 35 + 48 + 40 + 38) / 6 = 246 / 6 = 41
2. Find the difference between each score and mean
Table 5: Difference between score and mean
Score   Score - mean   Difference from mean
40      40-41          -1
45      45-41          4
35      35-41          -6
48      48-41          7
40      40-41          -1
38      38-41          -3
3. Square each of these differences and sum them.
Table 6: Difference Squared
Difference   Difference Squared
-1           1
4            16
-6           36
7            49
-1           1
-3           9
Sum of the squares → 112
The sum of the squares is 112.
4. Find the mean of this sum (the variance)
112/6 = 18.667
5. Find the square root of this variance
√18.667 = 4.3205
The standard deviation of the scores is 4.3205; the variance is 18.667.
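The worked example can be reproduced step by step in Python. This sketch follows the example's own arithmetic, which divides the sum of squares by the number of scores (the population formula); the variable names are illustrative.

```python
import math
import statistics

# The six exam scores from the example above.
scores = [40, 45, 35, 48, 40, 38]

mean = sum(scores) / len(scores)              # step 1: the mean
deviations = [x - mean for x in scores]       # step 2: score - mean
sum_of_squares = sum(d ** 2 for d in deviations)  # step 3: square and sum
variance = sum_of_squares / len(scores)       # step 4: population variance
std_dev = math.sqrt(variance)                 # step 5: square root

print("Mean:", mean)
print("Sum of squares:", sum_of_squares)
print("Variance:", round(variance, 3))
print("Standard deviation:", round(std_dev, 4))

# The standard library provides the same population formulas directly.
print(statistics.pvariance(scores), statistics.pstdev(scores))
```

Note that `statistics.variance` and `statistics.stdev` divide by one less than the number of values (the sample formulas), so they would give slightly larger results than the ones shown in the example.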
Graph Representation
A normal distribution, sometimes called the bell curve, is commonly seen in statistics
as a tool to understand standard deviation. The bell curve is symmetrical. Half of the
data will fall to the left of the mean, half fall to the right.
Normal distributions have three characteristics that are easy to spot in graphs:
● The mean, median and mode are the same.
● The distribution is symmetric about the mean—half the values fall below the
mean and half above the mean.
● The distribution can be described by two values: the mean and the standard
deviation.
Figure 8: Normal distribution graph
The standard deviation controls the spread of the distribution. A smaller standard
deviation indicates that the data is tightly clustered around the mean; the normal
distribution will be taller. A larger standard deviation indicates that the data is spread
out around the mean; the normal distribution will be flatter and wider.
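The effect of the standard deviation on the shape of the curve can be explored with `statistics.NormalDist` from Python's standard library (available from Python 3.8). The sketch below is illustrative: the helper name `share_within` and the example values (a mean of 50 with standard deviations of 5 and 15) are assumptions, not figures from this guide.

```python
from statistics import NormalDist


def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


# Share of scores within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(share_within(k), 4))

# A smaller SD gives a taller, narrower curve: compare the heights at the mean.
narrow_peak = NormalDist(mu=50, sigma=5).pdf(50)
wide_peak = NormalDist(mu=50, sigma=15).pdf(50)
print(narrow_peak > wide_peak)
```

The share within one standard deviation comes out at about 68%, whatever mean and standard deviation are chosen.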
How is standard deviation used to make decisions in assessment?
1. Grading tests. For instance, a class of students took a math test. Their teacher
wants to know whether most students are performing at the same level (a low
standard deviation) or whether there is a high standard deviation.
2. Comparing test scores for different schools. The standard deviation will tell you
how diverse the test scores are for each school.
3. Determining students’ overall achievement. The teacher can use standard deviation
to find out whether the students’ scores are similar or widely spread.
5. Standard Score
Simply explained, a standard score (also known as a z-score) indicates how far
a particular observed value is from the mean. More specifically, it measures
how many standard deviations a raw score lies below or above the
population mean (Glen, 2021). If the standard score is zero (Z = 0), it means
that the observed value falls exactly on the mean. Standard scores typically range from -3
standard deviations up to +3 standard deviations. Negative scores fall to the
left of the mean on the normal distribution curve, while positive scores fall
to the right. The standard normal distribution refers
to a particular normal distribution with a mean of 0 and a standard deviation of 1
(Bhandari, 2020). It is better known as the z-distribution.
Figure 9: Standard Normal Distribution
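Formula:
For a population: $$Z = \frac{x - \mu}{\sigma}$$
For a sample: $$Z = \frac{x - \bar{x}}{s}$$
where x is the raw score, μ and σ are the population mean and standard deviation,
and x̄ and s are the sample mean and standard deviation.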
A standard score can be used for several purposes, including:
1) To compare a student's performance on a test with another
student's.
2) To compare several subjects in a test.
3) To compare grades and marks in a test.
The standard T-score is obtained when the standard z-score is converted and
scaled to have a mean of 50 and a standard deviation of 10 as the basis of its
distribution curve. It is used when:
• The sample size is less than 30
• The population standard deviation is unknown.
Thus, the population's standard deviation must be known and the sample size must be
more than 30 in order to use the standard z-score (Glen, 2021). Otherwise, the
standard T-score should be used.
Formula:
Standard T-score = 50 + 10 Z
The goal of changing the mean and standard deviation to 50 and 10 is to make
comparisons easier by making all the values positive.
Examples of calculating Standard T-score
In the first semester, Alex scored 70 in Mrs. Jenny’s math class. The
average score of the class was 60 and the standard deviation was 15.
For this semester, Alex is in Mr. Smith’s class. He scored 88. The mean
score was 90 and the standard deviation was 4. In which class did Alex perform
better?
First semester class:
$$Z = \frac{x - \bar{x}}{s} = \frac{70 - 60}{15} = 0.667$$
T-score = 50 + 10 (0.667)
= 56.67
Current semester:
$$Z = \frac{x - \bar{x}}{s} = \frac{88 - 90}{4} = -0.5$$
T-score = 50 + 10 (-0.5)
= 45
Based on the T-score, it can be concluded that Alex performed better in
the first semester (56.67) in Mrs. Jenny’s class since the T-score is higher than
in Mr. Smith’s class (45).
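The comparison of Alex's two results can be sketched in Python. The function names `z_score` and `t_score` are illustrative; the scores, means and standard deviations are the ones given in the example.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z


# First semester: score 70, class mean 60, standard deviation 15.
first_z = z_score(70, 60, 15)
first_t = t_score(first_z)

# Current semester: score 88, class mean 90, standard deviation 4.
current_z = z_score(88, 90, 4)
current_t = t_score(current_z)

print("First semester:", round(first_z, 3), round(first_t, 2))
print("Current semester:", current_z, current_t)
print("Better relative performance in first semester:", first_t > current_t)
```

Although the raw score of 88 is higher than 70, the T-scores show that 70 was further above its class mean than 88 was relative to its own class.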
6. Difficulty and Discrimination Index
This chapter explains how to calculate the following indices and how the information
is used to make decisions in assessment.
Item analysis is a technique that evaluates the effectiveness of items in tests. The two
principal measures used in item analysis are item difficulty and item discrimination.
Item Analysis
Item Difficulty (P) or Difficulty Index
Measures the proportion of examinees who responded to an item (question) correctly.
Index ranges between 0.00 and 1.00
High value = item is easy
Low value = item is difficult
P = No. of students who answer correctly / Total students
Item Discrimination (DI) or Discrimination Index
Item discrimination is a measure of how well an item discriminates between
high achievers and low achievers.
Index ranges between -1 and +1
Values close to +1 indicate good discrimination.
Values near 0 indicate poor discrimination.
Values near -1 indicate the item is easier for low-scoring examinees.
DI = (Rh - Rl) / 0.5(Total students)
Rh = number of high achievers who answer correctly
Rl = number of low achievers who answer correctly
Total students = total of both groups
Range            Interpretation   Action
Difficulty Index
P < 0.3          Difficult        Revise / discard item
0.3 < P < 0.8    Moderate         Item accepted
P > 0.8          Easy             Revise / discard item
Discrimination Index
D > 0.4          Excellent        Item accepted
0.2 < D < 0.4    Moderate         Accepted or improved
0 < D < 0.2      Low              Revise
D < 0            Poor             Reject / revise
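The interpretation table can be expressed as two small lookup functions. This is a sketch only: the function names are hypothetical, and because the table does not say how the exact boundary values (0.3, 0.8, 0, 0.2 and 0.4) should be treated, the handling of those values below is an assumption marked in the comments.

```python
def interpret_difficulty(p):
    """Map a difficulty index P to the interpretation in the table above."""
    if p < 0.3:
        return "Difficult"
    if p <= 0.8:  # assumption: the boundary values 0.3 and 0.8 count as moderate
        return "Moderate"
    return "Easy"


def interpret_discrimination(d):
    """Map a discrimination index D to the interpretation in the table above."""
    if d < 0:
        return "Poor"
    if d < 0.2:   # assumption: D = 0 is grouped with "Low"
        return "Low"
    if d <= 0.4:  # assumption: the boundary values 0.2 and 0.4 count as moderate
        return "Moderate"
    return "Excellent"


# Illustrative values only.
print(interpret_difficulty(0.1), interpret_difficulty(0.7), interpret_difficulty(1.0))
print(interpret_discrimination(-0.2), interpret_discrimination(0.6))
```

A school that treats the boundary values differently would only need to change the comparison operators.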
Steps in calculating the item difficulty of a test item
1. Count the total number of students answering each item correctly.
2. For each item, divide the number answering correctly by the total number of
students.
3. This gives you the proportion of students who answered each item correctly.
This figure is called the item's difficulty level.
• Caution: The higher the difficulty level the easier the item, and vice
versa.
4. Use the following formula to calculate the item difficulty of the item:
P = No. of students who answer correctly / Total students
Steps in calculating the item discrimination of a test
item
1. Arrange the students who took the test in descending order according to their
total score on the test.
2. Divide the students into two equal sections, the upper group and the lower group,
based on their scores: the top 50% of the students form the
upper group and the bottom 50% form the lower group.
3. For each item, count the number of students in the upper group who got the
item correct and the number of students in the lower group who got it correct.
4. Use the following formula to calculate the item discrimination of the item:
DI = (Upper group - Lower group) / (50% of total students)
Here, Upper group and Lower group are the numbers of students in each group who
answered the item correctly.
Here is an example:
Ten students have sat for a test. The test consists of 10 multiple-choice
questions. In the table below, the students’ scores have been listed from high
to low (Students 1-5 are in the upper half and Students 6-10 are in the lower half).
An asterisk (*) indicates a correct answer to the question.
Question Numbering   1    2    3    4    5    6    7    8    9    10   Total Score
Answer Key           A    B    C    D    E    A    B    C    D    D    10/10
Student 1            A*   B*   C*   D*   E*   A*   B*   C*   D*   D*   10
Student 2            C    B*   C*   D*   E*   A*   B*   C*   D*   D*   9
Student 3            C    B*   C*   B    E*   A*   B*   C*   D*   D*   8
Student 4            C    B*   E    D*   E*   A*   B*   C*   D*   D*   8
Student 5            B    B*   C*   D*   E*   A*   C    C*   B    D*   7
Student 6            A*   B*   C*   D*   E*   A*   B*   A    A    A    7
Student 7            C    C    C*   B    E*   A*   B*   B    D*   D*   6
Student 8            A*   B*   B    D*   E*   B    C    C*   B    D*   6
Student 9            B    D    C*   A    E*   A*   B*   C*   D*   B    6
Student 10           C    B*   D    D*   E*   B    B*   A    D*   A    5
Difficulty (p) = No. of students who answer correctly / Total students
Discrimination (d) = (Rh - Rl) / 0.5(Total students)
Question      Correct (Upper group)   Correct (Lower group)   Difficulty (p)   Discrimination (d)
Question 1    1                       2                       0.3              -0.2
Question 2    5                       3                       0.8              0.4
Question 3    4                       3                       0.7              0.2
Question 4    4                       3                       0.7              0.2
Question 5    5                       5                       1                0
Question 6    5                       3                       0.8              0.4
Question 7    4                       4                       0.8              0
Question 8    5                       2                       0.7              0.6
Question 9    4                       3                       0.7              0.2
Question 10   5                       2                       0.7              0.6
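The item analysis of this example can be reproduced in Python. The sketch below encodes each student's ten responses from the answer table as a string; the names `answer_key`, `responses` and `results` are illustrative, and the code assumes an even number of students who are already sorted from the highest to the lowest total score.

```python
answer_key = "ABCDEABCDD"

# One string of ten responses per student, sorted from highest to lowest
# total score (Student 1 first), as in the answer table above.
responses = [
    "ABCDEABCDD",  # Student 1
    "CBCDEABCDD",  # Student 2
    "CBCBEABCDD",  # Student 3
    "CBEDEABCDD",  # Student 4
    "BBCDEACCBD",  # Student 5
    "ABCDEABAAA",  # Student 6
    "CCCBEABBDD",  # Student 7
    "ABBDEBCCBD",  # Student 8
    "BDCAEABCDB",  # Student 9
    "CBDDEBBADA",  # Student 10
]

# Split into the upper and lower halves (assumes an even number of students).
half = len(responses) // 2
upper, lower = responses[:half], responses[half:]

results = []
for q, key in enumerate(answer_key):
    rh = sum(1 for r in upper if r[q] == key)  # correct in the upper group
    rl = sum(1 for r in lower if r[q] == key)  # correct in the lower group
    p = (rh + rl) / len(responses)             # difficulty index
    d = (rh - rl) / (0.5 * len(responses))     # discrimination index
    results.append((q + 1, rh, rl, p, d))

for number, rh, rl, p, d in results:
    print(f"Question {number:>2}: Rh={rh} Rl={rl} P={p:.1f} D={d:.1f}")
```

For Question 1, Rh = 1 and Rl = 2, so the formula gives D = (1 - 2) / 5 = -0.2. With an odd number of students, the middle-scoring student would have to be removed before the split, which this sketch does not handle.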
In this example, the students with the high scores are listed first, in
descending order, thus automatically separating the students into high and low
groups. The two groups of students must be equal in size. Therefore, in cases where
there is an odd number of students, the middle-scoring student must be left out.
Next, the questions are checked for their discrimination index. The
number of “lower” students who answered a test item correctly is subtracted
from the number of “upper” students who answered that same test item
correctly, and the result is divided by half of the total number of students. With
discrimination scores, the higher the value, the more discriminating the item,
meaning that more students with high test grades got it correct than students
with low test grades. Discrimination scores for test questions should be 0.20 or
higher. A discrimination score of 0 or close to it means that the item was
answered correctly by about as many low scorers as high scorers, so it does not
separate the two groups. A negative discrimination score means that the item
was answered correctly by more low scorers than high scorers; it signals a
question that is not well constructed, was most likely worded ambiguously, and
should be eliminated.
In closing, you may want to pose these questions to yourself:
• Which question was the easiest? Answer: Question 5
• Which question was the most difficult? Answer: Question 1
• Which question has the poorest discrimination? Answer: Question 1
• Which question(s) should be eliminated? Answer: Question 5 (too easy)
and Question 1 (for its negative discrimination score, not because of its
difficulty index)
7. Reference
Ahmad, S. (2017, November 8). Role of Statistics in Education. SlideShare.
Retrieved July 6, 2021, from
https://www.slideshare.net/SarfrazAhmad2/role-of-statistics-in-education
Bhandari, P. (2020, November 9). The Standard Normal Distribution: Examples,
Explanations, Uses. Scribbr.
https://www.scribbr.com/statistics/standard-normal-distribution/
Glen, S. (2021, June 7). T-Score vs. Z-Score: What's the Difference? Statistics How
To.
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/t-score-vs-z-score/
Glen, S. (2021, May 29). Z-Score: Definition, Formula and Calculation. Statistics
How To. https://www.statisticshowto.com/probability-and-statistics/z-score/
Smart, P., & Smart, P. (2021, March 16). The Importance of Statistics in Education -
Statistical Analysis. Machinep. Retrieved July 6, 2021, from
https://machinep.com/importance-of-statistics-in-education/
Tarmuji, Nor & Syed Wahid, Sharifah Norhuda. (2013). Statistics by Statistician:
Importance in Education. Retrieved July 4, 2021, from
https://www.researchgate.net/publication/305767450_Statistics_by_Statistician_Importance_in_Education