Research Methodology Methods and Techniques SECOND REVISED EDITION


Research Design 33

FEATURES OF A GOOD DESIGN

A good design is often characterised by adjectives like flexible, appropriate, efficient, economical
and so on. Generally, the design which minimises bias and maximises the reliability of the data
collected and analysed is considered a good design. The design which gives the smallest experimental
error is supposed to be the best design in many investigations. Similarly, a design which yields maximal
information and provides an opportunity for considering many different aspects of a problem is
considered the most appropriate and efficient design in respect of many research problems. Thus, the
question of good design is related to the purpose or objective of the research problem and also to
the nature of the problem to be studied. A design may be quite suitable in one case, but may be found
wanting in one respect or the other in the context of some other research problem. One single design
cannot serve the purpose of all types of research problems.

A research design appropriate for a particular research problem usually involves the consideration
of the following factors:

(i) the means of obtaining information;

(ii) the availability and skills of the researcher and his staff, if any;

(iii) the objective of the problem to be studied;

(iv) the nature of the problem to be studied; and

(v) the availability of time and money for the research work.

If the research study happens to be an exploratory or a formulative one, wherein the major
emphasis is on discovery of ideas and insights, the research design most appropriate must be flexible
enough to permit the consideration of many different aspects of a phenomenon. But when the purpose
of a study is accurate description of a situation or of an association between variables (or in what are
called the descriptive studies), accuracy becomes a major consideration and a research design which
minimises bias and maximises the reliability of the evidence collected is considered a good design.
Studies involving the testing of a hypothesis of a causal relationship between variables require a
design which will permit inferences about causality in addition to the minimisation of bias and
maximisation of reliability. But in practice it is the most difficult task to put a particular study in a
particular group, for a given research may have in it elements of two or more of the functions of
different studies. It is only on the basis of its primary function that a study can be categorised either
as an exploratory or descriptive or hypothesis-testing study and accordingly the choice of a research
design may be made in case of a particular study. Besides, the availability of time, money, skills of the
research staff and the means of obtaining the information must be given due weightage while working
out the relevant details of the research design such as experimental design, survey design, sample
design and the like.

IMPORTANT CONCEPTS RELATING TO RESEARCH DESIGN

Before describing the different research designs, it will be appropriate to explain the various concepts
relating to designs so that these may be better and easily understood.

1. Dependent and independent variables: A concept which can take on different quantitative
values is called a variable. As such the concepts like weight, height, income are all examples of
variables. Qualitative phenomena (or the attributes) are also quantified on the basis of the presence
or absence of the concerning attribute(s). Phenomena which can take on quantitatively different
values even in decimal points are called ‘continuous variables’.* But all variables are not continuous.
If they can only be expressed in integer values, they are non-continuous variables or in statistical
language ‘discrete variables’.** Age is an example of continuous variable, but the number of children
is an example of non-continuous variable. If one variable depends upon or is a consequence of the
other variable, it is termed as a dependent variable, and the variable that is antecedent to the dependent
variable is termed as an independent variable. For instance, if we say that height depends upon age,
then height is a dependent variable and age is an independent variable. Further, if in addition to being
dependent upon age, height also depends upon the individual’s sex, then height is a dependent variable
and age and sex are independent variables. Similarly, readymade films and lectures are examples of
independent variables, whereas behavioural changes, occurring as a result of the environmental
manipulations, are examples of dependent variables.
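The height–age–sex illustration above can be expressed as a toy model. The linear form and every coefficient below are invented purely to show the dependent/independent distinction; they are not measurements from the text:

```python
# Toy model: height (dependent variable) as a function of
# age and sex (independent variables).
# The linear form and all coefficients are invented for illustration only.
def predicted_height_cm(age_years: float, is_male: bool) -> float:
    base = 75.0      # assumed height at age 1 (illustrative)
    growth = 5.5     # assumed cm gained per year (illustrative)
    sex_adjust = 4.0 if is_male else 0.0
    return base + growth * (age_years - 1) + sex_adjust

# Height changes when either independent variable changes.
print(predicted_height_cm(10, True))
print(predicted_height_cm(10, False))
```

Changing either input (age or sex) changes the output (height), which is exactly what makes height the dependent variable here.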

2. Extraneous variable: Independent variables that are not related to the purpose of the study, but
may affect the dependent variable are termed as extraneous variables. Suppose the researcher
wants to test the hypothesis that there is a relationship between children’s gains in social studies
achievement and their self-concepts. In this case self-concept is an independent variable and social
studies achievement is a dependent variable. Intelligence may as well affect the social studies
achievement, but since it is not related to the purpose of the study undertaken by the researcher, it
will be termed as an extraneous variable. Whatever effect is noticed on dependent variable as a
result of extraneous variable(s) is technically described as an ‘experimental error’. A study must
always be so designed that the effect upon the dependent variable is attributed entirely to the
independent variable(s), and not to some extraneous variable or variables.

3. Control: One important characteristic of a good research design is to minimise the influence or
effect of extraneous variable(s). The technical term ‘control’ is used when we design the study
minimising the effects of extraneous independent variables. In experimental researches, the term
‘control’ is used to refer to the restraint of experimental conditions.

4. Confounded relationship: When the dependent variable is not free from the influence of
extraneous variable(s), the relationship between the dependent and independent variables is said to
be confounded by an extraneous variable(s).

5. Research hypothesis: When a prediction or a hypothesised relationship is to be tested by scientific
methods, it is termed as research hypothesis. The research hypothesis is a predictive statement that
relates an independent variable to a dependent variable. Usually a research hypothesis must contain,
at least, one independent and one dependent variable. Predictive statements which are not to be
objectively verified or the relationships that are assumed but not to be tested, are not termed research
hypotheses.

6. Experimental and non-experimental hypothesis-testing research: When the purpose of
research is to test a research hypothesis, it is termed as hypothesis-testing research. It can be of the
experimental design or of the non-experimental design. Research in which the independent variable
is manipulated is termed ‘experimental hypothesis-testing research’ and a research in which an
independent variable is not manipulated is called ‘non-experimental hypothesis-testing research’. For
instance, suppose a researcher wants to study whether intelligence affects reading ability for a group

* A continuous variable is that which can assume any numerical value within a specific range.
** A variable for which the individual values fall on the scale only with distinct gaps is called a discrete variable.

of students and for this purpose he randomly selects 50 students and tests their intelligence and
reading ability by calculating the coefficient of correlation between the two sets of scores. This is an
example of non-experimental hypothesis-testing research because herein the independent variable,
intelligence, is not manipulated. But now suppose that our researcher randomly selects 50 students
from a group of students who are to take a course in statistics and then divides them into two groups
by randomly assigning 25 to Group A, the usual studies programme, and 25 to Group B, the special
studies programme. At the end of the course, he administers a test to each group in order to judge the
effectiveness of the training programme on the student’s performance-level. This is an example of
experimental hypothesis-testing research because in this case the independent variable, viz., the type
of training programme, is manipulated.
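The non-experimental illustration above rests on a coefficient of correlation between two sets of scores. A minimal sketch of that computation is given below; the scores are randomly generated stand-ins (the text specifies only that 50 students are tested), and the correlation is computed from its textbook definition:

```python
import random
from statistics import mean, pstdev

# Hypothetical scores for 50 students (illustrative data, not from the text).
random.seed(42)
intelligence = [random.gauss(100, 15) for _ in range(50)]
# Let reading ability depend partly on intelligence, plus noise.
reading = [0.5 * iq + random.gauss(0, 8) for iq in intelligence]

def pearson_r(x, y):
    """Coefficient of correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

r = pearson_r(intelligence, reading)
print(f"correlation between intelligence and reading ability: r = {r:.2f}")
```

Note that nothing is manipulated here: both variables are simply measured, which is what makes this non-experimental hypothesis-testing research.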

7. Experimental and control groups: In an experimental hypothesis-testing research when a
group is exposed to usual conditions, it is termed a ‘control group’, but when the group is exposed to
some novel or special condition, it is termed an ‘experimental group’. In the above illustration, the
Group A can be called a control group and the Group B an experimental group. If both groups A and
B are exposed to special studies programmes, then both groups would be termed ‘experimental
groups.’ It is possible to design studies which include only experimental groups or studies which
include both experimental and control groups.

8. Treatments: The different conditions under which experimental and control groups are put are
usually referred to as ‘treatments’. In the illustration taken above, the two treatments are the usual
studies programme and the special studies programme. Similarly, if we want to determine through an
experiment the comparative impact of three varieties of fertilizers on the yield of wheat, in that case
the three varieties of fertilizers will be treated as three treatments.

9. Experiment: The process of examining the truth of a statistical hypothesis, relating to some
research problem, is known as an experiment. For example, we can conduct an experiment to
examine the usefulness of a certain newly developed drug. Experiments can be of two types viz.,
absolute experiment and comparative experiment. If we want to determine the impact of a fertilizer
on the yield of a crop, it is a case of absolute experiment; but if we want to determine the impact of
one fertilizer as compared to the impact of some other fertilizer, our experiment then will be termed
as a comparative experiment. Often, we undertake comparative experiments when we talk of designs
of experiments.

10. Experimental unit(s): The pre-determined plots or the blocks, where different treatments are
used, are known as experimental units. Such experimental units must be selected (defined) very
carefully.

DIFFERENT RESEARCH DESIGNS

Different research designs can be conveniently described if we categorize them as: (1) research
design in case of exploratory research studies; (2) research design in case of descriptive and diagnostic
research studies, and (3) research design in case of hypothesis-testing research studies.

We take up each category separately.

1. Research design in case of exploratory research studies: Exploratory research studies are
also termed as formulative research studies. The main purpose of such studies is that of formulating
a problem for more precise investigation or of developing the working hypotheses from an operational
point of view. The major emphasis in such studies is on the discovery of ideas and insights. As such
the research design appropriate for such studies must be flexible enough to provide opportunity for
considering different aspects of a problem under study. Inbuilt flexibility in research design is needed
because the research problem, broadly defined initially, is transformed into one with more precise
meaning in exploratory studies, which fact may necessitate changes in the research procedure for
gathering relevant data. Generally, the following three methods in the context of research design for
such studies are talked about: (a) the survey of concerning literature; (b) the experience survey and
(c) the analysis of ‘insight-stimulating’ examples.

The survey of concerning literature happens to be the most simple and fruitful method of
formulating precisely the research problem or developing hypothesis. Hypotheses stated by earlier
workers may be reviewed and their usefulness be evaluated as a basis for further research. It may
also be considered whether the already stated hypotheses suggest new hypotheses. In this way the
researcher should review and build upon the work already done by others, but in cases where
hypotheses have not yet been formulated, his task is to review the available material for deriving the
relevant hypotheses from it.

Besides, the bibliographical survey of studies already made in one’s area of interest may as well
be made by the researcher for precisely formulating the problem. He should also make an attempt to
apply concepts and theories developed in different research contexts to the area in which he is
himself working. Sometimes the works of creative writers also provide a fertile ground for hypothesis-
formulation and as such may be looked into by the researcher.

Experience survey means the survey of people who have had practical experience with the
problem to be studied. The object of such a survey is to obtain insight into the relationships between
variables and new ideas relating to the research problem. For such a survey people who are competent
and can contribute new ideas may be carefully selected as respondents to ensure a representation of
different types of experience. The respondents so selected may then be interviewed by the investigator.
The researcher must prepare an interview schedule for the systematic questioning of informants.
But the interview must ensure flexibility in the sense that the respondents should be allowed to raise
issues and questions which the investigator has not previously considered. Generally, the
experience-collecting interview is likely to be long and may last for a few hours. Hence, it is often considered
desirable to send a copy of the questions to be discussed to the respondents well in advance. This will
also give an opportunity to the respondents for doing some advance thinking over the various issues
involved so that, at the time of interview, they may be able to contribute effectively. Thus, an experience
survey may enable the researcher to define the problem more concisely and help in the formulation
of the research hypothesis. This survey may as well provide information about the practical possibilities
for doing different types of research.

Analysis of ‘insight-stimulating’ examples is also a fruitful method for suggesting hypotheses
for research. It is particularly suitable in areas where there is little experience to serve as a guide.
This method consists of the intensive study of selected instances of the phenomenon in which one is
interested. For this purpose the existing records, if any, may be examined, the unstructured interviewing
may take place, or some other approach may be adopted. Attitude of the investigator, the intensity of
the study and the ability of the researcher to draw together diverse information into a unified
interpretation are the main features which make this method an appropriate procedure for evoking
insights.

Now, what sort of examples are to be selected and studied? There is no clear cut answer to it.
Experience indicates that for particular problems certain types of instances are more appropriate
than others. One can mention a few examples of ‘insight-stimulating’ cases such as the reactions of
strangers, the reactions of marginal individuals, the study of individuals who are in transition from one
stage to another, the reactions of individuals from different social strata and the like. In general,
cases that provide sharp contrasts or have striking features are considered relatively more useful
while adopting this method of hypotheses formulation.

Thus, in an exploratory or formulative research study which merely leads to insights or hypotheses,
whatever method or research design outlined above is adopted, the only thing essential is that it must
continue to remain flexible so that many different facets of a problem may be considered as and
when they arise and come to the notice of the researcher.

2. Research design in case of descriptive and diagnostic research studies: Descriptive research
studies are those studies which are concerned with describing the characteristics of a particular
individual, or of a group, whereas diagnostic research studies determine the frequency with which
something occurs or its association with something else. The studies concerning whether certain
variables are associated are examples of diagnostic research studies. As against this, studies concerned
with specific predictions, with narration of facts and characteristics concerning individual, group or
situation are all examples of descriptive research studies. Most of the social research comes under
this category. From the point of view of the research design, the descriptive as well as diagnostic
studies share common requirements and as such we may group together these two types of research
studies. In descriptive as well as in diagnostic studies, the researcher must be able to define clearly,
what he wants to measure and must find adequate methods for measuring it along with a clear cut
definition of the ‘population’ he wants to study. Since the aim is to obtain complete and accurate information
in the said studies, the procedure to be used must be carefully planned. The research design must
make enough provision for protection against bias and must maximise reliability, with due concern for
the economical completion of the research study. The design in such studies must be rigid and not
flexible and must focus attention on the following:

(a) Formulating the objective of the study (what the study is about and why is it being made?)

(b) Designing the methods of data collection (what techniques of gathering data will be adopted?)

(c) Selecting the sample (how much material will be needed?)

(d) Collecting the data (where can the required data be found and with what time period should
the data be related?)

(e) Processing and analysing the data.

(f) Reporting the findings.

In a descriptive/diagnostic study the first step is to specify the objectives with sufficient precision
to ensure that the data collected are relevant. If this is not done carefully, the study may not provide
the desired information.

Then comes the question of selecting the methods by which the data are to be obtained. In other
words, techniques for collecting the information must be devised. Several methods (viz., observation,
questionnaires, interviewing, examination of records, etc.), with their merits and limitations, are available
for the purpose and the researcher may use one or more of these methods which have been discussed
in detail in later chapters. While designing data-collection procedure, adequate safeguards against
bias and unreliability must be ensured. Whichever method is selected, questions must be well examined
and be made unambiguous; interviewers must be instructed not to express their own opinion; observers
must be trained so that they uniformly record a given item of behaviour. It is always desirable to pre-
test the data collection instruments before they are finally used for the study purposes. In other
words, we can say that “structured instruments” are used in such studies.

In most of the descriptive/diagnostic studies the researcher takes out sample(s) and then wishes
to make statements about the population on the basis of the sample analysis or analyses. More often
than not, a sample has to be designed. Different sample designs have been discussed in detail in a
separate chapter in this book. Here we may only mention that the problem of designing samples
should be tackled in such a fashion that the samples may yield accurate information with a minimum
amount of research effort. Usually one or more forms of probability sampling, or what is often
described as random sampling, are used.
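A minimal sketch of simple random (probability) sampling is given below. The population of 500 numbered students and the sample size of 50 are invented details, used only to show that every member has an equal chance of selection:

```python
import random

random.seed(0)
# Hypothetical sampling frame: a numbered population of 500 students.
population = list(range(1, 501))

# Simple random sampling without replacement: each member has an
# equal chance of selection, which is what justifies generalising
# from the sample to the population.
sample = random.sample(population, 50)
print(sorted(sample)[:10])
```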

To obtain data free from errors introduced by those responsible for collecting them, it is necessary
to supervise closely the staff of field workers as they collect and record information. Checks may be
set up to ensure that the data collecting staff perform their duty honestly and without prejudice. “As
data are collected, they should be examined for completeness, comprehensibility, consistency and
reliability.”2

The data collected must be processed and analysed. This includes steps like coding the interview
replies, observations, etc.; tabulating the data; and performing several statistical computations. To
the extent possible, the processing and analysing procedure should be planned in detail before actual
work is started. This will prove economical in the sense that the researcher may avoid unnecessary
labour such as preparing tables for which he later finds he has no use or on the other hand, re-doing
some tables because he failed to include relevant data. Coding should be done carefully to avoid
error in coding and for this purpose the reliability of coders needs to be checked. Similarly, the
accuracy of tabulation may be checked by having a sample of the tables re-done. In case of mechanical
tabulation the material (i.e., the collected data or information) must be entered on appropriate cards
which is usually done by punching holes corresponding to a given code. The accuracy of punching is
to be checked and ensured. Finally, statistical computations are needed and as such averages,
percentages and various coefficients must be worked out. Probability and sampling analysis may as
well be used. The appropriate statistical operations, along with the use of appropriate tests of
significance should be carried out to safeguard the drawing of conclusions concerning the study.
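The computations just described (averages, percentages and a test of significance) can be illustrated with invented data. The two groups of coded scores below are assumptions for illustration only, and the pooled-variance two-sample t statistic stands in for "appropriate tests of significance":

```python
from statistics import mean, stdev

# Hypothetical coded responses from two sample groups (illustrative only).
group_a = [62, 71, 68, 75, 66, 70, 73, 69, 64, 72]
group_b = [58, 65, 60, 63, 59, 66, 61, 64, 57, 62]

# Averages and a percentage, as mentioned in the text.
avg_a, avg_b = mean(group_a), mean(group_b)
pct_above_65 = 100 * sum(1 for x in group_a if x > 65) / len(group_a)

# Two-sample t statistic with pooled variance, a simple test of significance.
na, nb = len(group_a), len(group_b)
sp2 = ((na - 1) * stdev(group_a) ** 2
       + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
t = (avg_a - avg_b) / (sp2 * (1 / na + 1 / nb)) ** 0.5

print(f"mean A = {avg_a:.1f}, mean B = {avg_b:.1f}")
print(f"% of group A scoring above 65 = {pct_above_65:.0f}%")
print(f"t statistic (df = {na + nb - 2}) = {t:.2f}")
```

The computed t would then be compared against a table value at the chosen level of significance before drawing conclusions.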

Last of all comes the question of reporting the findings. This is the task of communicating the
findings to others and the researcher must do it in an efficient manner. The layout of the report needs
to be well planned so that all things relating to the research study may be well presented in simple and
effective style.

Thus, the research design in case of descriptive/diagnostic studies is a comparative design throwing
light on all points narrated above and must be prepared keeping in view the objective(s) of the study
and the resources available. However, it must ensure the minimisation of bias and maximisation of
reliability of the evidence collected. The said design can be appropriately referred to as a survey
design since it takes into account all the steps involved in a survey concerning a phenomenon to be
studied.

2 Claire Selltiz et al., op. cit., p. 74.

The difference between research designs in respect of the above two types of research studies
can be conveniently summarised in tabular form as under:

Table 3.1

Research Design            Type of study
                           ---------------------------------------------------------------------
                           Exploratory or Formulative         Descriptive/Diagnostic

Overall design             Flexible design (design must       Rigid design (design must make
                           provide opportunity for            enough provision for protection
                           considering different aspects      against bias and must maximise
                           of the problem)                    reliability)

(i) Sampling design        Non-probability sampling design    Probability sampling design
                           (purposive or judgement sampling)  (random sampling)

(ii) Statistical design    No pre-planned design for          Pre-planned design for analysis
                           analysis

(iii) Observational        Unstructured instruments for       Structured or well thought out
      design               collection of data                 instruments for collection of data

(iv) Operational design    No fixed decisions about the       Advanced decisions about
                           operational procedures             operational procedures

3. Research design in case of hypothesis-testing research studies: Hypothesis-testing research
studies (generally known as experimental studies) are those where the researcher tests the hypotheses
of causal relationships between variables. Such studies require procedures that will not only reduce
bias and increase reliability, but will permit drawing inferences about causality. Usually experiments
meet this requirement. Hence, when we talk of research design in such studies, we often mean the
design of experiments.

Professor R.A. Fisher’s name is associated with experimental designs. Beginning of such designs
was made by him when he was working at Rothamsted Experimental Station (Centre for Agricultural
Research in England). As such the study of experimental designs has its origin in agricultural research.
Professor Fisher found that by dividing agricultural fields or plots into different blocks and then by
conducting experiments in each of these blocks, the information collected and the inferences
drawn from it happen to be more reliable. This fact inspired him to develop certain experimental
designs for testing hypotheses concerning scientific investigations. Today, the experimental designs
are being used in researches relating to phenomena of several disciplines. Since experimental designs
originated in the context of agricultural operations, we still use, though in a technical sense, several
terms of agriculture (such as treatment, yield, plot, block etc.) in experimental designs.

BASIC PRINCIPLES OF EXPERIMENTAL DESIGNS

Professor Fisher has enumerated three principles of experimental designs: (1) the Principle of
Replication; (2) the Principle of Randomization; and (3) the Principle of Local Control.

According to the Principle of Replication, the experiment should be repeated more than once.
Thus, each treatment is applied in many experimental units instead of one. By doing so the statistical
accuracy of the experiments is increased. For example, suppose we are to examine the effect of two
varieties of rice. For this purpose we may divide the field into two parts and grow one variety in one
part and the other variety in the other part. We can then compare the yield of the two parts and draw
conclusion on that basis. But if we are to apply the principle of replication to this experiment, then we
first divide the field into several parts, grow one variety in half of these parts and the other variety in
the remaining parts. We can then collect the data of yield of the two varieties and draw conclusion by
comparing the same. The result so obtained will be more reliable in comparison to the conclusion we
draw without applying the principle of replication. The entire experiment can even be repeated
several times for better results. Conceptually replication does not present any difficulty, but
computationally it does. For example, if an experiment requiring a two-way analysis of variance is
replicated, it will then require a three-way analysis of variance since replication itself may be a
source of variation in the data. However, it should be remembered that replication is introduced in
order to increase the precision of a study; that is to say, to increase the accuracy with which the main
effects and interactions can be estimated.
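A small simulation can illustrate why replication increases precision. The yield figures, plot counts and the assumed true difference between the two rice varieties below are all invented; the point is only that the estimate of the variety difference varies much less when each treatment is applied to many plots instead of one:

```python
import random
from statistics import mean, stdev

random.seed(1)
TRUE_DIFF = 5.0  # assumed true yield advantage of variety A (illustrative)

def estimate_diff(plots_per_variety):
    """Estimate the A - B yield difference from a replicated trial."""
    a = [random.gauss(50 + TRUE_DIFF, 10) for _ in range(plots_per_variety)]
    b = [random.gauss(50, 10) for _ in range(plots_per_variety)]
    return mean(a) - mean(b)

# Repeat the whole experiment many times to see how variable each
# style of estimate is from one experiment to the next.
single = [estimate_diff(1) for _ in range(2000)]       # one plot per variety
replicated = [estimate_diff(10) for _ in range(2000)]  # ten plots per variety

print(f"spread of estimates with 1 plot each : {stdev(single):.1f}")
print(f"spread of estimates with 10 plots each: {stdev(replicated):.1f}")
```

The replicated estimate clusters much more tightly around the true difference, which is exactly the gain in statistical accuracy the principle describes.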

The Principle of Randomization provides protection, when we conduct an experiment, against
the effect of extraneous factors by randomization. In other words, this principle indicates that we
should design or plan the experiment in such a way that the variations caused by extraneous factors
can all be combined under the general heading of “chance.” For instance, if we grow one variety of
rice, say, in the first half of the parts of a field and the other variety is grown in the other half, then it
is just possible that the soil fertility may be different in the first half in comparison to the other half. If
this is so, our results would not be realistic. In such a situation, we may assign the variety of rice to
be grown in different parts of the field on the basis of some random sampling technique i.e., we may
apply randomization principle and protect ourselves against the effects of the extraneous factors (soil
fertility differences in the given case). As such, through the application of the principle of randomization,
we can have a better estimate of the experimental error.
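The random assignment of the two rice varieties to parts of the field, as described above, might be sketched as follows. The number of field parts is an invented detail; what matters is that the treatment labels are shuffled before assignment:

```python
import random

random.seed(7)
parts = list(range(1, 11))           # field divided into 10 parts (illustrative)
varieties = ["A"] * 5 + ["B"] * 5    # five parts for each rice variety

# Randomization: shuffle the treatment labels before assigning them to
# parts, so that any soil-fertility gradient across the field is spread
# over both varieties by chance alone.
random.shuffle(varieties)
assignment = dict(zip(parts, varieties))
for part, variety in assignment.items():
    print(f"part {part:2d} -> variety {variety}")
```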

The Principle of Local Control is another important principle of experimental designs. Under it
the extraneous factor, the known source of variability, is made to vary deliberately over as wide a
range as necessary and this needs to be done in such a way that the variability it causes can be
measured and hence eliminated from the experimental error. This means that we should plan the
experiment in a manner that we can perform a two-way analysis of variance, in which the total
variability of the data is divided into three components attributed to treatments (varieties of rice in our
case), the extraneous factor (soil fertility in our case) and experimental error.* In other words,
according to the principle of local control, we first divide the field into several homogeneous parts,
known as blocks, and then each such block is divided into parts equal to the number of treatments.
Then the treatments are randomly assigned to these parts of a block. Dividing the field into several
homogeneous parts is known as ‘blocking’. In general, blocks are the levels at which we hold an
extraneous factor fixed, so that we can measure its contribution to the total variability of the data by
means of a two-way analysis of variance. In brief, through the principle of local control we can
eliminate the variability due to extraneous factor(s) from the experimental error.
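The blocking-and-randomising procedure just described can be sketched as follows. The three fertilizer treatments echo the earlier wheat example; the number of blocks is an invented detail, and the layout itself is illustrative:

```python
import random

random.seed(3)
treatments = ["fertilizer 1", "fertilizer 2", "fertilizer 3"]
n_blocks = 4  # four homogeneous blocks of the field (illustrative)

# Local control: each block is divided into as many parts as there are
# treatments, every treatment appears once per block, and the assignment
# of treatments to parts within a block is randomized.
layout = {}
for block in range(1, n_blocks + 1):
    order = treatments[:]
    random.shuffle(order)
    layout[block] = order

for block, plots in layout.items():
    print(f"block {block}: {plots}")
```

Because every treatment occurs in every block, block-to-block differences (such as soil fertility) can be separated from the treatment effects in a two-way analysis of variance.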

* See Chapter Analysis of Variance for details.

Important Experimental Designs

Experimental design refers to the framework or structure of an experiment and as such there are
several experimental designs. We can classify experimental designs into two broad categories, viz.,
informal experimental designs and formal experimental designs. Informal experimental designs are
those designs that normally use a less sophisticated form of analysis based on differences in magnitudes,
whereas formal experimental designs offer relatively more control and use precise statistical
procedures for analysis. Important experimental designs are as follows:

(a) Informal experimental designs:
(i) Before-and-after without control design.
(ii) After-only with control design.
(iii) Before-and-after with control design.

(b) Formal experimental designs:
(i) Completely randomized design (C.R. Design).
(ii) Randomized block design (R.B. Design).
(iii) Latin square design (L.S. Design).
(iv) Factorial designs.

We may briefly deal with each of the above stated informal as well as formal experimental designs.

1. Before-and-after without control design: In such a design a single test group or area is
selected and the dependent variable is measured before the introduction of the treatment. The treatment
is then introduced and the dependent variable is measured again after the treatment has been
introduced. The effect of the treatment would be equal to the level of the phenomenon after the
treatment minus the level of the phenomenon before the treatment. The design can be represented thus:

Test area:  Level of phenomenon before treatment (X) → Treatment introduced → Level of phenomenon after treatment (Y)

Treatment Effect = (Y) – (X)

Fig. 3.1

The main difficulty of such a design is that, with the passage of time, considerable extraneous
variation may enter into the treatment effect.

2. After-only with control design: In this design two groups or areas (test area and control area)
are selected and the treatment is introduced into the test area only. The dependent variable is then
measured in both the areas at the same time. Treatment impact is assessed by subtracting the value
of the dependent variable in the control area from its value in the test area. This can be exhibited in
the following form:

Test area:      Treatment introduced       Level of phenomenon after
                                           treatment (Y)

Control area:                              Level of phenomenon without
                                           treatment (Z)

Treatment Effect = (Y) – (Z)

Fig. 3.2

The basic assumption in such a design is that the two areas are identical with respect to their
behaviour towards the phenomenon considered. If this assumption is not true, there is the possibility
of extraneous variation entering into the treatment effect. However, data can be collected in such a
design without the introduction of problems with the passage of time. In this respect the design is
superior to before-and-after without control design.
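A sketch of the corresponding computation (figures invented); note that the subtraction is valid only under the identical-areas assumption stated above:

```python
def after_only_with_control(test_after, control_after):
    """Treatment effect in an after-only with control design: level of the
    phenomenon in the test area (Y) minus that in the control area (Z).
    Assumes the two areas behave identically apart from the treatment."""
    return test_after - control_after

# Hypothetical example: test area measures 150, control area 130.
print(after_only_with_control(test_after=150, control_after=130))  # 20
```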

3. Before-and-after with control design: In this design two areas are selected and the dependent
variable is measured in both the areas for an identical time-period before the treatment. The treatment
is then introduced into the test area only, and the dependent variable is measured in both for an
identical time-period after the introduction of the treatment. The treatment effect is determined by
subtracting the change in the dependent variable in the control area from the change in the dependent
variable in test area. This design can be shown in this way:

Time Period I Time Period II

Test area: Level of phenomenon Treatment Level of phenomenon
before treatment (X) introduced after treatment (Y)

Control area: Level of phenomenon Level of phenomenon
without treatment without treatment

(A) (Z)

Treatment Effect = (Y – X) – (Z – A)

Fig. 3.3

This design is superior to the above two designs for the simple reason that it avoids extraneous
variation resulting both from the passage of time and from non-comparability of the test and control
areas. But at times, due to lack of historical data, time or a comparable control area, we may have
to select one of the first two informal designs stated above.
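Combining the pieces, a sketch of the before-and-after with control computation (figures invented) shows how the control area's own drift is netted out:

```python
def before_after_with_control(test_before, test_after,
                              control_before, control_after):
    """Treatment effect = (Y - X) - (Z - A): the change in the test area
    minus the change that occurred anyway in the control area."""
    return (test_after - test_before) - (control_after - control_before)

# Hypothetical example: the test area rises 120 -> 150 while the control
# area drifts 100 -> 105 on its own, so only 25 points are credited
# to the treatment.
print(before_after_with_control(120, 150, 100, 105))  # 25
```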

4. Completely randomized design (C.R. design): This design involves only two principles, viz., the
principle of replication and the principle of randomization of experimental designs. It is the simplest possible
design and its procedure of analysis is also easier. The essential characteristic of the design is that
subjects are randomly assigned to experimental treatments (or vice-versa). For instance, if we have
10 subjects and if we wish to test 5 under treatment A and 5 under treatment B, the randomization
process gives every possible group of 5 subjects selected from a set of 10 an equal opportunity of
being assigned to treatment A and treatment B. One-way analysis of variance (or one-way ANOVA)*
is used to analyse such a design. Even unequal replications can work in this design, and it provides
the maximum number of degrees of freedom for the error term. Such a design is generally used when
experimental areas happen to be homogeneous. Technically, when all the variations due to uncontrolled
extraneous factors are included under the heading of chance variation, we refer to the design of
experiment as C.R. design.

* See Chapter 11 for one-way ANOVA technique.
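A sketch of the randomization step for the 10-subject example above, using Python's standard library (random.sample draws a random permutation, so every possible group of 5 is equally likely to receive either treatment):

```python
import random

def completely_randomized(subjects, n_treatment_a):
    """Randomly split subjects into a treatment-A group of the given size
    and a treatment-B group of the remainder (C.R. design assignment)."""
    shuffled = random.sample(subjects, len(subjects))  # random permutation
    return shuffled[:n_treatment_a], shuffled[n_treatment_a:]

subjects = list(range(1, 11))                 # ten hypothetical subjects
group_a, group_b = completely_randomized(subjects, 5)
print(len(group_a), len(group_b))             # 5 5
print(sorted(group_a + group_b) == subjects)  # True
```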

We can present a brief description of the two forms of such a design as given in Figs. 3.4 and 3.5.

(i) Two-group simple randomized design: In a two-group simple randomized design, first
of all the population is defined and then from the population a sample is selected randomly.
Further, the requirement of this design is that items, after being selected randomly from the
population, be randomly assigned to the experimental and control groups (such random
assignment of items to two groups is technically described as the principle of randomization).
Thus, this design yields two groups as representatives of the population. In a diagram form
this design can be shown in this way:

Population --(randomly selected)--> Sample --(randomly assigned)-->

        Experimental group --> Treatment A  \
                                             >  Independent variable
        Control group      --> Treatment B  /

Fig. 3.4: Two-group simple randomized experimental design (in diagram form)

Since in the simple randomized design the elements constituting the sample are randomly
drawn from the same population and randomly assigned to the experimental and control
groups, it becomes possible to draw conclusions on the basis of the samples that are
applicable to the population. The two groups (experimental and control groups) of such a design are given
different treatments of the independent variable. This design of experiment is quite common
in research studies concerning behavioural sciences. The merit of such a design is that it is
simple and randomizes the differences among the sample items. But the limitation of it is
that the individual differences among those conducting the treatments are not eliminated,
i.e., it does not control the extraneous variable and as such the result of the experiment may
not depict a correct picture. This can be illustrated by taking an example. Suppose the
researcher wants to compare two groups of students who have been randomly selected
and randomly assigned. Two different treatments viz., the usual training and the specialised
training are being given to the two groups. The researcher hypothesises greater gains for
the group receiving specialised training. To determine this, he tests each group before and
after the training, and then compares the amount of gain for the two groups to accept or
reject his hypothesis. This is an illustration of the two-groups randomized design, wherein
individual differences among students are being randomized. But this does not control the
differential effects of the extraneous independent variables (in this case, the individual
differences among those conducting the training programme).

Population                                   Population
(available for study)                        (available to conduct treatments)
      |  random selection                          |  random selection
      v                                            v
Sample                                       Sample
(to be studied)                              (to conduct treatments)
      |  random assignment                         |  random assignment
      v                                            v
Group 1   E
Group 2   E
Group 3   E
Group 4   E                E = Experimental group
Group 5   C                C = Control group
Group 6   C
Group 7   C
Group 8   C

          Treatment A              Treatment B
          (Independent variable or causal variable)

Fig. 3.5: Random replication design (in diagram form)

(ii) Random replications design: The limitation of the two-group randomized design is usually
eliminated in the random replications design. In the illustration just cited above, the
teacher differences on the dependent variable were ignored, i.e., the extraneous variable
was not controlled. But in a random replications design, the effects of such differences are
minimised (or reduced) by providing a number of repetitions for each treatment. Each
repetition is technically called a ‘replication’. Random replication design serves two purposes
viz., it provides controls for the differential effects of the extraneous independent variables
and secondly, it randomizes any individual differences among those conducting the treatments.
Diagrammatically we can illustrate the random replications design thus: (Fig. 3.5)

From the diagram it is clear that there are two populations in the replication design. The
sample is taken randomly from the population available for study and is randomly assigned
to, say, four experimental and four control groups. Similarly, a sample is taken randomly from
the population available to conduct the experiments (since there are eight groups, eight such
individuals are selected) and the eight individuals so selected are randomly assigned to
the eight groups. Generally, an equal number of items is put in each group so that the size of
the group is not likely to affect the result of the study. Variables relating to both population
characteristics are assumed to be randomly distributed among the two groups. Thus, this
random replication design is, in fact, an extension of the two-group simple randomized
design.
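The double randomization described above can be sketched as follows (the group sizes and trainer names are illustrative assumptions, not from the text):

```python
import random

def random_replications(study_sample, trainers):
    """Sketch of the random replications design: split the study sample into
    eight equal groups, label four E (experimental) and four C (control) at
    random, and randomly assign one trainer to each group."""
    n_groups = 8
    shuffled = random.sample(study_sample, len(study_sample))
    size = len(shuffled) // n_groups
    groups = [shuffled[i * size:(i + 1) * size] for i in range(n_groups)]
    labels = ['E'] * 4 + ['C'] * 4
    random.shuffle(labels)                     # which groups are E or C
    staff = random.sample(trainers, n_groups)  # one trainer per group
    return list(zip(labels, staff, groups))

design = random_replications(list(range(40)), ["T%d" % i for i in range(1, 9)])
labels = [label for label, _, _ in design]
print(labels.count('E'), labels.count('C'))  # 4 4
```

Both the subjects and those conducting the treatments are thus randomized, which is precisely the control the two-group design lacked.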

5. Randomized block design (R.B. design) is an improvement over the C.R. design. In the R.B.
design the principle of local control can be applied along with the other two principles of experimental
designs. In the R.B. design, subjects are first divided into groups, known as blocks, such that within
each group the subjects are relatively homogeneous in respect to some selected variable. The variable
selected for grouping the subjects is one that is believed to be related to the measures to be obtained
in respect of the dependent variable. The number of subjects in a given block would be equal to the
number of treatments and one subject in each block would be randomly assigned to each treatment.
In general, blocks are the levels at which we hold the extraneous factor fixed, so that its contribution
to the total variability of data can be measured. The main feature of the R.B. design is that
each treatment appears the same number of times in each block. The R.B. design is analysed by the
two-way analysis of variance (two-way ANOVA)* technique.

Let us illustrate the R.B. design with the help of an example. Suppose four different forms of a
standardised test in statistics were given to each of five students (selected one from each of the five
I.Q. blocks) and following are the scores which they obtained.

                Very low    Low       Average    High      Very high
                I.Q.        I.Q.      I.Q.       I.Q.      I.Q.

                Student     Student   Student    Student   Student
                A           B         C          D         E

Form 1          82          67        57         71        73
Form 2          90          68        54         70        81
Form 3          86          73        51         69        84
Form 4          93          77        60         65        71

Fig. 3.6

If each student separately randomized the order in which he or she took the four tests (by using
random numbers or some similar device), we refer to the design of this experiment as a R.B. design.
The purpose of this randomization is to take care of such possible extraneous factors (say as fatigue)
or perhaps the experience gained from repeatedly taking the test.
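For the R.B. illustration above, the two-way partition of variation (treatments = forms, blocks = I.Q. levels) can be sketched in plain Python; the score table is read as four form-rows of five block-scores each, one plausible reading of Fig. 3.6:

```python
# Scores from Fig. 3.6: rows = test forms (treatments), columns = I.Q. blocks.
scores = [
    [82, 67, 57, 71, 73],   # Form 1
    [90, 68, 54, 70, 81],   # Form 2
    [86, 73, 51, 69, 84],   # Form 3
    [93, 77, 60, 65, 71],   # Form 4
]

r, c = len(scores), len(scores[0])      # 4 treatments, 5 blocks
total = sum(sum(row) for row in scores)
correction = total ** 2 / (r * c)       # correction factor T^2 / N

ss_total = sum(x * x for row in scores for x in row) - correction
ss_forms = sum(sum(row) ** 2 for row in scores) / c - correction
col_totals = [sum(row[j] for row in scores) for j in range(c)]
ss_blocks = sum(t * t for t in col_totals) / r - correction
ss_error = ss_total - ss_forms - ss_blocks

print(round(ss_forms, 1), round(ss_blocks, 1), round(ss_error, 1))
# 30.6 2235.8 285.4
```

Blocking on I.Q. pulls most of the variation (2235.8) out of the error term, which is exactly what the R.B. design is for.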

* See Chapter 11 for the two-way ANOVA technique.

6. Latin square design (L.S. design) is an experimental design very frequently used in agricultural
research. The conditions under which agricultural investigations are carried out are different from
those in other studies, for nature plays an important role in agriculture. For instance, suppose an
experiment is to be conducted to judge the effects of five different varieties of fertilizers on the yield
of a certain crop, say wheat. In such a case the varying fertility of the soil in different
blocks in which the experiment has to be performed must be taken into consideration; otherwise the
results obtained may not be very dependable because the output happens to be the effect not only of
fertilizers, but it may also be the effect of fertility of soil. Similarly, there may be impact of varying
seeds on the yield. To overcome such difficulties, the L.S. design is used when there are two major
extraneous factors such as the varying soil fertility and varying seeds.

The Latin-square design is one wherein each fertilizer, in our example, appears five times but is
used only once in each row and in each column of the design. In other words, the treatments in a L.S.
design are so allocated among the plots that no treatment occurs more than once in any one row or
any one column. The two blocking factors may be represented through rows and columns (one
through rows and the other through columns). The following is a diagrammatic form of such a design
in respect of, say, five types of fertilizers, viz., A, B, C, D and E and the two blocking factors, viz., the
varying soil fertility and the varying seeds:

FERTILITY LEVEL
I II III IV V

Seeds differences X1 A B C D E
X2 B C D E A
X3 C D E A B
X4 D E A B C
X5 E A B C D

Fig. 3.7

The above diagram clearly shows that in a L.S. design the field is divided into as many blocks as
there are varieties of fertilizers and then each block is again divided into as many parts as there are
varieties of fertilizers in such a way that each of the fertilizer variety is used in each of the block
(whether column-wise or row-wise) only once. The analysis of the L.S. design is very similar to the
two-way ANOVA technique.

The merit of this experimental design is that it enables differences in fertility gradients in the field
to be eliminated when comparing the effects of different varieties of fertilizers on the yield of the
crop. But this design suffers from one limitation, and it is that although each row and each column
represents equally all fertilizer varieties, there may be considerable difference in the row and column
means both up and across the field. This, in other words, means that in L.S. design we must assume
that there is no interaction between treatments and blocking factors. This defect can, however, be
removed by taking the means of rows and columns equal to the field mean by adjusting the results.
Another limitation of this design is that it requires number of rows, columns and treatments to be
equal. This reduces the utility of this design. In case of (2 × 2) L.S. design, there are no degrees of
freedom available for the mean square error and hence the design cannot be used. If treatments are
10 or more, then each row and each column will be large in size, so that rows and columns may not
be homogeneous. This may make the application of the principle of local control ineffective. Therefore,
L.S. designs of orders (5 × 5) to (9 × 9) are generally used.
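A minimal sketch of constructing such a square by cyclic rotation (one standard construction; a real experiment would additionally randomize rows, columns and treatment labels):

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated i places,
    so each treatment occurs exactly once in every row and every column."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]

square = latin_square(["A", "B", "C", "D", "E"])
for row in square:
    print(" ".join(row))

# Each row and each column contains all five fertilizers exactly once.
n = len(square)
assert all(len(set(row)) == n for row in square)
assert all(len({square[i][j] for i in range(n)}) == n for j in range(n))
```

The first two rows come out as A B C D E and B C D E A, matching the pattern of Fig. 3.7.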

7. Factorial designs: Factorial designs are used in experiments where the effects of varying more
than one factor are to be determined. They are specially important in several economic and social
phenomena where usually a large number of factors affect a particular problem. Factorial designs
can be of two types: (i) simple factorial designs and (ii) complex factorial designs. We take them up
separately.

(i) Simple factorial designs: In case of simple factorial designs, we consider the effects of
varying two factors on the dependent variable, but when an experiment is done with more
than two factors, we use complex factorial designs. Simple factorial design is also termed
as a ‘two-factor-factorial design’, whereas complex factorial design is known as ‘multi-
factor-factorial design.’ Simple factorial design may either be a 2 × 2 simple factorial
design, or it may be, say, 3 × 4 or 5 × 3 or the like type of simple factorial design. We
illustrate some simple factorial designs as under:

Illustration 1: (2 × 2 simple factorial design).
A 2 × 2 simple factorial design can graphically be depicted as follows:

2 × 2 SIMPLE FACTORIAL DESIGN

Experimental Variable

Control variables Treatment A Treatment B
Level I Cell 1 Cell 3
Level II Cell 2 Cell 4

Fig. 3.8

In this design the extraneous variable to be controlled by homogeneity is called the control
variable and the independent variable, which is manipulated, is called the experimental variable. Then
there are two treatments of the experimental variable and two levels of the control variable. As such
there are four cells into which the sample is divided. Each of the four combinations would provide
one treatment or experimental condition. Subjects are assigned at random to each treatment in the
same manner as in a randomized group design. The means for different cells may be obtained along
with the means for different rows and columns. Means of different cells represent the mean scores
for the dependent variable and the column means in the given design are termed the main effect for
treatments without taking into account any differential effect that is due to the level of the control
variable. Similarly, the row means in the said design are termed the main effects for levels without
regard to treatment. Thus, through this design we can study the main effects of treatments as well as
the main effects of levels. An additional merit of this design is that one can examine the interaction
between treatments and levels, through which one may say whether the treatment and levels are
independent of each other or they are not so. The following examples make clear the interaction
effect between treatments and levels. The data obtained in case of two (2 × 2) simple factorial
studies may be as given in Fig. 3.9.

                              STUDY I DATA
                                Training

Control              Treatment      Treatment      Row
(Intelligence)       A              B              Mean

Level I (Low)        15.5           23.3           19.4
Level II (High)      35.8           30.2           33.0

Column mean          25.6           26.7

                              STUDY II DATA
                                Training

Control              Treatment      Treatment      Row
(Intelligence)       A              B              Mean

Level I (Low)        10.4           20.6           15.5
Level II (High)      30.6           40.4           35.5

Column mean          20.5           30.5

Fig. 3.9

All the above figures (the study I data and the study II data) represent the respective means.
Graphically, these can be represented as shown in Fig. 3.10.

[Figure: mean scores of the dependent variable (say, ability) plotted against the control levels
(Intelligence: I Low, II High) for Treatments A and B in Study I and Study II; the two treatment
lines cross in Study I but run roughly parallel in Study II.]

Fig. 3.10

The graph relating to Study I indicates that there is an interaction between the treatment and the
level which, in other words, means that the treatment and the level are not independent of each other.
The graph relating to Study II shows that there is no interaction effect which means that treatment
and level in this study are relatively independent of each other.
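With two treatments and two levels, the interaction visible in the graphs can be computed directly from the cell means of Fig. 3.9 as a difference of treatment effects across levels (a plain-Python sketch):

```python
def interaction(cells):
    """cells maps (treatment, level) to a cell mean. For a 2 x 2 table the
    interaction is the treatment effect at level II minus the treatment
    effect at level I; a value near zero means treatment and level are
    (nearly) independent of each other."""
    effect_low = cells[("B", "I")] - cells[("A", "I")]
    effect_high = cells[("B", "II")] - cells[("A", "II")]
    return effect_high - effect_low

study1 = {("A", "I"): 15.5, ("B", "I"): 23.3,
          ("A", "II"): 35.8, ("B", "II"): 30.2}
study2 = {("A", "I"): 10.4, ("B", "I"): 20.6,
          ("A", "II"): 30.6, ("B", "II"): 40.4}

print(round(interaction(study1), 1))  # -13.4 : sizeable interaction
print(round(interaction(study2), 1))  # -0.4  : virtually none
```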

The 2 × 2 design need not be restricted in the manner explained above, i.e., having one
experimental variable and one control variable; it may also be of the type having two experimental
variables or two control variables. For example, a college teacher compared the effect of class
size as well as the introduction of a new instruction technique on the learning of research methodology.
For this purpose he conducted a study using a 2 × 2 simple factorial design. His design in graphic
form would be as follows:

                              Experimental Variable I
                                   (Class Size)

                                Small         Usual

Experimental Variable II   New
(Instruction technique)    Usual

Fig. 3.11

But if the teacher uses a design for comparing males and females and the senior and junior
students in the college as they relate to the knowledge of research methodology, in that case we will
have a 2 × 2 simple factorial design wherein both the variables are control variables as no manipulation
is involved in respect of both the variables.

Illustration 2: (4 × 3 simple factorial design).
The 4 × 3 simple factorial design will usually include four treatments of the experimental variable

and three levels of the control variable. Graphically it may take the following form:

4 × 3 SIMPLE FACTORIAL DESIGN

Experimental Variable

Control Treatment Treatment Treatment Treatment
Variable A B C D

Level I Cell 1 Cell 4 Cell 7 Cell 10

Level II Cell 2 Cell 5 Cell 8 Cell 11

Level III Cell 3 Cell 6 Cell 9 Cell 12

Fig. 3.12

This model of a simple factorial design includes four treatments viz., A, B, C, and D of the
experimental variable and three levels viz., I, II, and III of the control variable and has 12 different
cells as shown above. This shows that a 2 × 2 simple factorial design can be generalised to any
number of treatments and levels. Accordingly we can name it as such and such (–×–) design. In
such a design the means for the columns provide the researcher with an estimate of the main effects
for treatments and the means for rows provide an estimate of the main effects for the levels. Such a
design also enables the researcher to determine the interaction between treatments and levels.

(ii) Complex factorial designs: Experiments with more than two factors at a time involve
the use of complex factorial designs. A design which considers three or more independent
variables simultaneously is called a complex factorial design. In case of three factors with
one experimental variable having two treatments and two control variables, each one of
which having two levels, the design used will be termed 2 × 2 × 2 complex factorial design
which will contain a total of eight cells as shown below in Fig. 3.13.

2 × 2 × 2 COMPLEX FACTORIAL DESIGN

                                Experimental Variable

                         Treatment A                  Treatment B

                     Control      Control         Control      Control
                     Variable 2   Variable 2      Variable 2   Variable 2
                     Level I      Level II        Level I      Level II

Control    Level I   Cell 1       Cell 3          Cell 5       Cell 7
Variable 1
           Level II  Cell 2       Cell 4          Cell 6       Cell 8

Fig. 3.13
In Fig. 3.14 a pictorial presentation is given of the design shown above.

[Figure: a three-dimensional pictorial presentation of the 2 × 2 × 2 design, with the experimental
variable (Treatments A and B), control variable 1 (Levels I and II) and control variable 2
(Levels I and II) along its three dimensions; a dotted cell marks Cell 1.]

Fig. 3.14

The dotted line cell in the diagram corresponds to Cell 1 of the above stated 2 × 2 × 2 design and
is for Treatment A, level I of the control variable 1, and level I of the control variable 2. From this
design it is possible to determine the main effects for three variables i.e., one experimental and two
control variables. The researcher can also determine the interactions between each possible pair of
variables (such interactions are called ‘First Order interactions’) and interaction between variable
taken in triplets (such interactions are called ‘Second Order interactions’). In case of a 2 × 2 × 2
design, the following first order interactions are possible:

Experimental variable with control variable 1 (or EV × CV 1);

Experimental variable with control variable 2 (or EV × CV 2);

Control variable 1 with control variable 2 (or CV1 × CV2);

There will be one second order interaction as well in the given design (it is between all the three
variables i.e., EV × CV1 × CV2).

To determine the main effects for the experimental variable, the researcher must necessarily
compare the combined mean of data in cells 1, 2, 3 and 4 for Treatment A with the combined mean
of data in cells 5, 6, 7 and 8 for Treatment B. In this way the main effect for experimental variable,
independent of control variable 1 and variable 2, is obtained. Similarly, the main effect for control
variable 1, independent of experimental variable and control variable 2, is obtained if we compare the
combined mean of data in cells 1, 3, 5 and 7 with the combined mean of data in cells 2, 4, 6 and 8 of
our 2 × 2 × 2 factorial design. On similar lines, one can determine the main effect for the control
variable 2 independent of experimental variable and control variable 1, if the combined mean of data
in cells 1, 2, 5 and 6 are compared with the combined mean of data in cells 3, 4, 7 and 8.

To obtain the first order interaction, say, for EV × CV1 in the above stated design, the researcher
must necessarily ignore control variable 2 for which purpose he may develop 2 × 2 design from the
2 × 2 × 2 design by combining the data of the relevant cells of the latter design as shown in Fig. 3.15.

                     Experimental Variable

                     Treatment A      Treatment B

Control    Level I   Cells 1, 3       Cells 5, 7
Variable 1
           Level II  Cells 2, 4       Cells 6, 8

Fig. 3.15
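The cell-combining rules above can be sketched with hypothetical cell means (the numbers are invented for illustration; the cell numbering follows Fig. 3.13):

```python
# Hypothetical cell means for the 2 x 2 x 2 design; keys are cell numbers
# from Fig. 3.13 (cells 1-4 receive Treatment A, cells 5-8 Treatment B).
cell = {1: 12.0, 2: 14.0, 3: 11.0, 4: 15.0,
        5: 18.0, 6: 20.0, 7: 17.0, 8: 21.0}

def mean(nums):
    return sum(nums) / len(nums)

# Main effect of the experimental variable: Treatment B vs Treatment A.
ev_effect = mean([cell[i] for i in (5, 6, 7, 8)]) - mean([cell[i] for i in (1, 2, 3, 4)])

# Main effect of control variable 1: level II (cells 2,4,6,8) vs level I (1,3,5,7).
cv1_effect = mean([cell[i] for i in (2, 4, 6, 8)]) - mean([cell[i] for i in (1, 3, 5, 7)])

# Main effect of control variable 2: level II (cells 3,4,7,8) vs level I (1,2,5,6).
cv2_effect = mean([cell[i] for i in (3, 4, 7, 8)]) - mean([cell[i] for i in (1, 2, 5, 6)])

# First order interaction EV x CV1: collapse over control variable 2
# by pooling cells as in Fig. 3.15, then take a difference of differences.
collapsed = {("A", "I"): mean([cell[1], cell[3]]),
             ("A", "II"): mean([cell[2], cell[4]]),
             ("B", "I"): mean([cell[5], cell[7]]),
             ("B", "II"): mean([cell[6], cell[8]])}
ev_cv1 = ((collapsed[("B", "II")] - collapsed[("A", "II")])
          - (collapsed[("B", "I")] - collapsed[("A", "I")]))

print(ev_effect, cv1_effect, cv2_effect, ev_cv1)  # 6.0 3.0 0.0 0.0
```

With these invented means, Treatment B raises the score by 6 on average, control variable 1 contributes 3, and control variable 2 and the EV × CV1 interaction contribute nothing.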

Similarly, the researcher can determine other first order interactions. The analysis of the first
order interaction, in the manner described above, is essentially a simple factorial analysis as only two
variables are considered at a time and the remaining one is ignored. But the analysis of the second
order interaction would not ignore one of the three independent variables in case of a 2 × 2 × 2
design. The analysis would be termed as a complex factorial analysis.

It may, however, be remembered that the complex factorial design need not necessarily be of
2 × 2 × 2 type design, but can be generalised to any number and combination of experimental and
control independent variables. Of course, the greater the number of independent variables included
in a complex factorial design, the higher the order of the interaction analysis possible. But the overall
task goes on becoming more and more complicated with the inclusion of more and more independent
variables in our design.

Factorial designs are used mainly because of two advantages: (i) They provide equivalent
accuracy (as happens in the case of experiments with only one factor) with less labour and as such
are a source of economy. Using factorial designs, we can determine the main effects of two (in
simple factorial design) or more (in case of complex factorial design) factors (or variables) in one
single experiment. (ii) They permit various other comparisons of interest. For example, they give
information about such effects which cannot be obtained by treating one single factor at a time. The
determination of interaction effects is possible in case of factorial designs.

CONCLUSION

There are several research designs and the researcher must decide in advance of collection and
analysis of data as to which design would prove to be more appropriate for his research project. He
must give due weight to various points such as the type of universe and its nature, the objective of his
study, the resource list or the sampling frame, desired standard of accuracy and the like when taking
a decision in respect of the design for his research project.

Questions

1. Explain the meaning and significance of a Research design.
2. Explain the meaning of the following in context of Research design.

(a) Extraneous variables;
(b) Confounded relationship;
(c) Research hypothesis;
(d) Experimental and Control groups;
(e) Treatments.
3. Describe some of the important research designs used in experimental hypothesis-testing research
study.
4. “Research design in exploratory studies must be flexible but in descriptive studies, it must minimise bias
and maximise reliability.” Discuss.
5. Give your understanding of a good research design. Is single research design suitable in all research
studies? If not, why?
6. Explain and illustrate the following research designs:
(a) Two group simple randomized design;
(b) Latin square design;
(c) Random replications design;
(d) Simple factorial design;
(e) Informal experimental designs.
7. Write a short note on ‘Experience Survey’ explaining fully its utility in exploratory research studies.
8. What is research design? Discuss the basis of stratification to be employed in sampling public opinion
on inflation.

(Raj. Uni. EAFM M. Phil, Exam. 1978)

Appendix

Developing a Research Plan*

After identifying and defining the problem, as also accomplishing the related tasks, the researcher
must arrange his ideas in order and write them in the form of an experimental plan or what can be
described as a ‘Research Plan’. This is essential especially for a new researcher because of the following:

(a) It helps him to organize his ideas in a form whereby it will be possible for him to look for
flaws and inadequacies, if any.

(b) It provides an inventory of what must be done and which materials have to be collected as
a preliminary step.

(c) It is a document that can be given to others for comment.

Research plan must contain the following items.

1. Research objective should be clearly stated in a line or two which tells exactly what it is
that the researcher expects to do.

2. The problem to be studied by researcher must be explicitly stated so that one may know
what information is to be obtained for solving the problem.

3. Each major concept which researcher wants to measure should be defined in operational
terms in context of the research project.

4. The plan should contain the method to be used in solving the problem. An overall description
of the approach to be adopted is usually given and assumptions, if any, of the concerning
method to be used are clearly mentioned in the research plan.

5. The plan must also state the details of the techniques to be adopted. For instance, if interview
method is to be used, an account of the nature of the contemplated interview procedure
should be given. Similarly, if tests are to be given, the conditions under which they are to be
administered should be specified along with the nature of instruments to be used. If public
records are to be consulted as sources of data, the fact should be recorded in the research
plan. Procedure for quantifying data should also be written out in all details.

* Based on the matter given in the following two books:
(i) Robert M.W. Travers, An Introduction to Educational Research, p. 82–84.
(ii) C. William Emory, Business Research Methods, p. 415–416.

6. A clear mention of the population to be studied should be made. If the study happens to be
sample based, the research plan should state the sampling plan i.e., how the sample is to be
identified. The method of identifying the sample should be such that generalisation from the
sample to the original population is feasible.

7. The plan must also contain the methods to be used in processing the data. Statistical and
other methods to be used must be indicated in the plan. Such methods should not be left
until the data have been collected. This part of the plan may be reviewed by experts in the
field, for they can often suggest changes that result in substantial saving of time and effort.

8. Results of pilot test, if any, should be reported. Time and cost budgets for the research
project should also be prepared and laid down in the plan itself.

4

Sampling Design

CENSUS AND SAMPLE SURVEY

All items in any field of inquiry constitute a ‘Universe’ or ‘Population.’ A complete enumeration of
all items in the ‘population’ is known as a census inquiry. It can be presumed that in such an inquiry,
when all items are covered, no element of chance is left and highest accuracy is obtained. But in
practice this may not be true. Even the slightest element of bias in such an inquiry will get larger and
larger as the number of observations increases. Moreover, there is no way of checking the element of
bias or its extent except through a resurvey or use of sample checks. Besides, this type of inquiry
involves a great deal of time, money and energy. Therefore, when the field of inquiry is large, this
method becomes difficult to adopt because of the resources involved. At times, this method is practically
beyond the reach of ordinary researchers. Perhaps, government is the only institution which can get
the complete enumeration carried out. Even the government adopts this in very rare cases such as
population census conducted once in a decade. Further, many a time it is not possible to examine
every item in the population, and sometimes it is possible to obtain sufficiently accurate results by
studying only a part of the total population. In such cases there is no utility in census surveys.

However, it needs to be emphasised that when the universe is a small one, it is no use resorting
to a sample survey. When field studies are undertaken in practical life, considerations of time and
cost almost invariably lead to a selection of respondents i.e., selection of only a few items. The
respondents selected should be as representative of the total population as possible in order to produce
a miniature cross-section. The selected respondents constitute what is technically called a ‘sample’
and the selection process is called ‘sampling technique.’ The survey so conducted is known as
‘sample survey’. Algebraically, let the population size be N and if a part of size n (which is < N) of
this population is selected according to some rule for studying some characteristic of the population,
the group consisting of these n units is known as ‘sample’. Researcher must prepare a sample design
for his study i.e., he must plan how a sample should be selected and of what size such a sample would be.

IMPLICATIONS OF A SAMPLE DESIGN

A sample design is a definite plan for obtaining a sample from a given population. It refers to the
technique or the procedure the researcher would adopt in selecting items for the sample. Sample
design may as well lay down the number of items to be included in the sample i.e., the size of the
sample. Sample design is determined before data are collected. There are many sample designs
from which a researcher can choose. Some designs are relatively more precise and easier to apply
than others. Researcher must select/prepare a sample design which should be reliable and appropriate
for his research study.

STEPS IN SAMPLE DESIGN

While developing a sampling design, the researcher must pay attention to the following points:

(i) Type of universe: The first step in developing any sample design is to clearly define the
set of objects, technically called the Universe, to be studied. The universe can be finite or
infinite. In finite universe the number of items is certain, but in case of an infinite universe
the number of items is infinite, i.e., we cannot have any idea about the total number of
items. The population of a city, the number of workers in a factory and the like are examples
of finite universes, whereas the number of stars in the sky, listeners of a specific radio
programme, throwing of a dice etc. are examples of infinite universes.

(ii) Sampling unit: A decision has to be taken concerning a sampling unit before selecting
sample. Sampling unit may be a geographical one such as state, district, village, etc., or a
construction unit such as house, flat, etc., or it may be a social unit such as family, club,
school, etc., or it may be an individual. The researcher will have to decide one or more of
such units that he has to select for his study.

(iii) Source list: It is also known as ‘sampling frame’ from which sample is to be drawn. It
contains the names of all items of a universe (in case of finite universe only). If source list
is not available, researcher has to prepare it. Such a list should be comprehensive, correct,
reliable and appropriate. It is extremely important for the source list to be as representative
of the population as possible.

(iv) Size of sample: This refers to the number of items to be selected from the universe to
constitute a sample. This is a major problem for a researcher. The size of the sample should
neither be excessively large, nor too small. It should be optimum. An optimum sample is
one which fulfills the requirements of efficiency, representativeness, reliability and flexibility.
While deciding the size of sample, researcher must determine the desired precision as also
an acceptable confidence level for the estimate. The size of population variance needs to
be considered as in case of larger variance usually a bigger sample is needed. The size of
population must be kept in view for this also limits the sample size. The parameters of
interest in a research study must be kept in view, while deciding the size of the sample.
Costs too dictate the size of sample that we can draw. As such, budgetary constraint must
invariably be taken into consideration when we decide the sample size.
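The considerations listed above (desired precision, confidence level, population variance and population size) come together in the standard formula for the sample size needed to estimate a mean, n = (z·σ/e)². The following is a minimal sketch, with illustrative figures; the finite-population correction reflects the point that population size limits the sample size:

```python
import math

def sample_size_for_mean(sigma, error, z=1.96, population=None):
    """n = (z * sigma / error)^2, the size needed to estimate a mean
    within +/- error at the confidence level implied by z (1.96 ~ 95%)."""
    n = (z * sigma / error) ** 2
    if population is not None:
        # finite-population correction: the population size limits the sample size
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

# illustrative figures: population standard deviation 25, desired precision +/- 5
print(sample_size_for_mean(25, 5))                  # 97
print(sample_size_for_mean(25, 5, population=500))  # 81
```

Note how halving the acceptable error roughly quadruples the required sample, which is why precision must be traded off against cost.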

(v) Parameters of interest: In determining the sample design, one must consider the question
of the specific population parameters which are of interest. For instance, we may be
interested in estimating the proportion of persons with some characteristic in the population,
or we may be interested in knowing some average or the other measure concerning the
population. There may also be important sub-groups in the population about whom we
would like to make estimates. All this has a strong impact upon the sample design we
would accept.

(vi) Budgetary constraint: Cost considerations, from practical point of view, have a major
impact upon decisions relating to not only the size of the sample but also to the type of
sample. This fact can even lead to the use of a non-probability sample.

(vii) Sampling procedure: Finally, the researcher must decide the type of sample he will use
i.e., he must decide about the technique to be used in selecting the items for the sample. In
fact, this technique or procedure stands for the sample design itself. There are several
sample designs (explained in the pages that follow) out of which the researcher must
choose one for his study. Obviously, he must select that design which, for a given sample
size and for a given cost, has a smaller sampling error.

CRITERIA OF SELECTING A SAMPLING PROCEDURE

In this context one must remember that two costs are involved in a sampling analysis viz., the cost of
collecting the data and the cost of an incorrect inference resulting from the data. Researcher must
keep in view the two causes of incorrect inferences viz., systematic bias and sampling error. A
systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated
by increasing the sample size. At best the causes responsible for these errors can be detected and
corrected. Usually a systematic bias is the result of one or more of the following factors:

1. Inappropriate sampling frame: If the sampling frame is inappropriate i.e., a biased representation
of the universe, it will result in a systematic bias.

2. Defective measuring device: If the measuring device is constantly in error, it will result in
systematic bias. In survey work, systematic bias can result if the questionnaire or the interviewer is
biased. Similarly, if the physical measuring device is defective there will be systematic bias in the
data collected through such a measuring device.

3. Non-respondents: If we are unable to sample all the individuals initially included in the sample,
there may arise a systematic bias. The reason is that in such a situation the likelihood of establishing
contact or receiving a response from an individual is often correlated with the measure of what is to
be estimated.

4. Indeterminacy principle: Sometimes we find that individuals act differently when kept under
observation than what they do when kept in non-observed situations. For instance, if workers are
aware that somebody is observing them in course of a work study on the basis of which the average
length of time to complete a task will be determined and accordingly the quota will be set for piece
work, they generally tend to work slowly in comparison to the speed with which they work if kept
unobserved. Thus, the indeterminacy principle may also be a cause of a systematic bias.

5. Natural bias in the reporting of data: Natural bias of respondents in the reporting of data is
often the cause of a systematic bias in many inquiries. There is usually a downward bias in the
income data collected by government taxation department, whereas we find an upward bias in the
income data collected by some social organisation. People in general understate their incomes if
asked about it for tax purposes, but they overstate the same if asked for social status or their affluence.
Generally in psychological surveys, people tend to give what they think is the ‘correct’ answer rather
than revealing their true feelings.

Sampling errors are the random variations in the sample estimates around the true population
parameters. Since they occur randomly and are equally likely to be in either direction, their nature
happens to be of compensatory type and the expected value of such errors happens to be equal to
zero. Sampling error decreases with the increase in the size of the sample, and it happens to be of a
smaller magnitude in case of homogeneous population.
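The compensatory character of sampling errors can be seen in a small simulation: repeated sample means scatter around the true population mean, and their spread shrinks as the sample size grows. The population and figures below are artificial, chosen only to illustrate the point:

```python
import random
import statistics

random.seed(42)
# an artificial population with mean 50 and standard deviation 10
population = [random.gauss(50, 10) for _ in range(10000)]

def spread_of_sample_means(n, trials=2000):
    """Standard deviation of the sample means over many repeated samples of size n."""
    means = [statistics.mean(random.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)

# the sampling error shrinks (roughly as 1/sqrt(n)) as the sample size grows
for n in (10, 40, 160):
    print(n, round(spread_of_sample_means(n), 2))
```

The printed spreads fall by about half each time the sample size is quadrupled, matching the 1/√n behaviour of the sampling error.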

Sampling error can be measured for a given sample design and size. The measurement of
sampling error is usually called the ‘precision of the sampling plan’. If we increase the sample size,
the precision can be improved. But increasing the size of the sample has its own limitations viz., a
large sized sample increases the cost of collecting data and also enhances the systematic bias. Thus
the effective way to increase precision is usually to select a better sampling design which has a
smaller sampling error for a given sample size at a given cost. In practice, however, people prefer a
less precise design because it is easier to adopt the same and also because of the fact that systematic
bias can be controlled in a better way in such a design.

In brief, while selecting a sampling procedure, researcher must ensure that the procedure
causes a relatively small sampling error and helps to control the systematic bias in a better
way.

CHARACTERISTICS OF A GOOD SAMPLE DESIGN

From what has been stated above, we can list down the characteristics of a good sample design as
under:

(a) Sample design must result in a truly representative sample.
(b) Sample design must be such which results in a small sampling error.
(c) Sample design must be viable in the context of funds available for the research study.
(d) Sample design must be such so that systematic bias can be controlled in a better way.
(e) Sample should be such that the results of the sample study can be applied, in general, for

the universe with a reasonable level of confidence.

DIFFERENT TYPES OF SAMPLE DESIGNS

There are different types of sample designs based on two factors viz., the representation basis and
the element selection technique. On the representation basis, the sample may be probability sampling
or it may be non-probability sampling. Probability sampling is based on the concept of random selection,
whereas non-probability sampling is ‘non-random’ sampling. On element selection basis, the sample
may be either unrestricted or restricted. When each sample element is drawn individually from the
population at large, then the sample so drawn is known as ‘unrestricted sample’, whereas all other
forms of sampling are covered under the term ‘restricted sampling’. The following chart exhibits the
sample designs as explained above.

Thus, sample designs are basically of two types viz., non-probability sampling and probability
sampling. We take up these two designs separately.


CHART SHOWING BASIC SAMPLING DESIGNS

Element selection        Representation basis
technique           ---------------------------------------------------------
                    Probability sampling           Non-probability sampling

Unrestricted        Simple random sampling         Haphazard sampling or
sampling                                           convenience sampling

Restricted          Complex random sampling        Purposive sampling (such as
sampling            (such as cluster sampling,     quota sampling, judgement
                    systematic sampling,           sampling)
                    stratified sampling etc.)

Fig. 4.1

Non-probability sampling: Non-probability sampling is that sampling procedure which does
not afford any basis for estimating the probability that each item in the population has of being
included in the sample. Non-probability sampling is also known by different names such as deliberate
sampling, purposive sampling and judgement sampling. In this type of sampling, items for the sample
are selected deliberately by the researcher; his choice concerning the items remains supreme. In
other words, under non-probability sampling the organisers of the inquiry purposively choose the
particular units of the universe for constituting a sample on the basis that the small mass that they so
select out of a huge one will be typical or representative of the whole. For instance, if economic
conditions of people living in a state are to be studied, a few towns and villages may be purposively
selected for intensive study on the principle that they can be representative of the entire state. Thus,
the judgement of the organisers of the study plays an important part in this sampling design.

In such a design, personal element has a great chance of entering into the selection of the
sample. The investigator may select a sample which shall yield results favourable to his point of view
and if that happens, the entire inquiry may get vitiated. Thus, there is always the danger of bias
entering into this type of sampling technique. But if the investigators are impartial, work without bias
and have the necessary experience to exercise sound judgement, the results obtained from an
analysis of a deliberately selected sample may be tolerably reliable. However, in such sampling, there
is no assurance that every element has some specifiable chance of being included. Sampling error in
this type of sampling cannot be estimated and the element of bias, great or small, is always there. As
such this sampling design is rarely adopted in large inquiries of importance. However, in small inquiries
and researches by individuals, this design may be adopted because of the relative advantage of time
and money inherent in this method of sampling. Quota sampling is also an example of non-probability
sampling. Under quota sampling the interviewers are simply given quotas to be filled from the different
strata, with some restrictions on how they are to be filled. In other words, the actual selection of the
items for the sample is left to the interviewer’s discretion. This type of sampling is very convenient
and is relatively inexpensive. But the samples so selected certainly do not possess the characteristic
of random samples. Quota samples are essentially judgement samples and inferences drawn on their
basis are not amenable to statistical treatment in a formal way.

Probability sampling: Probability sampling is also known as ‘random sampling’ or ‘chance
sampling’. Under this sampling design, every item of the universe has an equal chance of inclusion in
the sample. It is, so to say, a lottery method in which individual units are picked up from the whole
group not deliberately but by some mechanical process. Here it is blind chance alone that determines
whether one item or the other is selected. The results obtained from probability or random sampling
can be assured in terms of probability i.e., we can measure the errors of estimation or the significance
of results obtained from a random sample, and this fact brings out the superiority of random sampling
design over the deliberate sampling design. Random sampling ensures the law of Statistical Regularity
which states that if on an average the sample chosen is a random one, the sample will have the same
composition and characteristics as the universe. This is the reason why random sampling is considered
as the best technique of selecting a representative sample.

Random sampling from a finite population refers to that method of sample selection which gives
each possible sample combination an equal probability of being picked up and each item in the entire
population to have an equal chance of being included in the sample. This applies to sampling without
replacement i.e., once an item is selected for the sample, it cannot appear in the sample again
(Sampling with replacement is used less frequently in which procedure the element selected for the
sample is returned to the population before the next element is selected. In such a situation the same
element could appear twice in the same sample before the second element is chosen). In brief, the
implications of random sampling (or simple random sampling) are:

(a) It gives each element in the population an equal probability of getting into the sample; and
all choices are independent of one another.

(b) It gives each possible sample combination an equal probability of being chosen.

Keeping this in view we can define a simple random sample (or simply a random sample) from
a finite population as a sample which is chosen in such a way that each of the NCn possible samples
has the same probability, 1/NCn, of being selected. To make it more clear we take a certain finite
population consisting of six elements (say a, b, c, d, e, f) i.e., N = 6. Suppose that we want to take a
sample of size n = 3 from it. Then there are 6C3 = 20 possible distinct samples of the required size,
and they consist of the elements abc, abd, abe, abf, acd, ace, acf, ade, adf, aef, bcd, bce, bcf, bde,
bdf, bef, cde, cdf, cef, and def. If we choose one of these samples in such a way that each has the
probability 1/20 of being chosen, we will then call this a random sample.
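This definition can be checked directly by enumerating all 6C3 possible samples with a short Python sketch (the element names are those used in the text):

```python
import random
from itertools import combinations

population = ['a', 'b', 'c', 'd', 'e', 'f']    # a finite population, N = 6
samples = list(combinations(population, 3))    # all 6C3 distinct samples of size n = 3
print(len(samples))                            # 20

# choosing one of the 20 with equal probability 1/20 yields a simple random sample
random.seed(1)
print(''.join(random.choice(samples)))
```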

HOW TO SELECT A RANDOM SAMPLE ?

With regard to the question of how to take a random sample in actual practice, we could, in simple
cases like the one above, write each of the possible samples on a slip of paper, mix these slips
thoroughly in a container and then draw as a lottery either blindfolded or by rotating a drum or by any
other similar device. Such a procedure is obviously impractical, if not altogether impossible in complex
problems of sampling. In fact, the practical utility of such a method is very much limited.

Fortunately, we can take a random sample in a relatively easier way without taking the trouble of
enlisting all possible samples on paper-slips as explained above. Instead of this, we can write the
name of each element of a finite population on a slip of paper, put the slips of paper so prepared into
a box or a bag and mix them thoroughly and then draw (without looking) the required number of slips
for the sample one after the other without replacement. In doing so we must make sure that in
successive drawings each of the remaining elements of the population has the same chance of being
selected. This procedure will also result in the same probability for each possible sample. We can
verify this by taking the above example. Since we have a finite population of 6 elements and we want
to select a sample of size 3, the probability of drawing any one element for our sample in the first
draw is 3/6, the probability of drawing one more element in the second draw is 2/5, (the first element
drawn is not replaced) and similarly the probability of drawing one more element in the third draw is
1/4. Since these draws are independent, the joint probability of the three elements which constitute
our sample is the product of their individual probabilities and this works out to 3/6 × 2/5 × 1/4 = 1/20.
This verifies our earlier calculation.

Even this relatively easy method of obtaining a random sample can be simplified in actual practice
by the use of random number tables. Various statisticians like Tippett, Yates, Fisher have prepared
tables of random numbers which can be used for selecting a random sample. Generally, Tippett’s
random number tables are used for the purpose. Tippett gave 10,400 four-figure numbers. He selected
41,600 digits from the census reports and combined them into fours to give his random numbers
which may be used to obtain a random sample.

We can illustrate the procedure by an example. First of all we reproduce the first thirty sets of
Tippett’s numbers:

2952 6641 3992 9792 7979 5911

3170 5624 4167 9525 1545 1396

7203 5356 1300 2693 2370 7483

3408 2769 3563 6107 6913 7691

0560 5246 1112 9025 6008 8126

Suppose we are interested in taking a sample of 10 units from a population of 5000 units, bearing
numbers from 3001 to 8000. We shall select 10 such figures from the above random numbers which
are not less than 3001 and not greater than 8000. If we randomly decide to read the table numbers
from left to right, starting from the first row itself, we obtain the following numbers: 6641, 3992, 7979,
5911, 3170, 5624, 4167, 7203, 5356, and 7483.

The units bearing the above serial numbers would then constitute our required random sample.
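The same selection can be mechanised: reading the thirty numbers reproduced above from left to right, keep the first ten that fall between 3001 and 8000. A sketch (0560 is written as 560 since its leading zero carries no numeric value):

```python
tippett = [2952, 6641, 3992, 9792, 7979, 5911,
           3170, 5624, 4167, 9525, 1545, 1396,
           7203, 5356, 1300, 2693, 2370, 7483,
           3408, 2769, 3563, 6107, 6913, 7691,
           560, 5246, 1112, 9025, 6008, 8126]

# keep, in reading order, the first 10 numbers lying between 3001 and 8000
selected = [x for x in tippett if 3001 <= x <= 8000][:10]
print(selected)   # [6641, 3992, 7979, 5911, 3170, 5624, 4167, 7203, 5356, 7483]
```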

One may note that it is easy to draw random samples from finite populations with the aid of
random number tables only when lists are available and items are readily numbered. But in some
situations it is often impossible to proceed in the way we have narrated above. For example, if we
want to estimate the mean height of trees in a forest, it would not be possible to number the trees, and
choose random numbers to select a random sample. In such situations what we should do is to select
some trees for the sample haphazardly without aim or purpose, and should treat the sample as a
random sample for study purposes.

RANDOM SAMPLE FROM AN INFINITE UNIVERSE

So far we have talked about random sampling, keeping in view only the finite populations. But what
about random sampling in context of infinite populations? It is relatively difficult to explain the concept
of random sample from an infinite population. However, a few examples will show the basic
characteristic of such a sample. Suppose we consider the 20 throws of a fair dice as a sample from
the hypothetically infinite population which consists of the results of all possible throws of the dice. If
the probability of getting a particular number, say 1, is the same for each throw and the 20 throws are
all independent, then we say that the sample is random. Similarly, it would be said to be sampling from
an infinite population if we sample with replacement from a finite population and our sample would
be considered as a random sample if in each draw all elements of the population have the same
probability of being selected and successive draws happen to be independent. In brief, one can say
that the selection of each item in a random sample from an infinite population is controlled by the
same probabilities and that successive selections are independent of one another.
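The dice example translates directly into code: each throw is generated by the same probabilities and independently of every other throw. A minimal sketch:

```python
import random

random.seed(7)
# 20 independent throws of a fair die, viewed as a random sample from the
# hypothetically infinite population of all possible throws
sample = [random.randint(1, 6) for _ in range(20)]
print(sample)
```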

COMPLEX RANDOM SAMPLING DESIGNS

Probability sampling under restricted sampling techniques, as stated above, may result in complex
random sampling designs. Such designs may as well be called ‘mixed sampling designs’ for many of
such designs may represent a combination of probability and non-probability sampling procedures in
selecting a sample. Some of the popular complex random sampling designs are as follows:

(i) Systematic sampling: In some instances, the most practical way of sampling is to select every
ith item on a list. Sampling of this type is known as systematic sampling. An element of randomness
is introduced into this kind of sampling by using random numbers to pick up the unit with which to
start. For instance, if a 4 per cent sample is desired, the first item would be selected randomly from
the first twenty-five and thereafter every 25th item would automatically be included in the sample.
Thus, in systematic sampling only the first unit is selected randomly and the remaining units of the
sample are selected at fixed intervals. Although a systematic sample is not a random sample in the
strict sense of the term, it is often considered reasonable to treat a systematic sample as if it were
a random sample.
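The procedure can be sketched in Python: only the starting unit is chosen randomly, and every ith unit thereafter is taken automatically (the population of 500 numbered units is illustrative):

```python
import random

def systematic_sample(items, interval):
    """Pick a random start within the first interval, then every interval-th item."""
    start = random.randrange(interval)
    return items[start::interval]

random.seed(3)
population = list(range(1, 501))              # 500 numbered units
sample = systematic_sample(population, 25)    # a 4 per cent sample
print(len(sample))                            # 20 units, evenly spread over the list
```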

Systematic sampling has certain plus points. It can be taken as an improvement over a simple
random sample in as much as the systematic sample is spread more evenly over the entire population.
It is an easier and less costly method of sampling and can be conveniently used even in case of
large populations. But there are certain dangers too in using this type of sampling. If there is a hidden
periodicity in the population, systematic sampling will prove to be an inefficient method of sampling.
For instance, suppose every 25th item produced by a certain production process is defective. If we are to
select a 4% sample of the items of this process in a systematic manner, we would either get all
defective items or all good items in our sample depending upon the random starting position. If all
elements of the universe are ordered in a manner representative of the total population, i.e., the
population list is in random order, systematic sampling is considered equivalent to random sampling.
But if this is not so, then the results of such sampling may, at times, not be very reliable. In practice,
systematic sampling is used when lists of population are available and they are of considerable
length.
(ii) Stratified sampling: If a population from which a sample is to be drawn does not constitute a
homogeneous group, stratified sampling technique is generally applied in order to obtain a representative
sample. Under stratified sampling the population is divided into several sub-populations that are
individually more homogeneous than the total population (the different sub-populations are called
‘strata’) and then we select items from each stratum to constitute a sample. Since each stratum is
more homogeneous than the total population, we are able to get more precise estimates for each
stratum and by estimating more accurately each of the component parts, we get a better estimate of
the whole. In brief, stratified sampling results in more reliable and detailed information.

The following three questions are highly relevant in the context of stratified sampling:

(a) How to form strata?

(b) How should items be selected from each stratum?

(c) How many items should be selected from each stratum, i.e., how should the sample size be
allocated among the strata?

Regarding the first question, we can say that the strata should be formed on the basis of common
characteristic(s) of the items to be put in each stratum. This means that the various strata should be
formed in such a way as to ensure that elements are most homogeneous within each stratum and most
heterogeneous between the different strata. Thus, strata are purposively formed and are usually
based on past experience and personal judgement of the researcher. One should always remember
that careful consideration of the relationship between the characteristics of the population and the
characteristics to be estimated is normally used to define the strata. At times, a pilot study may be
conducted for determining a more appropriate and efficient stratification plan: by taking small samples
of equal size from each of the proposed strata and then examining the variances within and among
the possible stratifications, we can decide an appropriate stratification plan for our inquiry.

In respect of the second question, we can say that the usual method resorted to for selecting items
for the sample from each stratum is that of simple random sampling. Systematic sampling can
be used if it is considered more appropriate in certain situations.

Regarding the third question, we usually follow the method of proportional allocation under which
the sizes of the samples from the different strata are kept proportional to the sizes of the strata. That
is, if Pi represents the proportion of population included in stratum i, and n represents the total sample
size, the number of elements selected from stratum i is n · Pi. To illustrate it, let us suppose that we
want a sample of size n = 30 to be drawn from a population of size N = 8000 which is divided into
three strata of sizes N1 = 4000, N2 = 2400 and N3 = 1600. Adopting proportional allocation, we shall
get the sample sizes as under for the different strata:

For the stratum with N1 = 4000, we have P1 = 4000/8000
and hence n1 = n · P1 = 30 (4000/8000) = 15.

Similarly, for the stratum with N2 = 2400, we have
n2 = n · P2 = 30 (2400/8000) = 9, and

for the stratum with N3 = 1600, we have
n3 = n · P3 = 30 (1600/8000) = 6.
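The proportional rule n_i = n · P_i is easily checked in code; the figures below are those of the example just worked:

```python
def proportional_allocation(n, strata_sizes):
    """Sample size for each stratum: n_i = n * N_i / N."""
    N = sum(strata_sizes)
    return [round(n * Ni / N) for Ni in strata_sizes]

# n = 30 drawn from strata of sizes 4000, 2400 and 1600 (N = 8000)
print(proportional_allocation(30, [4000, 2400, 1600]))   # [15, 9, 6]
```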

Thus, using proportional allocation, the sample sizes for different strata are 15, 9 and 6 respectively
which is in proportion to the sizes of the strata viz., 4000 : 2400 : 1600. Proportional allocation is
considered most efficient and an optimal design when the cost of selecting an item is equal for each
stratum, there is no difference in within-stratum variances, and the purpose of sampling happens to
be to estimate the population value of some characteristic. But in case the purpose happens to be to
compare the differences among the strata, then equal sample selection from each stratum would be
more efficient even if the strata differ in sizes. In cases where strata differ not only in size but also
in variability and it is considered reasonable to take larger samples from the more variable strata and
smaller samples from the less variable strata, we can then account for both (differences in stratum
size and differences in stratum variability) by using disproportionate sampling design by requiring:

n1/N1σ1 = n2/N2σ2 = … = nk/Nkσk

where σ1, σ2 , ... and σk denote the standard deviations of the k strata, N1, N2,…, Nk denote the
sizes of the k strata and n1, n2,…, nk denote the sample sizes of k strata. This is called ‘optimum
allocation’ in the context of disproportionate sampling. The allocation in such a situation results in

the following formula for determining the sample sizes for different strata:

ni = n · Niσi / (N1σ1 + N2σ2 + … + Nkσk)   for i = 1, 2, … , k.

We may illustrate the use of this by an example.

Illustration 1
A population is divided into three strata so that N1 = 5000, N2 = 2000 and N3 = 3000. Respective
standard deviations are:

σ1 = 15, σ2 = 18 and σ3 = 5.

How should a sample of size n = 84 be allocated to the three strata, if we want optimum allocation
using disproportionate sampling design?

Solution: Using the disproportionate sampling design for optimum allocation, the sample sizes for
different strata will be determined as under:

Sample size for the stratum with N1 = 5000:

n1 = 84 (5000) (15) / [(5000) (15) + (2000) (18) + (3000) (5)]
   = 6300000/126000 = 50

Sample size for the stratum with N2 = 2000:

n2 = 84 (2000) (18) / [(5000) (15) + (2000) (18) + (3000) (5)]
   = 3024000/126000 = 24

Sample size for the stratum with N3 = 3000:

n3 = 84 (3000) (5) / [(5000) (15) + (2000) (18) + (3000) (5)]
   = 1260000/126000 = 10
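The arithmetic of the illustration can be verified with a short function implementing the optimum-allocation formula ni = n · Niσi / (N1σ1 + … + Nkσk):

```python
def optimum_allocation(n, sizes, sds):
    """Optimum (disproportionate) allocation: n_i = n * N_i*sigma_i / sum(N_k*sigma_k)."""
    weights = [N * s for N, s in zip(sizes, sds)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

# the figures of Illustration 1
print(optimum_allocation(84, [5000, 2000, 3000], [15, 18, 5]))   # [50, 24, 10]
```

The most variable stratum (σ2 = 18) receives proportionally more units than its share of the population, which is exactly the purpose of disproportionate sampling.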

If, in addition to differences in stratum size and differences in stratum variability, we also have
differences in stratum sampling cost, then we can have a cost-optimal disproportionate sampling design
by requiring

n1/(N1σ1/√C1) = n2/(N2σ2/√C2) = … = nk/(Nkσk/√Ck)

where

C1 = Cost of sampling in stratum 1
C2 = Cost of sampling in stratum 2
Ck = Cost of sampling in stratum k
and all other terms remain the same as explained earlier. The allocation in such a situation results in
the following formula for determining the sample sizes for different strata:

ni = n · (Niσi/√Ci) / (N1σ1/√C1 + N2σ2/√C2 + … + Nkσk/√Ck)   for i = 1, 2, ..., k
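As a sketch, the cost-optimal rule weights each stratum by Niσi/√Ci. The per-unit costs of 9, 4 and 1 below are hypothetical additions to the data of Illustration 1, used only to show the effect:

```python
import math

def cost_optimal_allocation(n, sizes, sds, costs):
    """Allocate n in proportion to N_i * sigma_i / sqrt(C_i)."""
    weights = [N * s / math.sqrt(c) for N, s, c in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Illustration 1 data plus hypothetical per-unit costs of 9, 4 and 1
alloc = cost_optimal_allocation(84, [5000, 2000, 3000], [15, 18, 5], [9, 4, 1])
print([round(a) for a in alloc])   # [36, 26, 22]
```

Compared with the plain optimum allocation (50, 24, 10), the costly first stratum now gets fewer units and the cheap third stratum gets more.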

It is not necessary that stratification be done keeping in view a single characteristic. Populations
are often stratified according to several characteristics. For example, in a system-wide survey designed
to determine the attitude of students toward a new teaching plan, a state college system with 20
colleges might stratify the students with respect to class, sex and college. Stratification of this type is
known as cross-stratification, and up to a point such stratification increases the reliability of estimates
and is much used in opinion surveys.

From what has been stated above in respect of stratified sampling, we can say that the sample so
constituted is the result of successive application of purposive (involved in stratification of items) and
random sampling methods. As such it is an example of mixed sampling. The procedure wherein we
first have stratification and then simple random sampling is known as stratified random sampling.

(iii) Cluster sampling: If the total area of interest happens to be a big one, a convenient way in
which a sample can be taken is to divide the area into a number of smaller non-overlapping areas and
then to randomly select a number of these smaller areas (usually called clusters), with the ultimate
sample consisting of all (or samples of) units in these small areas or clusters.

Thus in cluster sampling the total population is divided into a number of relatively small subdivisions
which are themselves clusters of still smaller units and then some of these clusters are randomly
selected for inclusion in the overall sample. Suppose we want to estimate the proportion of machine-
parts in an inventory which are defective. Also assume that there are 20000 machine parts in the
inventory at a given point of time, stored in 400 cases of 50 each. Now using a cluster sampling, we
would consider the 400 cases as clusters and randomly select ‘n’ cases and examine all the machine-
parts in each randomly selected case.
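The inventory example can be sketched as follows: the 400 cases act as clusters, and every part in each randomly chosen case enters the sample (the number of cases selected is illustrative):

```python
import random

random.seed(5)
# 20000 machine parts stored in 400 cases of 50 each; the cases are the clusters
cases = [[(case, item) for item in range(50)] for case in range(400)]

chosen_cases = random.sample(cases, 8)   # randomly select n = 8 cases
# the ultimate sample consists of all parts in the selected cases
sample = [part for case in chosen_cases for part in case]
print(len(sample))   # 8 cases x 50 parts = 400 parts examined
```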

Cluster sampling, no doubt, reduces cost by concentrating surveys in selected clusters. But
certainly it is less precise than random sampling. There is also not as much information in ‘n’
observations within a cluster as there happens to be in ‘n’ randomly drawn observations. Cluster
sampling is used only because of the economic advantage it possesses; estimates based on cluster
samples are usually more reliable per unit cost.

(iv) Area sampling: If clusters happen to be some geographic subdivisions, in that case cluster
sampling is better known as area sampling. In other words, cluster designs, where the primary
sampling unit represents a cluster of units based on geographic area, are distinguished as area sampling.
The plus and minus points of cluster sampling are also applicable to area sampling.

(v) Multi-stage sampling: Multi-stage sampling is a further development of the principle of cluster
sampling. Suppose we want to investigate the working efficiency of nationalised banks in India and
we want to take a sample of a few banks for this purpose. The first stage is to select large primary

66 Research Methodology

sampling units such as states in a country. Then we may select certain districts and interview all banks
in the chosen districts. This would represent a two-stage sampling design with the ultimate sampling
units being clusters of banks.

If, instead of taking a census of all banks within the selected districts, we select certain towns and
interview all banks in the chosen towns, this represents a three-stage sampling design. If, instead of
taking a census of all banks within the selected towns, we randomly sample banks from each selected
town, then it is a case of using a four-stage sampling plan. If we select randomly at all stages, we will
have what is known as a ‘multi-stage random sampling design’.

Ordinarily multi-stage sampling is applied in big inquiries extending over a considerably large
geographical area, say, the entire country. There are two advantages of this sampling design viz.,
(a) it is easier to administer than most single-stage designs, mainly because the sampling
frame under multi-stage sampling is developed in partial units; (b) a large number of units can be
sampled for a given cost under multi-stage sampling because of sequential clustering, whereas this is
not possible in most of the simple designs.
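
The bank example above can be sketched as a four-stage random sampling plan; the frame below (the names of states, districts, towns and banks, and the counts at each stage) is entirely hypothetical:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# hypothetical frame: 5 states, 3 districts per state, 3 towns per district, 4 banks per town
frame = {
    f"state-{s}": {
        f"district-{s}-{d}": {
            f"town-{s}-{d}-{t}": [f"bank-{s}-{d}-{t}-{b}" for b in range(4)]
            for t in range(3)
        }
        for d in range(3)
    }
    for s in range(5)
}

# select randomly at every stage: states -> districts -> towns -> banks
banks = []
for state in random.sample(sorted(frame), 2):
    for district in random.sample(sorted(frame[state]), 2):
        for town in random.sample(sorted(frame[state][district]), 2):
            banks += random.sample(frame[state][district][town], 2)

print(len(banks))  # 2 x 2 x 2 x 2 = 16 banks in the ultimate sample
```

Note that a detailed sampling frame is needed only within the units actually selected at each stage, which is why the design is easier to administer.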
(vi) Sampling with probability proportional to size: In case the cluster sampling units do not
have the same number or approximately the same number of elements, it is considered appropriate to
use a random selection process where the probability of each cluster being included in the sample is
proportional to the size of the cluster. For this purpose, we have to list the number of elements in each
cluster irrespective of the method of ordering the cluster. Then we must sample systematically the
appropriate number of elements from the cumulative totals. The actual numbers selected in this way
do not refer to individual elements, but indicate which clusters and how many from the cluster are to
be selected by simple random sampling or by systematic sampling. The results of this type of sampling
are equivalent to those of a simple random sample and the method is less cumbersome and is also
relatively less expensive. We can illustrate this with the help of an example.

Illustration 2
The following are the number of departmental stores in 15 cities: 35, 17, 10, 32, 70, 28, 26, 19, 26,
66, 37, 44, 33, 29 and 28. If we want to select a sample of 10 stores, using cities as clusters and
selecting within clusters proportional to size, how many stores from each city should be chosen?
(Use a starting point of 10.)

Solution: Let us put the information as under (Table 4.1):
Since in the given problem we have 500 departmental stores from which we have to select a
sample of 10 stores, the appropriate sampling interval is 50. As we have to use the starting point of
10*, we add successive increments of 50 till 10 numbers have been selected. The numbers thus
obtained are: 10, 60, 110, 160, 210, 260, 310, 360, 410 and 460, which have been shown in the last
column of the table (Table 4.1) against the corresponding cumulative totals. From this we can say that
two stores should be selected randomly from city number five and one each from city numbers 1, 3, 7,
9, 10, 11, 12 and 14. This sample of 10 stores is the sample with probability proportional to size.

*If the starting point is not mentioned, then the same can randomly be selected.

Table 4.1

City number    No. of departmental stores    Cumulative total    Sample
     1                    35                        35              10
     2                    17                        52
     3                    10                        62              60
     4                    32                        94
     5                    70                       164          110, 160
     6                    28                       192
     7                    26                       218             210
     8                    19                       237
     9                    26                       263             260
    10                    66                       329             310
    11                    37                       366             360
    12                    44                       410             410
    13                    33                       443
    14                    29                       472             460
    15                    28                       500
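
The systematic selection against cumulative totals shown in Table 4.1 can be reproduced programmatically (Python is used here purely for illustration):

```python
import bisect
from collections import Counter
from itertools import accumulate

stores = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]
cum = list(accumulate(stores))        # cumulative totals: 35, 52, 62, ..., 500
n, start = 10, 10                     # sample size and given starting point
interval = cum[-1] // n               # 500 / 10 = 50
points = [start + interval * k for k in range(n)]   # 10, 60, 110, ..., 460

# each point belongs to the first city whose cumulative total reaches it
chosen = Counter(bisect.bisect_left(cum, p) + 1 for p in points)  # 1-based city numbers
print(sorted(chosen.items()))
# [(1, 1), (3, 1), (5, 2), (7, 1), (9, 1), (10, 1), (11, 1), (12, 1), (14, 1)]
```

The counts agree with the solution above: two stores from city 5 and one each from cities 1, 3, 7, 9, 10, 11, 12 and 14; the individual stores would then be drawn within each chosen city by simple random or systematic sampling.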

(vii) Sequential sampling: This is a somewhat complex sample design. The ultimate
size of the sample under this technique is not fixed in advance, but is determined according to
mathematical decision rules on the basis of information yielded as the survey progresses. This design is
usually adopted under acceptance sampling plans in the context of statistical quality control. When a particular
lot is to be accepted or rejected on the basis of a single sample, it is known as single sampling; when
the decision is to be taken on the basis of two samples, it is known as double sampling and in case the
decision rests on the basis of more than two samples but the number of samples is certain and
decided in advance, the sampling is known as multiple sampling. But when the number of samples is
more than two but it is neither certain nor decided in advance, this type of system is often referred to
as sequential sampling. Thus, in brief, we can say that in sequential sampling, one can go on taking
samples one after another as long as one desires to do so.
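
One classical family of such mathematical decision rules (not detailed in the text) is Wald's sequential probability ratio test; the sketch below applies it to lot acceptance, with the defect rates and risk levels chosen purely for illustration:

```python
import math

def sequential_plan(items, p0=0.02, p1=0.10, alpha=0.05, beta=0.05):
    """Sequentially inspect items (1 = defective) until a decision is reached.
    p0: acceptable defect rate; p1: rejectable defect rate (illustrative values)."""
    upper = math.log((1 - beta) / alpha)   # crossing this boundary rejects the lot
    lower = math.log(beta / (1 - alpha))   # crossing this boundary accepts the lot
    llr = 0.0                              # running log-likelihood ratio
    for n, defective in enumerate(items, start=1):
        llr += math.log(p1 / p0) if defective else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject", n
        if llr <= lower:
            return "accept", n
    return "undecided", len(items)         # sample exhausted without a decision

print(sequential_plan([0] * 100))  # a run of good items soon accepts the lot
print(sequential_plan([1] * 50))   # a run of defectives quickly rejects it
```

Note that the sample size is not fixed in advance: the loop stops as soon as the accumulated evidence crosses either boundary.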

CONCLUSION

From a brief description of the various sample designs presented above, we can say that normally
one should resort to simple random sampling because under it bias is generally eliminated and the
sampling error can be estimated. But purposive sampling is considered more appropriate when the
universe happens to be small and a known characteristic of it is to be studied intensively. There are
situations in real life under which sample designs other than simple random samples may be considered
better (say easier to obtain, cheaper or more informative) and as such the same may be used. In a
situation when random sampling is not possible, we necessarily have to use a sampling design
other than random sampling. At times, several methods of sampling may well be used in the same
study.


Questions

1. What do you mean by ‘Sample Design’? What points should be taken into consideration by a researcher
in developing a sample design for his research project?

2. How would you differentiate between simple random sampling and complex random sampling designs?
Explain clearly giving examples.

3. Why is probability sampling generally preferred in comparison to non-probability sampling? Explain the
procedure of selecting a simple random sample.

4. Under what circumstances is a stratified random sampling design considered appropriate? How would
you select such a sample? Explain by means of an example.

5. Distinguish between:

(a) Restricted and unrestricted sampling;

(b) Convenience and purposive sampling;

(c) Systematic and stratified sampling;

(d) Cluster and area sampling.

6. Under what circumstances would you recommend:

(a) A probability sample?

(b) A non-probability sample?

(c) A stratified sample?

(d) A cluster sample?

7. Explain and illustrate the procedure of selecting a random sample.

8. “A systematic bias results from errors in the sampling procedures”. What do you mean by such a
systematic bias? Describe the important causes responsible for such a bias.

9. (a) The following are the number of departmental stores in 10 cities: 35, 27, 24, 32, 42, 30, 34, 40, 29 and 38.
If we want to select a sample of 15 stores using cities as clusters and selecting within clusters proportional
to size, how many stores from each city should be chosen? (Use a starting point of 4).

(b) What sampling design might be used to estimate the weight of a group of men and women?

10. A certain population is divided into five strata so that N1 = 2000, N2 = 2000, N3 = 1800, N4 = 1700 and
N5 = 2500. Respective standard deviations are: σ1 = 1.6, σ2 = 2.0, σ3 = 4.4, σ4 = 4.8, σ5 = 6.0,

and further the expected sampling cost in the first two strata is Rs 4 per interview and in the remaining
three strata the sampling cost is Rs 6 per interview. How should a sample of size n = 226 be allocated to
the five strata if we adopt proportionate sampling design, and if we adopt disproportionate sampling design
considering (i) only the differences in stratum variability and (ii) the differences in stratum variability as
well as the differences in stratum sampling costs?

Measurement and Scaling Techniques 69

5

Measurement and Scaling Techniques

MEASUREMENT IN RESEARCH

In our daily life we are said to measure when we use some yardstick to determine weight, height, or
some other feature of a physical object. We also measure when we judge how well we like a song,
a painting or the personalities of our friends. We, thus, measure physical objects as well as abstract
concepts. Measurement is a relatively complex and demanding task, especially so when it concerns
qualitative or abstract phenomena. By measurement we mean the process of assigning numbers to
objects or observations, the level of measurement being a function of the rules under which the
numbers are assigned.

It is easy to assign numbers in respect of properties of some objects, but it is relatively difficult in
respect of others. For instance, measuring such things as social conformity, intelligence, or marital
adjustment is much less obvious and requires much closer attention than measuring physical weight,
biological age or a person’s financial assets. In other words, properties like weight, height, etc., can
be measured directly with some standard unit of measurement, but it is not that easy to measure
properties like motivation to succeed, ability to stand stress and the like. We can expect high accuracy
in measuring the length of pipe with a yard stick, but if the concept is abstract and the measurement
tools are not standardized, we are less confident about the accuracy of the results of measurement.

Technically speaking, measurement is a process of mapping aspects of a domain onto other
aspects of a range according to some rule of correspondence. In measuring, we devise some form of
scale in the range (in terms of set theory, range may refer to some set) and then transform or map the
properties of objects from the domain (in terms of set theory, domain may refer to some other set)
onto this scale. For example, in case we are to find the male to female attendance ratio while
conducting a study of persons who attend some show, then we may tabulate those who come to the
show according to sex. In terms of set theory, this process is one of mapping the observed physical
properties of those coming to the show (the domain) on to a sex classification (the range). The rule
of correspondence is: if the object in the domain appears to be male, assign “0”, and if female,
assign “1”. Similarly, we can record a person’s marital status as 1, 2, 3 or 4, depending on whether


the person is single, married, widowed or divorced. We can as well record “Yes or No” answers to
a question as “0” and “1” (or as 1 and 2 or perhaps as 59 and 60). In this artificial or nominal way,
categorical data (qualitative or descriptive) can be made into numerical data and if we thus code the
various categories, we refer to the numbers we record as nominal data. Nominal data are numerical
in name only, because they do not share any of the properties of the numbers we deal with in ordinary
arithmetic. For instance if we record marital status as 1, 2, 3, or 4 as stated above, we cannot write
4 > 2 or 3 < 4 and we cannot write 3 – 1 = 4 – 2, 1 + 3 = 4 or 4 ÷ 2 = 2.
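
The coding described above is easy to make concrete; the category codes below are arbitrary labels, as nominal codes always are:

```python
from collections import Counter

# nominal coding: the numbers are labels only, with no arithmetic meaning
codes = {"single": 1, "married": 2, "widowed": 3, "divorced": 4}

respondents = ["married", "single", "married", "divorced", "married", "widowed"]
data = [codes[r] for r in respondents]
print(data)  # [2, 1, 2, 4, 2, 3]

# counting per category is the only meaningful arithmetic, e.g. the mode:
mode_code, freq = Counter(data).most_common(1)[0]
print(mode_code, freq)  # code 2 ("married") occurs 3 times
# averaging the codes (sum(data) / len(data)) would be meaningless here
```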

In those situations when we cannot do anything except set up inequalities, we refer to the data as
ordinal data. For instance, if one mineral can scratch another, it receives a higher hardness number
and on Mohs’ scale the numbers from 1 to 10 are assigned respectively to talc, gypsum, calcite,
fluorite, apatite, feldspar, quartz, topaz, sapphire and diamond. With these numbers we can write
5 > 2 or 6 < 9 as apatite is harder than gypsum and feldspar is softer than sapphire, but we cannot
write for example 10 – 9 = 5 – 4, because the difference in hardness between diamond and sapphire
is actually much greater than that between apatite and fluorite. It would also be meaningless to say
that topaz is twice as hard as fluorite simply because their respective hardness numbers on Mohs’
scale are 8 and 4. The greater than symbol (i.e., >) in connection with ordinal data may be used to
designate “happier than” “preferred to” and so on.
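
The Mohs example translates directly into code, making clear which operations ordinal data do and do not support:

```python
# Mohs hardness numbers are ordinal: order is meaningful, differences are not
mohs = {"talc": 1, "gypsum": 2, "calcite": 3, "fluorite": 4, "apatite": 5,
        "feldspar": 6, "quartz": 7, "topaz": 8, "sapphire": 9, "diamond": 10}

print(mohs["apatite"] > mohs["gypsum"])     # True: apatite is harder than gypsum
print(mohs["feldspar"] < mohs["sapphire"])  # True: feldspar is softer than sapphire

# 10 - 9 == 5 - 4 holds numerically, yet the hardness gaps are very unequal,
# and 8 / 4 == 2 does not make topaz "twice as hard" as fluorite
print(mohs["diamond"] - mohs["sapphire"] == mohs["apatite"] - mohs["fluorite"])  # True, but meaningless
```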

When in addition to setting up inequalities we can also form differences, we refer to the data as
interval data. Suppose we are given the following temperature readings (in degrees Fahrenheit):
58°, 63°, 70°, 95°, 110°, 126° and 135°. In this case, we can write 110° > 70° or 95° < 135°, which
simply means that 110° is warmer than 70° and that 95° is cooler than 135°. We can also write, for
example 95° – 70° = 135° – 110°, since equal temperature differences are equal in the sense that the
same amount of heat is required to raise the temperature of an object from 70° to 95° or from 110°
to 135°. On the other hand, it would not mean much if we said that 126° is twice as hot as 63°, even
though 126° ÷ 63° = 2. To show the reason, we have only to change to the centigrade scale, where
the first temperature becomes 5/9 (126 – 32) = 52°, the second temperature becomes 5/9 (63 –
32) = 17° and the first figure is now more than three times the second. This difficulty arises from the
fact that Fahrenheit and Centigrade scales both have artificial origins (zeros) i.e., the number 0 of
neither scale is indicative of the absence of whatever quantity we are trying to measure.
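
The temperature argument can be checked numerically with the standard Fahrenheit-to-Centigrade conversion:

```python
def to_celsius(f):
    """Convert degrees Fahrenheit to degrees Centigrade."""
    return 5 / 9 * (f - 32)

# on the Fahrenheit scale 126 is numerically twice 63 ...
print(126 / 63)  # 2.0

# ... but the ratio is not preserved under conversion, because both zeros are arbitrary
print(round(to_celsius(126)), round(to_celsius(63)))   # 52 17
print(to_celsius(126) / to_celsius(63))                # roughly 3.03, not 2
```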

When in addition to setting up inequalities and forming differences we can also form quotients
(i.e., when we can perform all the customary operations of mathematics), we refer to such data as
ratio data. In this sense, ratio data include all the usual measurements (or determinations) of length,
height, money amounts, weight, volume, area, pressures etc.

The above stated distinction between nominal, ordinal, interval and ratio data is important, for the
nature of a set of data may suggest the use of particular statistical techniques*. A researcher has to
be quite alert about this aspect while measuring properties of objects or of abstract concepts.

* When data can be measured in units which are interchangeable e.g., weights (by ratio scales), temperatures (by interval
scales), that data is said to be parametric and can be subjected to most kinds of statistical and mathematical processes. But
when data is measured in units which are not interchangeable, e.g., product preferences (by ordinal scales), the data is said
to be non-parametric and is susceptible only to a limited extent to mathematical and statistical treatment.


MEASUREMENT SCALES

From what has been stated above, we can write that scales of measurement can be considered in
terms of their mathematical properties. The most widely used classification of measurement scales is:
(a) nominal scale; (b) ordinal scale; (c) interval scale; and (d) ratio scale.

(a) Nominal scale: Nominal scale is simply a system of assigning number symbols to events in
order to label them. The usual example of this is the assignment of numbers to basketball players in
order to identify them. Such numbers cannot be considered to be associated with an ordered scale
for their order is of no consequence; the numbers are just convenient labels for the particular class of
events and as such have no quantitative value. Nominal scales provide convenient ways of keeping
track of people, objects and events. One cannot do much with the numbers involved. For example,
one cannot usefully average the numbers on the back of a group of football players and come up with
a meaningful value. Neither can one usefully compare the numbers assigned to one group with the
numbers assigned to another. The counting of members in each group is the only possible arithmetic
operation when a nominal scale is employed. Accordingly, we are restricted to use mode as the
measure of central tendency. There is no generally used measure of dispersion for nominal scales.
Chi-square test is the most common test of statistical significance that can be utilized, and for the
measures of correlation, the contingency coefficient can be worked out.

Nominal scale is the least powerful level of measurement. It indicates no order or distance
relationship and has no arithmetic origin. A nominal scale simply describes differences between
things by assigning them to categories. Nominal data are, thus, counted data. The scale wastes any
information that we may have about varying degrees of attitude, skills, understandings, etc. In spite
of all this, nominal scales are still very useful and are widely used in surveys and other ex-post-facto
research when data are being classified by major sub-groups of the population.

(b) Ordinal scale: The lowest level of the ordered scale that is commonly used is the ordinal scale.
The ordinal scale places events in order, but there is no attempt to make the intervals of the scale
equal in terms of some rule. Rank orders represent ordinal scales and are frequently used in research
relating to qualitative phenomena. A student’s rank in his graduation class involves the use of an
ordinal scale. One has to be very careful in making statements about scores based on ordinal scales.
For instance, if Ram’s position in his class is 10 and Mohan’s position is 40, it cannot be said that
Ram’s position is four times as good as that of Mohan. The statement would make no sense at all.
Ordinal scales only permit the ranking of items from highest to lowest. Ordinal measures have no
absolute values, and the real differences between adjacent ranks may not be equal. All that can be
said is that one person is higher or lower on the scale than another, but more precise comparisons
cannot be made.

Thus, the use of an ordinal scale implies a statement of ‘greater than’ or ‘less than’ (an equality
statement is also acceptable) without our being able to state how much greater or less. The real
difference between ranks 1 and 2 may be more or less than the difference between ranks 5 and 6.
Since the numbers of this scale have only a rank meaning, the appropriate measure of central tendency
is the median. A percentile or quartile measure is used for measuring dispersion. Correlations are
restricted to various rank order methods. Measures of statistical significance are restricted to the
non-parametric methods.

(c) Interval scale: In the case of interval scale, the intervals are adjusted in terms of some rule that
has been established as a basis for making the units equal. The units are equal only in so far as one


accepts the assumptions on which the rule is based. Interval scales can have an arbitrary zero, but it
is not possible to determine for them what may be called an absolute zero or the unique origin. The
primary limitation of the interval scale is the lack of a true zero; it does not have the capacity to
measure the complete absence of a trait or characteristic. The Fahrenheit scale is an example of an
interval scale and illustrates what one can and cannot do with such a scale. One can say that an
increase in temperature from 30° to 40° involves the same increase in temperature as an increase
from 60° to 70°, but one cannot say that the temperature of 60° is twice as warm as the temperature
of 30° because both numbers are dependent on the fact that the zero on the scale is set arbitrarily at
the temperature of the freezing point of water. The ratio of the two temperatures, 30° and 60°,
means nothing because zero is an arbitrary point.

Interval scales provide more powerful measurement than ordinal scales for interval scale also
incorporates the concept of equality of interval. As such more powerful statistical measures can be
used with interval scales. Mean is the appropriate measure of central tendency, while standard
deviation is the most widely used measure of dispersion. Product moment correlation techniques are
appropriate and the generally used tests for statistical significance are the ‘t’ test and ‘F’ test.
(d) Ratio scale: Ratio scales have an absolute or true zero of measurement. The term ‘absolute
zero’ is not as precise as it was once believed to be. We can conceive of an absolute zero of length
and similarly we can conceive of an absolute zero of time. For example, the zero point on a centimeter
scale indicates the complete absence of length or height. But an absolute zero of temperature is
theoretically unobtainable and it remains a concept existing only in the scientist’s mind. The number
of minor traffic-rule violations and the number of incorrect letters in a page of type script represent
scores on ratio scales. Both these scales have absolute zeros and as such all minor traffic violations
and all typing errors can be assumed to be equal in significance. With ratio scales involved one can
make statements like “Jyoti’s” typing performance was twice as good as that of “Reetu.” The ratio
involved does have significance and facilitates a kind of comparison which is not possible in case of
an interval scale.

Ratio scale represents the actual amounts of variables. Measures of physical dimensions such as
weight, height, distance, etc. are examples. Generally, all statistical techniques are usable with ratio
scales and all manipulations that one can carry out with real numbers can also be carried out with
ratio scale values. Multiplication and division can be used with this scale but not with other scales
mentioned above. Geometric and harmonic means can be used as measures of central tendency and
coefficients of variation may also be calculated.
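
Because a ratio scale has a true zero, every ordinary arithmetic manipulation is legitimate; the weights below are invented to show the geometric and harmonic means and the coefficient of variation mentioned above:

```python
import math
from statistics import geometric_mean, harmonic_mean

weights = [2.0, 4.0, 8.0]  # hypothetical ratio-scale measurements (kg)

print(geometric_mean(weights))  # 4.0 (cube root of 2 * 4 * 8 = 64)
print(harmonic_mean(weights))   # 24/7, about 3.43

# the coefficient of variation is meaningful only when zero is a true origin
mean = sum(weights) / len(weights)
sd = math.sqrt(sum((w - mean) ** 2 for w in weights) / len(weights))
print(round(sd / mean, 3))      # about 0.535
```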

Thus, proceeding from the nominal scale (the least precise type of scale) to ratio scale (the most
precise), relevant information is obtained increasingly. If the nature of the variables permits, the
researcher should use the scale that provides the most precise description. Researchers in physical
sciences have the advantage to describe variables in ratio scale form but the behavioural sciences
are generally limited to describe variables in interval scale form, a less precise type of measurement.

Sources of Error in Measurement

Measurement should be precise and unambiguous in an ideal research study. This objective, however,
is often not met with in entirety. As such the researcher must be aware about the sources of error in
measurement. The following are the possible sources of error in measurement.


(a) Respondent: At times the respondent may be reluctant to express strong negative feelings or
it is just possible that he may have very little knowledge but may not admit his ignorance. All this
reluctance is likely to result in an interview of ‘guesses.’ Transient factors like fatigue, boredom,
anxiety, etc. may limit the ability of the respondent to respond accurately and fully.

(b) Situation: Situational factors may also come in the way of correct measurement. Any condition
which places a strain on the interview can have serious effects on the interviewer-respondent rapport.
For instance, if someone else is present, he can distort responses by joining in or merely by being
present. If the respondent feels that anonymity is not assured, he may be reluctant to express certain
feelings.

(c) Measurer: The interviewer can distort responses by rewording or reordering questions. His
behaviour, style and looks may encourage or discourage certain replies from respondents. Careless
mechanical processing may distort the findings. Errors may also creep in because of incorrect coding,
faulty tabulation and/or statistical calculations, particularly in the data-analysis stage.

(d) Instrument: Error may arise because of the defective measuring instrument. The use of complex
words, beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate
space for replies, response choice omissions, etc. are a few things that make the measuring instrument
defective and may result in measurement errors. Another type of instrument deficiency is the poor
sampling of the universe of items of concern.

The researcher must know that correct measurement depends on successfully meeting all of the
problems listed above. He must, to the extent possible, try to eliminate, neutralize or otherwise deal
with all the possible sources of error so that the final results may not be contaminated.

Tests of Sound Measurement

Sound measurement must meet the tests of validity, reliability and practicality. In fact, these are the
three major considerations one should use in evaluating a measurement tool. “Validity refers to the
extent to which a test measures what we actually wish to measure. Reliability has to do with the
accuracy and precision of a measurement procedure ... Practicality is concerned with a wide range
of factors of economy, convenience, and interpretability ...”1 We briefly take up the relevant details
concerning these tests of sound measurement.

1. Test of Validity*

Validity is the most critical criterion and indicates the degree to which an instrument measures what
it is supposed to measure. Validity can also be thought of as utility. In other words, validity is the
extent to which differences found with a measuring instrument reflect true differences among those
being tested. But the question arises: how can one determine validity without direct confirming
knowledge? The answer may be that we seek other relevant evidence that confirms the answers we
have found with our measuring tool. What is relevant evidence often depends upon the nature of the

1 Robert L. Thorndike and Elizabeth Hagen: Measurement and Evaluation in Psychology and Education, 3rd Ed., p. 162.
* Two forms of validity are usually mentioned in research literature viz., the external validity and the internal validity.
External validity of research findings is their generalizability to populations, settings, treatment variables and measurement
variables. We shall talk about it in the context of significance tests later on. The internal validity of a research design is its
ability to measure what it aims to measure. We shall deal with this validity only in the present chapter.


research problem and the judgement of the researcher. But one can certainly consider three types of
validity in this connection: (i) Content validity; (ii) Criterion-related validity and (iii) Construct validity.

(i) Content validity is the extent to which a measuring instrument provides adequate coverage of
the topic under study. If the instrument contains a representative sample of the universe, the content
validity is good. Its determination is primarily judgemental and intuitive. It can also be determined by
using a panel of persons who shall judge how well the measuring instrument meets the standards, but
there is no numerical way to express it.
(ii) Criterion-related validity relates to our ability to predict some outcome or estimate the existence
of some current condition. This form of validity reflects the success of measures used for some
empirical estimating purpose. The concerned criterion must possess the following qualities:

Relevance: (A criterion is relevant if it is defined in terms we judge to be the proper measure.)
Freedom from bias: (Freedom from bias is attained when the criterion gives each subject an equal
opportunity to score well.)
Reliability: (A reliable criterion is stable or reproducible.)
Availability: (The information specified by the criterion must be available.)

In fact, criterion-related validity is a broad term that actually refers to (i) Predictive validity
and (ii) Concurrent validity. The former refers to the usefulness of a test in predicting some future
performance whereas the latter refers to the usefulness of a test in closely relating to other measures
of known validity. Criterion-related validity is expressed as the coefficient of correlation between
test scores and some measure of future performance or between test scores and scores on another
measure of known validity.
(iii) Construct validity is the most complex and abstract. A measure is said to possess construct
validity to the degree that it conforms to predicted correlations with other theoretical propositions.
Construct validity is the degree to which scores on a test can be accounted for by the explanatory
constructs of a sound theory. For determining construct validity, we associate a set of other propositions
with the results received from using our measurement instrument. If measurements on our devised
scale correlate in a predicted way with these other propositions, we can conclude that there is some
construct validity.

If the above stated criteria and tests are met with, we may state that our measuring instrument
is valid and will result in correct measurement; otherwise we shall have to look for more information
and/or resort to exercise of judgement.

2. Test of Reliability

The test of reliability is another important test of sound measurement. A measuring instrument is
reliable if it provides consistent results. A reliable measuring instrument does contribute to validity, but
a reliable instrument need not be a valid instrument. For instance, a scale that consistently overweighs
objects by five kg is a reliable scale, but it does not give a valid measure of weight. The reverse,
however, is true, i.e., a valid instrument is always reliable. Accordingly, reliability is not as valuable as
validity, but it is easier to assess reliability in comparison to validity. If the quality of reliability is
satisfied by an instrument, then while using it we can be confident that the transient and situational
factors are not interfering.


Two aspects of reliability viz., stability and equivalence deserve special mention. The stability
aspect is concerned with securing consistent results with repeated measurements of the same person
and with the same instrument. We usually determine the degree of stability by comparing the results
of repeated measurements. The equivalence aspect considers how much error may get introduced
by different investigators or different samples of the items being studied. A good way to test for the
equivalence of measurements by two investigators is to compare their observations of the same
events. Reliability can be improved in the following two ways:

(i) By standardising the conditions under which the measurement takes place i.e., we must
ensure that external sources of variation such as boredom, fatigue, etc., are minimised to
the extent possible. That will improve stability aspect.

(ii) By carefully designed directions for measurement with no variation from group to group,
by using trained and motivated persons to conduct the research and also by broadening the
sample of items used. This will improve equivalence aspect.
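
The stability aspect is commonly quantified by correlating two rounds of measurement (test-retest); the scores below are invented for illustration:

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# hypothetical test-retest scores for eight respondents on the same instrument
first_round  = [12, 15, 11, 18, 14, 16, 13, 17]
second_round = [13, 14, 11, 19, 15, 15, 12, 18]

r = pearson(first_round, second_round)
print(round(r, 2))  # 0.94 -- a high coefficient indicates good stability
```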

3. Test of Practicality

The practicality characteristic of a measuring instrument can be judged in terms of economy,
convenience and interpretability. From the operational point of view, the measuring instrument ought
to be practical i.e., it should be economical, convenient and interpretable. Economy consideration
suggests that some trade-off is needed between the ideal research project and that which the budget
can afford. The length of measuring instrument is an important area where economic pressures are
quickly felt. Although more items give greater reliability, as stated earlier, in the interest of limiting
the interview or observation time we have to take only a few items for our study purpose. Similarly,
data-collection methods to be used are also dependent at times upon economic factors. Convenience
test suggests that the measuring instrument should be easy to administer. For this purpose one should
give due attention to the proper layout of the measuring instrument. For instance, a questionnaire,
with clear instructions (illustrated by examples), is certainly more effective and easier to complete
than one which lacks these features. Interpretability consideration is especially important when
persons other than the designers of the test are to interpret the results. The measuring instrument, in
order to be interpretable, must be supplemented by (a) detailed instructions for administering the test;
(b) scoring keys; (c) evidence about the reliability; and (d) guides for using the test and for interpreting
results.

TECHNIQUE OF DEVELOPING MEASUREMENT TOOLS

The technique of developing measurement tools involves a four-stage process, consisting of the
following:

(a) Concept development;
(b) Specification of concept dimensions;
(c) Selection of indicators; and
(d) Formation of index.

The first and foremost step is that of concept development which means that the researcher
should arrive at an understanding of the major concepts pertaining to his study. This step of concept


development is more apparent in theoretical studies than in the more pragmatic research, where the
fundamental concepts are often already established.

The second step requires the researcher to specify the dimensions of the concepts that he
developed in the first stage. This task may either be accomplished by deduction i.e., by adopting a
more or less intuitive approach or by empirical correlation of the individual dimensions with the total
concept and/or the other concepts. For instance, one may think of several dimensions such as product
reputation, customer treatment, corporate leadership, concern for individuals, sense of social
responsibility and so forth when one is thinking about the image of a certain company.

Once the dimensions of a concept have been specified, the researcher must develop indicators
for measuring each concept element. Indicators are specific questions, scales, or other devices by
which respondent’s knowledge, opinion, expectation, etc., are measured. As there is seldom a perfect
measure of a concept, the researcher should consider several alternatives for the purpose. The use
of more than one indicator gives stability to the scores and it also improves their validity.

The last step is that of combining the various indicators into an index, i.e., formation of an
index. When we have several dimensions of a concept or different measurements of a dimension,
we may need to combine them into a single index. One simple way for getting an overall index is to
provide scale values to the responses and then sum up the corresponding scores. Such an overall
index would provide a better measurement tool than a single indicator because of the fact that an
“individual indicator has only a probability relation to what we really want to know.”2 This way we
must obtain an overall index for the various concepts concerning the research study.
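As a rough sketch of index formation, scale values can be assigned to the responses on each indicator and then summed. The scale labels and responses below are hypothetical:

```python
# Index formation sketch: assign scale values to responses on several
# indicators of one concept, then sum them into a single overall index.
SCALE = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
         "agree": 4, "strongly agree": 5}

# One respondent's answers to three indicators of the same concept
responses = ["agree", "strongly agree", "neutral"]

index = sum(SCALE[r] for r in responses)
print(index)  # 4 + 5 + 3 = 12
```

Using more than one indicator in this way stabilises the score, since any single indicator has only a probability relation to the concept being measured.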

Scaling

In research we quite often face a measurement problem (since we want a valid measurement but may
not obtain it), especially when the concepts to be measured are complex and abstract and we do not
possess standardised measurement tools. Alternatively, we can say that while measuring attitudes
and opinions, we face the problem of their valid measurement. A similar problem may be faced by a
researcher, though to a lesser degree, while measuring physical or institutional concepts. As such
we should study some procedures which may enable us to measure abstract concepts more accurately.
This brings us to the study of scaling techniques.

Meaning of Scaling

Scaling describes the procedures of assigning numbers to various degrees of opinion, attitude and
other concepts. This can be done in two ways viz., (i) making a judgement about some characteristic
of an individual and then placing him directly on a scale that has been defined in terms of that
characteristic and (ii) constructing questionnaires in such a way that the score of individual’s responses
assigns him a place on a scale. It may be stated here that a scale is a continuum, consisting of the
highest point (in terms of some characteristic e.g., preference, favourableness, etc.) and the lowest
point along with several intermediate points between these two extreme points. These scale-point
positions are so related to each other that when the first point happens to be the highest point, the
second point indicates a higher degree in terms of a given characteristic as compared to the third

2 Lazersfeld, Evidence and Inference, p. 112.


point and the third point indicates a higher degree as compared to the fourth and so on. Numbers for
measuring the distinctions of degree in the attitudes/opinions are, thus, assigned to individuals
corresponding to their scale-positions. All this is better understood when we talk about scaling
technique(s). Hence the term ‘scaling’ is applied to the procedures for attempting to determine
quantitative measures of subjective abstract concepts. Scaling has been defined as a “procedure for
the assignment of numbers (or other symbols) to a property of objects in order to impart some of the
characteristics of numbers to the properties in question.”3

Scale Classification Bases

The number assigning procedures or the scaling procedures may be broadly classified on one or
more of the following bases: (a) subject orientation; (b) response form; (c) degree of subjectivity;
(d) scale properties; (e) number of dimensions and (f) scale construction techniques. We take up
each of these separately.

(a) Subject orientation: Under it a scale may be designed to measure characteristics of the respondent
who completes it or to judge the stimulus object which is presented to the respondent. In respect of
the former, we presume that the stimuli presented are sufficiently homogeneous so that the between-
stimuli variation is small as compared to the variation among respondents. In the latter approach, we
ask the respondent to judge some specific object in terms of one or more dimensions and we presume
that the between-respondent variation will be small as compared to the variation among the different
stimuli presented to respondents for judging.

(b) Response form: Under this we may classify the scales as categorical and comparative.
Categorical scales are also known as rating scales. These scales are used when a respondent scores
some object without direct reference to other objects. Under comparative scales, which are also
known as ranking scales, the respondent is asked to compare two or more objects. In this sense the
respondent may state that one object is superior to the other or that three models of pen rank in order
1, 2 and 3. The essence of ranking is, in fact, a relative comparison of a certain property of two or
more objects.

(c) Degree of subjectivity: With this basis the scale data may be based on whether we measure
subjective personal preferences or simply make non-preference judgements. In the former case, the
respondent is asked to choose which person he favours or which solution he would like to see
employed, whereas in the latter case he is simply asked to judge which person is more effective in
some aspect or which solution will take fewer resources without reflecting any personal preference.

(d) Scale properties: Considering scale properties, one may classify the scales as nominal, ordinal,
interval and ratio scales. Nominal scales merely classify without indicating order, distance or unique
origin. Ordinal scales indicate magnitude relationships of ‘more than’ or ‘less than’, but indicate no
distance or unique origin. Interval scales have both order and distance values, but no unique origin.
Ratio scales possess all these features.

(e) Number of dimensions: In respect of this basis, scales can be classified as ‘unidimensional’
and ‘multidimensional’ scales. Under the former we measure only one attribute of the respondent or
object, whereas multidimensional scaling recognizes that an object might be described better by using
the concept of an attribute space of ‘n’ dimensions, rather than a single-dimension continuum.

3 Bernard S. Phillips, Social Research Strategy and Tactics, 2nd ed., p. 205.


(f) Scale construction techniques: Following are the five main techniques by which scales can
be developed.

(i) Arbitrary approach: It is an approach where scale is developed on ad hoc basis. This is
the most widely used approach. It is presumed that such scales measure the concepts for
which they have been designed, although there is little evidence to support such an assumption.

(ii) Consensus approach: Here a panel of judges evaluate the items chosen for inclusion in
the instrument in terms of whether they are relevant to the topic area and unambiguous in
implication.

(iii) Item analysis approach: Under it a number of individual items are developed into a test
which is given to a group of respondents. After administering the test, the total scores are
calculated for every one. Individual items are then analysed to determine which items
discriminate between persons or objects with high total scores and those with low scores.

(iv) Cumulative scales are chosen on the basis of their conforming to some ranking of items
with ascending and descending discriminating power. For instance, in such a scale the
endorsement of an item representing an extreme position should also result in the
endorsement of all items indicating a less extreme position.

(v) Factor scales may be constructed on the basis of intercorrelations of items which indicate
that a common factor accounts for the relationship between items. This relationship is
typically measured through factor analysis method.
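The item analysis approach in (iii) can be sketched as follows: compute total scores, split respondents into high and low scorers, and keep the items whose means differ between the two groups. All data below are hypothetical:

```python
# Item analysis sketch: items that separate high-scoring from low-scoring
# respondents discriminate well; items with no separation are candidates to drop.
scores = [        # each row: one respondent's scores on three items
    [5, 4, 3],
    [4, 5, 3],
    [2, 1, 3],
    [1, 2, 3],
]
totals = [sum(row) for row in scores]
order = sorted(range(len(scores)), key=lambda i: totals[i])
low, high = order[:2], order[-2:]          # bottom and top halves by total score

discrimination = []
for item in range(3):
    mean_high = sum(scores[i][item] for i in high) / len(high)
    mean_low = sum(scores[i][item] for i in low) / len(low)
    discrimination.append(mean_high - mean_low)

print(discrimination)  # the third item does not discriminate at all
```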

Important Scaling Techniques

We now take up some of the important scaling techniques often used in the context of research,
especially social or business research.

Rating scales: The rating scale involves qualitative description of a limited number of aspects of a
thing or of traits of a person. When we use rating scales (or categorical scales), we judge an object
in absolute terms against some specified criteria i.e., we judge properties of objects without reference
to other similar objects. These ratings may be in such forms as “like–dislike”, “above average, average,
below average”, or other classifications with more categories such as “like very much—like
somewhat—neutral—dislike somewhat—dislike very much”; “excellent—good—average—below
average—poor”; “always—often—occasionally—rarely—never”, and so on. There is no specific
rule on whether to use a two-point scale, a three-point scale or a scale with still more points. In practice,
three- to seven-point scales are generally used, for the simple reason that more points on a scale
provide an opportunity for greater sensitivity of measurement.

Rating scale may be either a graphic rating scale or an itemized rating scale.

(i) The graphic rating scale is quite simple and is commonly used in practice. Under it the
various points are usually put along a line to form a continuum and the rater indicates his
rating by simply making a mark (such as ✓) at the appropriate point on a line that runs from
one extreme to the other. Scale-points with brief descriptions may be indicated along the
line, their function being to assist the rater in performing his job. The following is an example
of a five-point graphic rating scale used when we wish to ascertain people’s liking or disliking
of a product:


How do you like the product?
(Please check)

Like very much — Like somewhat — Neutral — Dislike somewhat — Dislike very much

Fig. 5.1

This type of scale has several limitations. The respondents may check at almost any
position along the line, which may increase the difficulty of analysis. The meanings of
terms like “very much” and “somewhat” may depend upon the respondent’s frame of
reference, so much so that the statement might be challenged in terms of its equivalency.
Several other rating scale variants (e.g., boxes replacing the line) may also be used.

(ii) The itemized rating scale (also known as numerical scale) presents a series of statements
from which a respondent selects one as best reflecting his evaluation. These statements
are ordered progressively in terms of more or less of some property. An example of itemized
scale can be given to illustrate it.

Suppose we wish to inquire how well a worker gets along with his fellow workers. In
such a situation we may ask the respondent to select one of the following statements to express his opinion:

• He is almost always involved in some friction with a fellow worker.

• He is often at odds with one or more of his fellow workers.

• He sometimes gets involved in friction.

• He infrequently becomes involved in friction with others.

• He almost never gets involved in friction with fellow workers.

The chief merit of this type of scale is that it provides more information and meaning to the rater,
and thereby increases reliability. This form is relatively difficult to develop and the statements may
not say exactly what the respondent would like to express.

Rating scales have certain good points. The results obtained from their use compare favourably
with alternative methods. They require less time, are interesting to use and have a wide range of
applications. Besides, they may also be used with a large number of properties or variables. But their
value for measurement purposes depends upon the assumption that the respondents can and do
make good judgements. If the respondents are not very careful while rating, errors may occur. Three
types of errors are common viz., the error of leniency, the error of central tendency and the error of
halo effect. The error of leniency occurs when certain respondents are either easy raters or hard
raters. When raters are reluctant to give extreme judgements, the result is the error of central
tendency. The error of halo effect, or systematic bias, occurs when the rater carries over a
generalised impression of the subject from one rating to another. This sort of error takes place when
we conclude, for example, that a particular report is good because we like its form, or that someone is
intelligent because he agrees with us or has a pleasing personality. In other words, the halo effect is
likely to appear when the rater is asked to rate many factors, on a number of which he has no
evidence for judgement.


Ranking scales: Under ranking scales (or comparative scales) we make relative judgements
against other similar objects. The respondents under this method directly compare two or more
objects and make choices among them. There are two generally used approaches to ranking scales, viz.

(a) Method of paired comparisons: Under it the respondent can express his attitude by making a
choice between two objects, say between a new flavour of soft drink and an established brand of
drink. But when there are more than two stimuli to judge, the number of judgements required in a
paired comparison is given by the formula:

N = n(n − 1)/2

where N = number of judgements

n = number of stimuli or objects to be judged.

For instance, if there are ten suggestions for bargaining proposals available to a workers union, there
are 45 paired comparisons that can be made with them. When N happens to be a big figure, there is
the risk of respondents giving ill-considered answers or they may even refuse to answer. We can
reduce the number of comparisons per respondent either by presenting to each one of them only a
sample of stimuli or by choosing a few objects which cover the range of attractiveness at about equal
intervals and then comparing all other stimuli to these few standard objects. Thus, paired-comparison
data may be treated in several ways. If there is substantial consistency, we will find that if X is
preferred to Y, and Y to Z, then X will consistently be preferred to Z. If this is true, we may take the
total number of preferences among the comparisons as the score for that stimulus.
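The count of required judgements can be checked in a line of code (a sketch; the function name is ours, not the book's):

```python
# Number of paired comparisons needed for n stimuli: N = n(n - 1)/2
def paired_comparisons(n: int) -> int:
    return n * (n - 1) // 2

print(paired_comparisons(10))  # the ten bargaining suggestions -> 45 pairs
```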

It should be remembered that paired comparison provides ordinal data, but the same may be
converted into an interval scale by the method of the Law of Comparative Judgement developed by
L.L. Thurstone. This technique involves the conversion of frequencies of preferences into a table of
proportions which are then transformed into Z matrix by referring to the table of area under the
normal curve. J.P. Guilford in his book “Psychometric Methods” has given a procedure which is
relatively easier. The method is known as the Composite Standard Method and can be illustrated as
under:

Suppose there are four proposals which some union bargaining committee is considering. The
committee wants to know how the union membership ranks these proposals. For this purpose
a sample of 100 members might express the views as shown in the following table:

Table 5.1: Response Patterns of 100 Members’ Paired Comparisons of
4 Suggestions for Union Bargaining Proposal Priorities

Suggestion

A    B    C    D

A – 65* 32 20
B 40 – 38 42
C 45 50 – 70
D 80 20 98 –

TOTAL: 165 135 168 132

*Read as 65 members preferred suggestion B to suggestion A. Contd.


              A        B        C        D
Rank order    2        3        1        4
Mp            0.5375   0.4625   0.5450   0.4550
Zj            0.09     –0.09    0.11     –0.11
Rj            0.20     0.02     0.22     0.00

Comparing the total number of preferences for each of the four proposals, we find that C is the
most popular, followed by A, B and D respectively. The rank order shown in the above table reflects this.

By following the composite standard method, we can develop an interval scale from the paired-
comparison ordinal data given in the above table, for which purpose we have to adopt the following
steps in order:

(i) Using the data in the above table, we work out the column mean with the help of the
formula given below:

    Mp = (C + .5N) / (nN)

    For suggestion A: Mp = (165 + .5(100)) / (4(100)) = .5375

where

Mp = the mean proportion of the columns
C = the total number of choices for a given suggestion

n = number of stimuli (proposals in the given problem)

N = number of items in the sample.

The column means have been shown in the Mp row in the above table.
(ii) The Z values for the Mp are secured from the table giving the area under the normal curve.
When the Mp value is less than .5, the Z value is negative, and for all Mp values higher than
.5, the Z values are positive.* These Z values are shown in the Zj row in the above table.

(iii) As the Z values represent an interval scale, zero is an arbitrary value. Hence we can
eliminate negative scale values by giving the value of zero to the lowest scale value (this
being –0.11 in our example, which we shall take as equal to zero) and then adding the absolute
value of this lowest scale value to all other scale items. This scale has been shown in the Rj
row in the above table.

Graphically we can show the interval scale that we have derived from the paired-comparison
data using the composite standard method as follows:

    D  B                    A  C
    |______|______|______|______|
   0.0    0.1    0.2    0.3    0.4

Fig. 5.2

(D = 0.00, B = 0.02, A = 0.20 and C = 0.22 on the derived interval scale.)

* To use the normal curve area table for this sort of transformation, we must subtract 0.5 from all Mp values which exceed
.5 to secure the values with which to enter the normal curve area table, from which Z values can be obtained. For all Mp values
of less than .5 we must subtract such values from 0.5 to secure the values with which to enter the table, but the Z values
in this situation carry a negative sign.
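The three steps above can be reproduced directly from the column totals of Table 5.1. The sketch below substitutes the standard library's inverse normal for the printed area table; rounding follows the table's two decimal places:

```python
# Composite Standard Method applied to Table 5.1's column totals.
# Mp = (C + .5N)/(nN); Zj = inverse normal of Mp; Rj shifts the lowest Zj to zero.
from statistics import NormalDist

totals = {"A": 165, "B": 135, "C": 168, "D": 132}  # choices per suggestion
n, N = 4, 100                                      # 4 proposals, 100 members

Mp = {k: (c + 0.5 * N) / (n * N) for k, c in totals.items()}
Zj = {k: round(NormalDist().inv_cdf(m), 2) for k, m in Mp.items()}
Rj = {k: round(z - min(Zj.values()), 2) for k, z in Zj.items()}

print(Mp["A"], Zj["C"], Rj)  # matches the Mp, Zj and Rj rows of the table
```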


(b) Method of rank order: Under this method of comparative scaling, the respondents are asked
to rank their choices. This method is easier and faster than the method of paired comparisons stated
above. For example, with 10 items it takes 45 paired comparisons to complete the task, whereas the
method of rank order simply requires a ranking of the 10 items. The problem of intransitivity (such as A
preferred to B, B to C, but C preferred to A) also does not arise in case we adopt the method of rank order.
Moreover, a complete ranking at times is not needed in which case the respondents may be asked to
rank only their first, say, four choices while the number of overall items involved may be more than
four, say, it may be 15 or 20 or more. To secure a simple ranking of all items involved we simply total
rank values received by each item. There are methods through which we can as well develop an
interval scale of these data. But then there are limitations of this method. The first one is that data
obtained through this method are ordinal data and hence rank ordering is an ordinal scale with all its
limitations. Then there may be the problem of respondents becoming careless in assigning ranks
particularly when there are many (usually more than 10) items.
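To secure a simple overall ranking, as described above, we total the rank values each item receives. A minimal sketch with hypothetical respondents and items:

```python
# Method of rank order: total the rank values received by each item;
# the smallest total indicates the most preferred item (hypothetical data).
rankings = [                     # each dict: one respondent's ranking of three items
    {"P": 1, "Q": 2, "R": 3},
    {"P": 2, "Q": 1, "R": 3},
    {"P": 1, "Q": 3, "R": 2},
]
totals = {item: sum(r[item] for r in rankings) for item in ("P", "Q", "R")}
overall = sorted(totals, key=totals.get)   # overall ranking, best first

print(totals, overall)
```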

Scale Construction Techniques

In social science studies, while measuring attitudes of the people we generally follow the technique
of preparing the opinionnaire* (or attitude scale) in such a way that the score of the individual
responses assigns him a place on a scale. Under this approach, the respondent expresses his
agreement or disagreement with a number of statements relevant to the issue. While developing
such statements, the researcher must note the following two points:

(i) That the statements must elicit responses which are psychologically related to the attitude
being measured;

(ii) That the statements need to be such that they discriminate not merely between extremes of
attitude but also among individuals who differ slightly.

Researchers must as well be aware that inferring attitude from what has been recorded in
opinionnaires has several limitations. People may conceal their attitudes and express socially acceptable
opinions. They may not really know how they feel about a social issue. People may be unaware of
their attitude about an abstract situation; until confronted with a real situation, they may be unable to
predict their reaction. Even behaviour itself is at times not a true indication of attitude. For instance,
when politicians kiss babies, their behaviour may not be a true expression of affection toward infants.
Thus, there is no sure method of measuring attitude; we only try to measure the expressed opinion
and then draw inferences from it about people’s real feelings or attitudes.

With all these limitations in mind, psychologists and sociologists have developed several scale
construction techniques for the purpose. The researcher should know these techniques so as to
develop an appropriate scale for his own study. Some of the important approaches, along with the
corresponding scales developed under each approach to measure attitude are as follows:

* An information form that attempts to measure the attitude or belief of an individual is known as opinionnaire.

