Published by tsegayegeze, 2016-05-17 07:57:39

Practical Statistics for the Textile Industry, Part I

PROBLEMS FOR CHAPTER 5

1. A very large batch of items is submitted for inspection. The proportion of defectives in the batch is p, and items are inspected at random until exactly r defectives have been found. Show that the probability that n items will then have been inspected is

P(n) = (n-1 choose r-1) p^r (1-p)^(n-r).

2. A company makes raincoats and claims that 95% of its coats will pass a standard waterproof test. Fifty coats are chosen at random from the production line. Assuming that the firm's claim is correct, find
(a) the average number of coats that would be expected to fail;
(b) the probability that more than one coat will fail the test;
(c) the probability that six or more coats will fail the test.
If (c) actually happened, what would you conclude about the company's claim? Give the reason for your conclusion.

3. In a certain factory, the machines tend to break down at random, and the average number of breakdowns per day is 2.1. The mechanic can deal with up to three breakdowns during a normal working day. If more than three machines break down, the mechanic has to work overtime.
(a) What is the probability that the mechanic will have to work overtime on a particular day?
(b) What is the most probable number of stoppages in a day?
(c) If, on a certain day, six machines broke down, what would you conclude, and why?

4. A roll of fabric contains, on average, eight defects scattered at random in 100 m². Pieces of fabric, 5 m by 2 m, are cut from the roll. What is the probability that
(a) a piece will contain no defects;
(b) a piece will contain more than one defect;
(c) a piece will contain fewer than two defects;
(d) five pieces selected at random will all be free from defects.

5. A manufacturer sells articles in lots of 100 and agrees to pay the purchaser a penalty of £10 if five or more articles in a lot are defective. He knows from past experience that, on average, 2% of the articles are defective. It costs £2 to make each article. What should the selling price be if the manufacturer's over-all profit is to be 25%?

6. An experiment is carried out to compare two softening agents A and B. The treatments are applied to a fabric, and an assessor then states which of A and B he thinks is the more effective. The procedure is repeated with 20 different fabrics. What would you conclude about the relative effectiveness of A and B if
(a) A was preferred 12 times and B was preferred 8 times;
(b) A was preferred 15 times and B was preferred 5 times?

7. The distribution of the masses of a certain type of garment is known to be approximately Normal with a standard deviation of 6 g. It is required that not more than 1% of the garments produced should have a mass less than 213 g. What should be the average mass of the garments, correct to the nearest gram? Assuming the garments are made with this average mass, what proportion of garments will have masses between 218 g and 233 g?

8. A mass of 150 g was hung in turn from 250 pieces of a certain yarn, and 12 of the pieces broke. A load of 200 g was then hung from the previously unbroken pieces, and a further 213 pieces broke. Assuming that yarn strength is distributed Normally, estimate the mean and standard deviation of the yarn's strength.

9. The weekly orders for a certain garment have a Normal distribution, with a mean of 820 dozen and a standard deviation of 50 dozen. At the start of each week, a number of garments are placed in store to meet the week's orders.
(a) How many dozen should be in the store at the start of a week in order to be 97% certain of being able to meet the week's orders?
(b) If the maximum capacity of the store is 880 dozen, and this capacity is used to the full, what is the probability of failing to meet all the week's orders?

10. A company makes knitted garments, and any garment of mass less than 250 g is rejected. It is known that when a mass of x grams of yarn is used in the knitting process, the resulting finished garments have masses that follow a Normal distribution with a mean of 0.9x grams and a standard deviation of 10 grams. Suppose P(x) is the proportion of acceptable garments when a mass x is used at knitting. For x = 290, 295, ..., 300, calculate P(x). The cost of yarn is 2p per gram, and the cost of knitting and overheads is £3 per garment. If the cost of producing rejected garments is borne equally by all acceptable garments, show that the average cost of an acceptable garment when x grams of yarn are used at knitting is

(0.02x + 3)/P(x) pounds.

Hence find the value of x (correct to the nearest 5 g) that minimizes this cost.
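Several of these answers can be checked numerically. A sketch in Python (standard library only), using the figures from Problems 2, 3, and 7 above:

```python
from math import comb, exp, factorial
from statistics import NormalDist

# Problem 2: number of failing coats ~ Binomial(n=50, p=0.05)
n, p = 50, 0.05
mean_fails = n * p  # (a) expected number of failures
p_more_than_one = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in (0, 1))
p_six_or_more = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))

# Problem 3: breakdowns per day ~ Poisson(2.1); overtime if more than 3
m = 2.1
p_overtime = 1 - sum(exp(-m) * m**k / factorial(k) for k in range(4))

# Problem 7: mass ~ Normal(mu, 6); choose mu so that P(mass < 213 g) = 0.01
z01 = NormalDist().inv_cdf(0.01)  # standard normal deviate, about -2.326
mu = 213 - 6 * z01                # required average mass, about 227 g

print(round(mean_fails, 1), round(p_more_than_one, 3),
      round(p_six_or_more, 3), round(p_overtime, 3), round(mu))
```

The expected number of failing coats is 2.5; overtime is needed on roughly 16% of days; and the average garment mass must be about 227 g.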

6.1 Repeated Sampling

Suppose a technologist needs to estimate the mean breaking strength of a consignment of yarn. To do this he carries out, say, 100 standard strength tests and calculates their mean, x̄. The 100 tests can be regarded as a sample chosen at random from the population of strength tests that could be carried out on the consignment, and the technologist's real objective, of course, is to estimate the average strength of all the yarn in the consignment, i.e., the population average. The intuitive thing to do would be to regard the sample average, x̄, as the best available single estimate of the population mean, μ, and it can be shown that in this case intuition is correct. But how good an estimate is it? Is it accurate? Is it precise?

These are questions that arise because experience tells us that, if the experiment were repeated, i.e., if another 100 strength tests were carried out, the mean of this second sample is very unlikely to be equal to the mean of the first sample, thus giving us a different estimate of the population mean. In other words, we realize that the averages of repeated samples chosen from the same population will vary, and we cannot begin to answer questions like those posed at the end of the last paragraph until we know how they vary. What is needed is a knowledge of the distribution of sample averages in repeated sampling; this is called the sampling distribution of the mean.

The calculation of the form of the sampling distribution of the mean is a special case of a general basic problem in mathematical statistics, the solution of which can be difficult and is outside the scope of this book. However, the theory does provide several results that are useful in practical applications, and some of them will be given in the next few sections.

6.2 The Mean and Variance of a Function of Random Variables

The general problem referred to above can be stated as follows. Suppose we are interested in a quantity z, whose value can be found when the values of several independent* random variables x1, x2, ..., xk are known, i.e., z is a function of these variables:

z = f(x1, x2, ..., xk).   (6.1)

For example, if x1 is the linear density of a yarn (in tex) and x2 the fibre density (in g/cm³), then the yarn diameter z (in cm) can be estimated from the equation

z = 4.44 × 10⁻³ √(x1/x2).   (6.2)

Now suppose an experiment consists in measuring each of the x's and calculating the value of z. If this experiment is repeated many times, a sequence of z values will be generated, which, in general, will vary from repeat to repeat and which, therefore, will have a distribution. This distribution is called the sampling distribution of z, and the general problem is to calculate the form of the sampling distribution when the distributions of the x's are given.

* Roughly speaking, random variables are independent when the value of any one of them is not influenced by the values of the others.

As stated at the end of Section 6.1, the general solution to this problem is difficult, but at least the mean μz and the variance σz² of the z values can fairly easily be found. Suppose the distributions of x1, x2, ..., xk have means μ1, μ2, ..., μk and variances σ1², σ2², ..., σk². Then it can be shown that

μz = f(μ1, μ2, ..., μk),   (6.3)

and that

σz² = (∂f/∂x1)² σ1² + (∂f/∂x2)² σ2² + ... + (∂f/∂xk)² σk²,   (6.4)

where the partial derivatives are evaluated at x1 = μ1, x2 = μ2, ..., xk = μk. These equations are exactly true when the function (6.1) is linear, i.e., of the form

z = a0 + a1x1 + a2x2 + ... + akxk,

where the a's are constants, but they can also be used, with sufficient accuracy for most practical purposes, for non-linear functions when the coefficients of variation of the x's are less than 15%.

Example 6.1  Consider the function of Equation (6.2), which can be written

z = k x1^(1/2) x2^(-1/2),

where k is the constant 4.44 × 10⁻³. Equation (6.3) gives immediately

μz = k μ1^(1/2) μ2^(-1/2).

We then find

∂z/∂x1 = ½ k x1^(-1/2) x2^(-1/2),
∂z/∂x2 = -½ k x1^(1/2) x2^(-3/2).

Putting x1 = μ1, x2 = μ2 in these expressions and using Equation (6.4) gives

σz² = (k²/(4μ1μ2)) σ1² + (k²μ1/(4μ2³)) σ2².

Dividing both sides of this equation by μz², we find that

Cz² = ¼(C1² + C2²),

where C1, C2, and Cz are the coefficients of variation of x1, x2, and z, respectively.
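The coefficient-of-variation result of Example 6.1 can be verified by simulation. A sketch in Python; the population values chosen below for the linear density and fibre density are illustrative assumptions, not figures from the text:

```python
import random
from math import sqrt

k = 4.44e-3  # constant in z = k*sqrt(x1/x2), Equation (6.2)

# Assumed population parameters: x1 = linear density (tex), x2 = fibre density (g/cm^3)
mu1, s1 = 30.0, 1.5    # C1 = 5%
mu2, s2 = 1.31, 0.05   # C2 = 3.8%

# Analytic result from Equation (6.4): Cz^2 = (C1^2 + C2^2)/4
cz_analytic = sqrt(((s1 / mu1) ** 2 + (s2 / mu2) ** 2) / 4)

# Monte Carlo check: measure x1 and x2 repeatedly and compute z each time
random.seed(1)
zs = [k * sqrt(random.gauss(mu1, s1) / random.gauss(mu2, s2))
      for _ in range(200_000)]
mz = sum(zs) / len(zs)
sz = sqrt(sum((z - mz) ** 2 for z in zs) / (len(zs) - 1))
cz_simulated = sz / mz

print(round(cz_analytic, 4), round(cz_simulated, 4))
```

The simulated coefficient of variation of the diameter agrees closely with the analytic value of about 3.1%, even though Equation (6.2) is non-linear, because the x's have coefficients of variation well under 15%.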

6.3 Some Special Cases

There are several special cases of the function (6.1) that arise frequently in practice.

6.3.1 Linear Functions

For functions of the form

z = a0 + a1x1 + a2x2 + ... + akxk,   (6.5)

where the a's are constants, Equations (6.3) and (6.4) are exactly true and give

μz = a0 + a1μ1 + a2μ2 + ... + akμk   (6.6)

and

σz² = a1²σ1² + a2²σ2² + ... + ak²σk².   (6.7)

Two important examples of linear functions are the sum and difference of two variables. For the sum

z = x1 + x2

(a1 = a2 = 1), we find

μz = μ1 + μ2
and
σz² = σ1² + σ2²,

while for the difference

z = x1 - x2

(a1 = 1, a2 = -1) we get

μz = μ1 - μ2
and again
σz² = σ1² + σ2².

Note that the variances add in both cases.

6.3.3 Quotients

If

z = x1/x2,

we then find ∂z/∂x1 = 1/x2 and ∂z/∂x2 = -x1/x2², and therefore, from Equations (6.3) and (6.4),

μz = μ1/μ2
and
Cz² = C1² + C2².
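The key point that the variances add for both the sum and the difference can be checked by simulation. A Python sketch using two Normal variables with illustrative parameters (the means and standard deviations below are assumptions for the demonstration):

```python
import random

random.seed(0)
mu1, s1 = 20.0, 0.5   # first variable
mu2, s2 = 40.0, 1.0   # second variable

pairs = [(random.gauss(mu1, s1), random.gauss(mu2, s2)) for _ in range(100_000)]

def mean_var(z):
    """Sample mean and variance (divisor n-1) of a list of values."""
    m = sum(z) / len(z)
    return m, sum((x - m) ** 2 for x in z) / (len(z) - 1)

m_sum, v_sum = mean_var([a + b for a, b in pairs])
m_diff, v_diff = mean_var([a - b for a, b in pairs])

# Means follow the signs; both variances are close to s1^2 + s2^2 = 1.25
print(round(m_sum, 2), round(v_sum, 3))
print(round(m_diff, 2), round(v_diff, 3))
```

The means come out near 60 and -20 as Equation (6.6) predicts, while both variances are near 1.25: subtracting a variable does not subtract its variance.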

6. Sampling Distributions

We have seen how to estimate the mean and variance of a function of several random variables, but so far nothing has been said about the form of the distribution of z. In general, this can be difficult to find, but there is one general theorem that holds for linear functions and is of great importance in statistics. This is the central-limit theorem, which states that the distribution of a linear function of k independent random variables will tend to be normal, almost irrespective of the forms of the distributions of the x's, provided that none of the variances is large compared with the sum of the others. The larger the number of variables in the function, i.e., the greater k is, the closer will be the approach to normality.

This theorem explains a number of results that have been stated in previous chapters. The approach of the binomial distribution to the normal distribution as the sample size increases, noted at the end of Chapter 5, is an example of the working of the central-limit theorem. It also explains why many practical distributions seem to be fitted well by the normal distribution. The reason is that many such measurements can be regarded as being made up of a large number of independent components. Consider, for example, the yarn-count data in Table 2.3, whose histogram is shown in Fig. 2.3 and was shown in Section 5.7 to be well fitted by a normal curve. The count measurements will vary because of errors of measurement (i.e., errors in measuring lengths and masses of pieces of yarn), errors due to sampling (i.e., they will depend on which pieces of yarn are actually chosen to be measured), variations in raw materials, etc. In other words, the values are affected by many independent sources of error. Now, if each independent error is relatively small, it is possible to show that any individual observation can be written in the form

x = μ + e,

where μ is the 'true' or exact value and e is an error made up of a linear combination of the independent errors, i.e.,

e = e1 + e2 + ... + ek.

This is just the kind of linear function to which the central-limit theorem refers, and thus we should expect a normal distribution in these circumstances. Note that this does not necessarily imply that there is anything wrong with data that do not have a normal distribution.

6.5 The Sampling Distribution of the Mean

We now return to the sampling distribution of the mean, with which this chapter was introduced. Suppose the population from which the samples are chosen has mean μ and variance σ². The mean of a sample of size n is given by

x̄ = (x1 + x2 + ... + xn)/n,

which shows that x̄ is a linear function of the x's, like that in Equation (6.5) with

a1 = a2 = ... = an = 1/n.

The properties of the sampling distribution of x̄ can now be deduced from the results given in the last two sections.

(a) The mean of the sampling distribution of x̄ is given exactly (since we are dealing with a linear function) by putting a1 = a2 = ... = an = 1/n in Equation (6.6). Thus

μx̄ = (μ1 + μ2 + ... + μn)/n,

but, since the x's all come from the same population, μ1 = μ2 = ... = μn = μ, and therefore

μx̄ = μ.

(b) The variance of the sampling distribution comes from Equation (6.7) and is

σx̄² = (σ1² + σ2² + ... + σn²)/n² = σ²/n,

since σ1² = σ2² = ... = σn² = σ². Therefore the standard deviation of the sampling distribution of the mean is

σx̄ = σ/√n.

This standard deviation is often called the standard error of the mean, but it should be remembered that it is simply a standard deviation.

(c) Since x̄ is a linear function of the x's, the central-limit theorem can be applied, and this tells us that the sampling distribution of the mean will tend to be normal. The larger the sample size, the closer the approach to normality will be. However, the tendency to normality is very rapid, provided that the original distribution is not too skew, and the means of quite small samples, of the order of 4 or 5, can often be treated as though they were normally distributed. The situation is illustrated in Fig. 6.1.
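These three properties can be seen in a small simulation: even for samples of size 5 drawn from a markedly skew population (here an exponential distribution with mean 1 and standard deviation 1, an assumed example), the sample means centre on μ with standard deviation close to σ/√n. A sketch in Python:

```python
import random
from math import sqrt

random.seed(2)
n, repeats = 5, 50_000  # sample size and number of repeated samples

# Draw repeated samples of size n from a skew population (exponential,
# mean 1, sd 1) and record the mean of each sample.
means = [sum(random.expovariate(1.0) for _ in range(n)) / n
         for _ in range(repeats)]

grand = sum(means) / repeats
se = sqrt(sum((m - grand) ** 2 for m in means) / (repeats - 1))

print(round(grand, 3))                         # close to the population mean, 1.0
print(round(se, 3), round(1 / sqrt(n), 3))     # observed vs theoretical sigma/sqrt(n)
```

The observed standard error agrees with σ/√n = 0.447, and a histogram of the 50,000 means (not drawn here) is already much more symmetric than the parent exponential distribution.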

6.6 The χ² Distribution

Another important sampling distribution with many practical applications arises as follows. Suppose k independent values

u1, u2, ..., uk

are chosen at random from a standard normal distribution. Denote their sum of squares by χk², i.e.,*

χk² = u1² + u2² + ... + uk².   (6.20)

If this procedure is repeated many times, a sequence of χ² values will be generated that will vary among themselves and hence will have a distribution. This is the sampling distribution of χ², or simply the χ² distribution.

The number k of independent values of u forming χ² is called the number of degrees of freedom, and the shape of the distribution depends on this number. (This explains why the suffix k has been added to χ² in Equation (6.20).) In general, the distribution is positively skewed, as shown in Fig. 6.2, though for large values of k the distribution does tend towards the normal. Note that, because χ² is a sum of squares, its value can never be less than zero. It can be shown that the mean and variance of the χk² distribution are k and 2k, respectively.

For applications of χ², we have to deal with areas under the distribution, which, of course, represent probabilities. Table A3 in the Appendix gives values of χ²(k,α) such that the area in the right-hand tail beyond χ²(k,α) is α, i.e.,

P(χk² > χ²(k,α)) = α.

The use of the tables will be described later.

The χ² distribution allows us to find the distribution of the variance of samples selected from a normal population with variance σ², i.e., the sampling distribution of the variance. We have defined the variance of a sample of size n as

s² = Σ(i=1..n) (xi - x̄)²/(n-1),

so that

(n-1)s²/σ² = Σ(i=1..n) (xi - x̄)²/σ².   (6.21)

Now, remembering that a standard normal variable u is obtained by using the transformation

u = (x - μ)/σ,

we see that the terms on the right-hand side are similar to this, with the unknown population mean μ replaced by the mean x̄ calculated from the sample. Hence the right-hand side of Equation (6.21) can be regarded as a sum of squares of standard normal variables and would therefore behave like χn². However, the terms are not all independent, since they are connected by the fact that

x̄ = (x1 + x2 + ... + xn)/n.

This simply reduces the number of independent terms, and hence the degrees of freedom, by one, so that k = n-1. Consequently, we have shown that the quantity (n-1)s²/σ² will have the same distribution as χ²(n-1), a result that has a number of important applications.

* 'χ²' is usually written 'chi-square' and pronounced 'ki-square'.

Fig. 6.2  A typical χ² distribution.

PROBLEMS FOR CHAPTER 6

1. The variable z is a function of the independent random variables x1 and x2 of the form ..., where (μ1, σ1²) are the mean and variance of x1, and (μ2, σ2²) are the mean and variance of x2.

2. A part of a garment is made by joining two pieces of fabric, A and B, end to end. The specified length of the completed part is 60 cm, with a tolerance of ±2 cm.
(a) It is known that the lengths of the A pieces are normally distributed with mean 20 cm and standard deviation 0.5 cm, while the lengths of the B pieces are normally distributed with mean 40 cm and standard deviation 1 cm. What proportion of the completed parts will meet the specified tolerances?
(b) If, by imposing additional quality control, the standard deviation of the B pieces can be reduced to c cm, what value must c have if it is required that 99% of the completed parts must lie within the tolerances?
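The result that (n-1)s²/σ² behaves like χ²(n-1) can be checked by simulation: computed from repeated Normal samples, the statistic should have mean k = n-1 and variance 2k. A sketch in Python, with an assumed population (mean 10, σ = 2) and sample size 6:

```python
import random
from math import sqrt

random.seed(3)
n, sigma, repeats = 6, 2.0, 40_000
k = n - 1  # degrees of freedom

def chisq_stat():
    """Draw one Normal sample of size n and return (n-1)s^2/sigma^2."""
    xs = [random.gauss(10.0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return (n - 1) * s2 / sigma ** 2

vals = [chisq_stat() for _ in range(repeats)]
mean = sum(vals) / repeats
var = sum((v - mean) ** 2 for v in vals) / (repeats - 1)

print(round(mean, 2), round(var, 2))  # close to k = 5 and 2k = 10
```

The simulated mean and variance come out close to 5 and 10, as the χ²₅ distribution requires, and all the simulated values are non-negative.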

7.1 Introduction

Sufficient foundations have now been laid for us to consider one of the basic problems of statistics, that of estimating the values of the parameters describing the population with which we are dealing. Before any calculations can be made, however, it has to be decided what type of distribution is concerned, and the discussion of the standard distributions in Chapter 5 should assist in this. The three distributions we shall mostly be concerned with, namely, the binomial, the Poisson, and the normal, were described there. It was pointed out that the binomial is completely defined when the sample size, n, and the probability of success, p, are known; the Poisson is defined by its mean μ = m; and the normal distribution is completely known when its mean μ and its standard deviation σ are given. These, therefore, are the population parameters that it is important to be able to estimate, and this chapter considers methods for doing so.

7.1.1 Point Estimates

Two types of estimate can be distinguished. One type uses a single number to estimate the population parameter and is therefore known as a point estimator. For example, suppose a sample of size n has been drawn from a normal population. Then the sample mean

x̄ = (x1 + x2 + ... + xn)/n

is an estimate of the population mean, μ. Similarly, the sample variance,

s² = Σ(xi - x̄)²/(n-1),   (7.1)

is a point estimate of the population variance, σ². The calculation of such point estimates is an obvious first step in the process of estimation, but we have to consider how accurate and how precise they are.

7.1.2 Accuracy of a Point Estimate

The concepts of accuracy and precision can be related to the properties of the sampling distributions of the point estimates. As seen in the last chapter, these describe how estimators like sample averages and sample variances vary as different samples are selected at random from the population.

An estimator is said to be accurate if the mean of its sampling distribution is equal to the value of the unknown population parameter being estimated. Thus, because the mean of the sampling distribution of x̄ is equal to the population mean μ, the estimator x̄ is said to be accurate or unbiassed.

On the other hand, if the mean of the sampling distribution of the estimator is not equal to the population parameter, then the estimator is biassed. It is usually desirable to work with unbiassed estimators, and this explains why the divisor (n-1) is used in defining the sample variance s² in Equation (7.1) rather than the more obvious n. In the last chapter, it was shown that the quantity (n-1)s²/σ² is distributed like χ²(n-1). Now the mean of a χ² distribution is equal to its number of degrees of freedom, in this case (n-1); consequently, the mean of the sampling distribution of s² is

(n-1)σ²/(n-1) = σ².

Therefore s², as defined by Equation (7.1), is an unbiassed estimator of σ².

If the divisor n had been used, the mean of its sampling distribution would have been equal to

((n-1)/n)σ²,

and the estimator would have been biassed.

7.1.3 Precision of a Point Estimate

The precision of an estimator can be measured by the standard deviation of its sampling distribution, because this measures how closely estimates from repeated samples will fall about the population parameter. Obviously, the smaller the standard deviation, the more precise is the estimate. As an example of this, we have seen that the standard deviation of the sampling distribution of x̄ (i.e., the standard error of x̄) is equal to σ/√n. This becomes smaller as the sample size is increased; therefore, large samples give more precise estimates of the population mean than do small ones. It would be very surprising if they did not, though it must be noted that the precision only increases as the square root of the sample size, i.e., the sample size has to increase by a factor of four in order to double the precision of the estimate.

7.1.4 Interval Estimation

Since it is apparently a good idea to know how precise, or accurate, our estimates of population parameters are, the concept of interval estimates has been developed, and it is with these that we shall be concerned in this chapter. Such intervals are bounded by two limits between which we confidently expect the population parameter to lie; these limits are therefore called confidence limits.

7.2 Confidence Limits for μ, Large Sample Available

Suppose we are dealing with a population whose mean μ is unknown and is to be estimated. Let the variance of the population be σ². In practice, the value of σ² will also usually be unknown, but for the moment we suppose that it can be found. Imagine that a random sample is drawn from the population and that the sample mean is x̄0 (the suffix 0 is introduced to emphasize that x̄0 is the sample average actually observed in the experiment). Then x̄0 is an unbiassed point estimate of μ. Of course, it is very improbable that x̄0 is actually equal to μ and, even if it were, we should not know this because the value of μ is unknown. It would therefore be very imprudent to claim that μ is now known to us exactly. The best we can hope to do is to use the sample data to produce two limits between which it is expected that μ lies; these are confidence limits for μ.

To find them, we begin by considering the sampling distribution of the mean. According to the results of Section 6.4, this distribution has a mean μ and a standard deviation σ/√n. Furthermore, it tends to be a normal distribution, as shown in Fig. 7.1. Consider two limits, equidistant from the mean, which are such that the tail areas of the distribution are both equal to α. Because the distribution is normal, the distance of these limits from the mean is uα σ/√n, where uα is the standard normal deviate corresponding to a tail area of α. The value of uα can be found from Table A2 when the numerical value of α is fixed.

The tail areas outside the two limits being equal to α, the area between the limits is 1-2α, and this is therefore the probability that, when a sample is drawn, its average x̄0 will lie between the two limits, i.e.,

P(μ - uα σ/√n < x̄ < μ + uα σ/√n) = 1 - 2α.   (7.2)

This statement can be rearranged to give the limits

x̄0 ± uα σ/√n,   (7.3)

and the probability is 1-2α that the unknown μ lies between them. If 1-2α is chosen to be fairly large, say, 0.95 or 0.99, then there is a high probability that μ lies between the two limits, and we would therefore have considerable confidence that it does so. The commonly used values for (1-2α) are those given above, and the corresponding limits are then called 95% confidence limits and 99% confidence limits, respectively.

The limits have been derived on the assumption that the value of σ is known; in practice, this situation very rarely occurs. However, if the sample size is reasonably large, say, greater than 30, it is good enough to estimate the limits by replacing σ in Expression (7.3) by the sample value, s. The confidence limits then become

x̄0 ± uα s/√n.   (7.4)

Example 7.1

Consider the data of Table 2.3, which are the results of 192 count tests (in tex units) on a large delivery of worsted yarn. It was shown in Section 3.6, Table 3.3, that for these data

x̄0 = 31.13, s = 0.79.

We shall calculate 95% and 99% confidence limits for the mean count of the delivery.

(a) 95% Confidence Limits
In this case, we have to put 1-2α = 0.95, giving α = 0.025. From Table A2, we find that uα = 1.96, and Expression (7.4) then gives for the 95% confidence limits

31.13 ± 1.96 × 0.79/√192
= 31.13 ± 0.11.

(b) 99% Confidence Limits
To find these limits, we put 1-2α = 0.99, giving α = 0.005. Table A2 then gives u0.005 = 2.58, and the 99% confidence limits are

31.13 ± 2.58 × 0.79/√192
= 31.13 ± 0.15.
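The arithmetic of Example 7.1 is easy to reproduce. A sketch in Python, using the standard library's NormalDist in place of Table A2:

```python
from math import sqrt
from statistics import NormalDist

xbar0, s, n = 31.13, 0.79, 192  # summary values from Example 7.1

limits = {}
for conf in (0.95, 0.99):
    alpha = (1 - conf) / 2
    u = NormalDist().inv_cdf(1 - alpha)  # u_alpha: 1.96 or 2.58
    limits[conf] = u * s / sqrt(n)       # half-width of Expression (7.4)

print(f"95%: {xbar0:.2f} +/- {limits[0.95]:.2f}")  # 31.13 +/- 0.11
print(f"99%: {xbar0:.2f} +/- {limits[0.99]:.2f}")  # 31.13 +/- 0.15
```

Both half-widths agree with the values obtained by hand from Table A2.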
7.3 The Interpretation of Confidence Intervals

The way in which confidence limits and intervals are to be interpreted must be clearly understood. Suppose we have carried out an experiment and have calculated 95% confidence limits for the population mean μ. If we now assert that μ lies between the limits, then we are either right or we are wrong; we do not know which, because the exact value of μ is unknown to us. Now the limits are essentially based on the probability statement made in Equation (7.2), and we have seen in Chapter 4 that probabilities are to be interpreted as the relative frequencies with which events occur in repeated trials or experiments. Thus we must imagine our experiment as one of a series of similar experiments that we have done, and will do. Then, if we always assert that μ lies within the 95% confidence interval, in the long run we shall be right in 95% of the experiments, and wrong in the other 5%.

A similar interpretation can be given for 99% confidence limits, and for confidence limits for parameters other than the mean.
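This long-run interpretation can be demonstrated directly by simulating many repeated experiments. A Python sketch with an assumed population (μ = 50, σ = 3, chosen only for illustration) and σ treated as known:

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(4)
mu, sigma, n, experiments = 50.0, 3.0, 40, 10_000
u = NormalDist().inv_cdf(0.975)  # 1.96 for 95% limits

covered = 0
for _ in range(experiments):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    half = u * sigma / sqrt(n)
    # Does the asserted interval actually contain the true mean this time?
    if xbar - half <= mu <= xbar + half:
        covered += 1

print(covered / experiments)  # long-run proportion close to 0.95
```

Each individual interval either contains μ or it does not; what the simulation shows is that the assertion 'μ lies in the interval' is right in very nearly 95% of the repeated experiments.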

7.4 Confidence Limits for μ, Small Sample Available

In Section 7.2, it was shown that confidence limits for μ are given by

x̄0 ± uα σ/√n.   (7.5)
Estimation

In order to calculate the limits, therefore, it is necessary to have a value for σ. In general, the only available estimate of σ is the sample standard deviation, s, calculated from the data in the sample, and to make any progress at all σ must be replaced by s in Expression (7.5). Now, generally speaking, the larger the sample size, the more precise an estimator is, i.e., the closer sample estimates tend to cluster round the population value. Thus, in large samples, the estimate s should be reasonably close to σ, and replacing σ by s in Expression (7.5) should lead to very little error.

But, if the sample size is small (say, less than 30), the sample estimate s could be quite imprecise, and the substitution could therefore introduce considerable error. It can be shown that, provided that the population is normal, this error can be compensated for by also replacing uα by the quantity tk,α, whose values are tabulated in Table A4 in the Appendix. The confidence limits for μ are then given by

x̄0 ± tk,α s/√n.   (7.6)

The values of t depend upon α, which in turn depends on the level of confidence we require, and on the sample size through the number of degrees of freedom (k) associated with t. In the present application,

k = n - 1.   (7.7)

It was emphasized above that replacing uα by tk,α is valid when the population is normal. In practice, it is found that the calculation is relatively insensitive to departures from normality, so that the technique can be applied to most practical problems.

Example 7.2

To estimate the average moisture content in a large delivery of yarn, five specimens of yarn were selected at random from various parts of the consignment, and the percentage moisture content of each was found, with the following results:

7.2, 8.1, 7.6, 7.4, 7.8.

Calculate 95% confidence limits for the mean moisture content of the consignment.

In order to calculate the confidence limits from Expression (7.6), it is necessary first to find the sample mean x̄0 and standard deviation s. The procedure described in Section 3.5 (Table 3.2) is appropriate, and the calculation for this example is shown in Table 7.1.

Table 7.1: Calculation of 95% Confidence Limits
x = percentage moisture content; y = 10(x - 7)

x      y      y²
7.2    2      4
8.1    11     121
7.6    6      36
7.4    4      16
7.8    8      64
       Σy = 31   Σy² = 241

ȳ = 31/5 = 6.2
Σy² - (Σy)²/n = 241 - 31²/5 = 48.8
sy² = Σ(y - ȳ)²/(n-1) = 48.8/4 = 12.2
sy = √12.2 = 3.49
Since y = 10(x - 7),
x̄0 = 7 + ȳ/10 = 7.62%
sx = sy/10 = 0.349%

It is found that

x̄0 = 7.62%, s = 0.349%.

Also, since the sample size is n = 5, the number of degrees of freedom is, from Equation (7.7), k = 4, so that α = 0.025. With this value of α and k, Table A4 gives t4,0.025 = 2.78, and the 95% confidence limits are

7.62 ± 2.78 × 0.349/√5
= 7.62 ± 0.43%.

7.5 Choosing the Sample Size

Estimating the mean of a population in the way described above is one of the most commonly used procedures in experimental and production work. The factory manager will need estimates of the average level (length, weight, yield, etc.) of the production processes under his control, the research chemist will want to estimate the mean effect of a new treatment he has devised, and so on.

Depending on circumstances, the estimates required will be of greater or less precision. Quite often, for example, decisions will be taken on the basis of the estimates, and wrong decisions can cost money. Therefore, if losses are likely to be high if a wrong decision is made, a rather precise estimate would be
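Example 7.2 can be reproduced in a few lines. A Python sketch, with t(4, 0.025) = 2.78 hardcoded from Table A4, since the standard library provides no t quantiles:

```python
from math import sqrt
from statistics import mean, stdev

data = [7.2, 8.1, 7.6, 7.4, 7.8]  # percentage moisture contents
n = len(data)
t = 2.78                          # t_{4,0.025} from Table A4 (k = n-1 = 4)

xbar0 = mean(data)
s = stdev(data)                   # divisor n-1, as in Equation (7.1)
half = t * s / sqrt(n)            # half-width of Expression (7.6)

print(f"{xbar0:.2f} +/- {half:.2f}%")  # prints 7.62 +/- 0.43%
```

The coding trick y = 10(x - 7) in Table 7.1 was a hand-computation convenience; working directly on the raw data gives the same mean and standard deviation.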

required. On the nther hand, if the penalty is not ~esults to calculate a sample estimate s that will
likely to be great, an estimate of less p~ecisi0n replace a in the calculation of confidence limits.
would suffice. No"" in;:uitively, the precision of an Because s is a small sample estimate, Ua must also
estimate of any population parameter (as measured, lJe replaced by tk,a. as Section 7.4. The equation to
say, by the .'idth of its conf'dence interval) will lJe solved for !,_ then becomes
depend on the size of the sample on which the
estimate is based; the luger the sample size, the where tk.a has k ; n-l degrees of freedom. The
narrower the confidence interval would be expected to calculation of n then proceeds as described in the
be. The question 'What sample size is needed to give '-ollowinq ('xample.
the requir~d preci,ion7' therefore frequently arises.
The [1recio;ion .!Ctu,ll1y r['Cluin~d in .)11 individu,11 jamp 1e 7.4
case is not basically a problem for statistics; very probably, economic considerations must be taken into account, as was hinted above. But, once the desired precision has been decided on, some help can be given with the choice of sample size, particularly when the mean is being estimated.

The basic expression for calculating confidence limits for μ is given by Expression (7.5), i.e.,

x̄0 ± Uα σ/√n.

Now suppose that the precision of the estimate is required to be such that the confidence limits are

x̄0 ± E,

where E is a 'tolerance' fixed by the experimenter. Comparing Expressions (7.5) and (7.8), we see that n has to be chosen so that

E = Uα σ/√n,

giving

n = (Uα σ/E)².     (7.9)

The same difficulty as before now arises, namely, that this formula requires a knowledge of the value of σ before n can be found, and there are two common situations to consider. One is when previous estimates of a similar kind have been made; in this case there may be prior knowledge about the value of σ, obtained from previous tests, which can be used in Equation (7.9).

Example 7.3

Suppose a manufacturer regularly tests received consignments of yarn to check the average count or linear density (in tex). Experience has shown that standard count tests on specimens chosen at random from a delivery of a certain type of yarn usually have a coefficient of variation of 3%. A nominal 36-tex yarn is to be tested. How many tests are required to make the 95% confidence limits equal to ±0.5 tex?

Since 95% confidence limits are involved, α = 0.025 and Uα = 1.96. Also, because the coefficient of variation is 3%, we find

100σ/μ = 3, or σ = 1.08,

on putting μ = 36, since a 36-tex yarn is to be tested. The required tolerance is E = 0.5, and Equation (7.9) then gives

n = (1.96 × 1.08/0.5)² = 17.9,

so that 18 tests are required. ***

The other case to be considered arises when no prior knowledge about the value of σ is available. The only possible procedure in these circumstances is to test a preliminary small sample and use the resulting estimate s in place of σ.

Example 7.4

Suppose the results of Example 7.2 are regarded as the preliminary sample and that it is required to estimate the average moisture content to a precision such that the 99% confidence limits are ±0.4%. From the data of Example 7.2, we find s = 0.349%. With this value of s, the 99% confidence limits (α = 0.005) for μ are given by Expression (7.6) and are

x̄0 ± t_{k,0.005} × 0.349/√n, where k = n−1.

We have to choose n such that the limits are x̄0 ± 0.4, i.e., such that

0.349 t_{n−1,0.005}/√n ≤ 0.4.     (7.10)

This is an equation for n, which can only be solved numerically by trying different values of n. The procedure is best carried out systematically, as shown in Table 7.2. The first column shows a succession of values for n, and the corresponding values of t, read from Table A4 in the Appendix, are shown in the second column. The final column shows the values of the left-hand side of Equation (7.10). The calculation is repeated for increasing values of n until the value in the third column is reduced to 0.4 or less.

Table 7.2

n    t_{n−1,0.005}    0.349 t_{n−1,0.005}/√n
5        4.60                 0.72
6        4.03                 0.57
7        3.71                 0.49
8        3.50                 0.43
9        3.36                 0.39

In the example, this happens when n = 9. Consequently, a total sample size of 9 is suggested, i.e., a further four tests of moisture content are required to fulfil the requirements. ***

7.6 Confidence Limits for the Difference between Means

It is often necessary to estimate the difference between the means of two populations. An example of this occurs when a modification is made to a process; a measure of the improvement brought about by the change is then the difference between the mean levels of the original and the modified processes.

Let the two populations being compared have unknown means μ1, μ2 and variances σ1², σ2². Suppose that samples of sizes n1 and n2 (not necessarily equal) have been chosen at random from each population, respectively, and have provided sample estimates x̄1, s1² and x̄2, s2².
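The stepwise search of Example 7.4 and Table 7.2 is mechanical and easy to automate. Below is a minimal sketch; the t-values are hard-coded from Table 7.2 (Table A4 in the Appendix) rather than computed, so no statistics library is needed.

```python
import math

# t_{n-1, 0.005} values copied from Table 7.2 (two-sided 99% limits)
T_005 = {5: 4.60, 6: 4.03, 7: 3.71, 8: 3.50, 9: 3.36}

def smallest_n(s, tol, t_table):
    """Smallest tabulated n with t_{n-1,0.005} * s / sqrt(n) <= tol."""
    for n in sorted(t_table):
        half_width = t_table[n] * s / math.sqrt(n)
        if half_width <= tol:
            return n, half_width
    raise ValueError("tolerance not reached in tabulated range")

n, hw = smallest_n(s=0.349, tol=0.4, t_table=T_005)
print(n, round(hw, 2))   # matches Table 7.2: n = 9, half-width 0.39
```

The same loop solves any sample-size problem of this iterative kind; only the table of t-values and the tolerance change.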

7. Estimation

An obvious point estimate for the difference between the population means, μ1−μ2, is the difference x̄1 − x̄2 between the sample means. This provides a starting point for the derivation of confidence limits for μ1−μ2. When finding confidence limits for a single mean, we began with the sampling distribution of its point estimate x̄. Following the same line of argument, we now consider the sampling distribution of x̄1−x̄2, i.e., the distribution of values that would be obtained if the sampling mentioned above was repeated many times, the difference x̄1−x̄2 being calculated at each repeat.

There is enough information in Chapter 6 to deduce the properties of this distribution. First, we know from the central-limit theorem that distributions of sample averages tend to be normal; hence the sampling distribution of x̄1−x̄2 will also tend to be normal. Also, from Section 6.4, we have the means and variances of x̄1 and x̄2. Thus, by using Equations (6.10) and (6.11), the mean and variance of x̄1−x̄2 are

μ_{x̄1−x̄2} = μ_{x̄1} − μ_{x̄2} = μ1 − μ2,

σ²_{x̄1−x̄2} = σ²_{x̄1} + σ²_{x̄2} = σ1²/n1 + σ2²/n2.

We have therefore shown that the sampling distribution of x̄1−x̄2 is as shown in Fig. 7.2. The derivation of the confidence limits for μ1−μ2 is now exactly as before. Two limits can be defined such that the tail areas outside them are both equal to α, and these limits will be at Uα standard deviations from the mean. Hence the probability that the observed difference x̄1,0−x̄2,0 lies between the limits is 1−2α. Therefore the confidence limits for μ1−μ2 are given by

x̄1,0 − x̄2,0 ± Uα √(σ1²/n1 + σ2²/n2).

This expression presents the same problem as before, namely that to calculate the limits requires a knowledge of the values of the unknown σ1 and σ2. If large samples are available (n1 and n2 greater than 30), then σ1² and σ2² can simply be replaced by their sample estimates s1² and s2², and the confidence limits are given by

x̄1,0 − x̄2,0 ± Uα √(s1²/n1 + s2²/n2).     (7.11)

The more common situation, however, is when only small samples are available. In order to proceed with this case, two assumptions are made in the theory. One is that the two populations involved are both normal; the other is that they have the same variance, i.e., σ1² = σ2² (= σ²). Fortunately, it has been found that in practice there can be quite large departures from these assumptions before the calculated values of the confidence limits are seriously in error.

To do the calculation, the unknown σ² must, of course, be replaced by an estimate. But there are two estimates available, namely, s1² and s2², one from each sample. Since these are both estimates of the same quantity, they may as well be combined into a single estimate s², and it can be shown that s² is best calculated from

s² = [(n1−1)s1² + (n2−1)s2²] / (n1 + n2 − 2).

Also, since only small samples are involved, Uα must be replaced by t_{k,α} in Expression (7.11), where k = n1 + n2 − 2, to allow for a possible error in substituting s for σ. The confidence limits therefore become

x̄1,0 − x̄2,0 ± t_{k,α} s √(1/n1 + 1/n2).

Example 7.5

A modification was made to a filament-production process with the object of increasing the extension at break of the filaments. The results of percentage-extension tests on the original and modified filaments were as follows.

Unmodified filament: 14.1, 14.7, 15.1, 14.3, 15.6, 14.8
Modified filament: 16.9, 16.3, 15.9, 15.7, 15.7

It is required to calculate 95% confidence limits for the mean increase in breaking extension caused by the modification.

The necessary calculation is shown in full in Table 7.3 and indicates that we are 95% confident that the increase in breaking extension brought about by the modification is between 0.61% and 2.05%. If this confidence interval is too wide, i.e., the estimate of the mean difference is not sufficiently precise, it can only be reduced by increasing the sample sizes.

In general, it is best to make n1 = n2 (= n), so that the confidence limits become

x̄1,0 − x̄2,0 ± t_{2n−2,α} s √(2/n).     (7.15)
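The pooled-variance arithmetic of Example 7.5 can be checked with a short script. This is a sketch in plain Python; t_{9,0.025} = 2.26 is taken from Table A4 (9 = n1 + n2 − 2 degrees of freedom). The upper limit comes out as 2.06 rather than the quoted 2.05 only because the text rounds 1.33 ± 0.72 before subtracting.

```python
import math

modified   = [16.9, 16.3, 15.9, 15.7, 15.7]          # x1
unmodified = [14.1, 14.7, 15.1, 14.3, 15.6, 14.8]    # x2

def pooled_limits(x1, x2, t):
    """Confidence limits for mu1 - mu2 using the pooled variance s^2."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((v - m1) ** 2 for v in x1)      # sum of squared deviations
    ss2 = sum((v - m2) ** 2 for v in x2)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)          # pooled estimate of sigma^2
    half = t * math.sqrt(s2) * math.sqrt(1 / n1 + 1 / n2)
    d = m1 - m2
    return d, d - half, d + half

diff, lo, hi = pooled_limits(modified, unmodified, t=2.26)
print(round(diff, 2), round(lo, 2), round(hi, 2))
```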

Table 7.3

x1 = breaking extension of modified filaments;
x2 = breaking extension of unmodified filaments.

x1: 16.9, 16.3, 15.9, 15.7, 15.7
x2: 14.1, 14.7, 15.1, 14.3, 15.6, 14.8

n1 = 5, n2 = 6.

Σx1 = 80.5; x̄1 = Σx1/n1 = 80.5/5 = 16.10.
Σx2 = 88.6; x̄2 = Σx2/n2 = 88.6/6 = 14.77.

Σ(x1−x̄1)² = Σx1² − (Σx1)²/n1 = 1297.09 − (80.5)²/5 = 1.040;
s1² = Σ(x1−x̄1)²/(n1−1) = 1.040/4 = 0.2600.

Σ(x2−x̄2)² = Σx2² − (Σx2)²/n2 = 1309.80 − (88.6)²/6 = 1.473;
s2² = Σ(x2−x̄2)²/(n2−1) = 1.473/5 = 0.2946.

s² = [(n1−1)s1² + (n2−1)s2²]/(n1+n2−2)
   = (4 × 0.2600 + 5 × 0.2946)/(5 + 6 − 2) = 2.513/9 = 0.279;
s = 0.528.

For 95% confidence limits, t_{9,0.025} = 2.26, and the limits are

(16.10 − 14.77) ± 2.26 × 0.528 × √(1/5 + 1/6) = 1.33 ± 0.72 = (0.61, 2.05).

Example 7.6

It is required to reduce the confidence interval of Example 7.5 to x̄1,0 − x̄2,0 ± 0.5. Assuming that n1 = n2 = n and putting s = 0.528, we find that Expression (7.15) is, for 95% confidence limits,

t_{2n−2,0.025} × 0.528 × √(2/n) ≤ 0.5,

an equation for n that can be solved in the manner of Example 7.4. Table 7.4 gives the step-by-step calculation of n and shows that the confidence interval should be reduced to the required precision if ten tests are performed on each type of filament, i.e., another four tests on the unmodified filament and another five on the modified filaments are needed.

Table 7.4

n     2n−2    t_{2n−2,0.025}    t_{2n−2,0.025} × 0.528 × √(2/n)
6      10          2.23                    0.68
7      12          2.18                    0.61
8      14          2.15                    0.57
9      16          2.12                    0.53
10     18          2.10                    0.50

7.7 Confidence Limits for Matched Pairs

The procedure described in the last section is appropriate when the sample chosen from the first population is independent of the sample chosen from the second. However, there are many experiments in which this is not the case; in fact, the experiment may deliberately have been designed so that the two samples are not independent. In such cases, a different procedure for finding confidence limits for μ1−μ2 must be adopted.

Example 7.7

An investigation was carried out to compare the average shrinkages obtained when knitted fabrics are steamed in a Hoffman press when the press is (a) locked, and (b) unlocked. To make the experiment reasonably general, a range of ten different fabrics was used. Each fabric was divided into two pieces; one of the pieces was then chosen at random and steamed in the unlocked press, the other piece being steamed with the press locked. The resulting area shrinkages (values of x1 and x2) are given at the top of Table 7.5.

Now here the samples are not independent. The results are in pairs, each pair referring to a different fabric, and these fabrics have different tendencies to shrink. For example, Fabric E shrinks much more than Fabric B, whether the press is locked or unlocked. In these circumstances, the data are said to be correlated, and, if the method of Section 7.6 were used to find confidence limits for μ1−μ2, the resulting limits would allow for this variation in tendency to shrink. But, in this experiment, this variation is of little interest (the fabrics were deliberately chosen to be different); what is important are the differences in shrinkage for each fabric. Thus, if y = x1−x2 is the difference in shrinkage for a fabric, what is required are confidence limits for μy, the average shrinkage difference, and the method of Section 7.4 applies.

The calculation of 95% confidence limits for μy is given in full in Table 7.5 and shows that the shrinkage in an unlocked press is greater than that in a locked press, the average increase probably lying between 0.12% and 0.80%. ***
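The matched-pairs interval of Example 7.7 can be reproduced directly from the ten shrinkage differences y = x1 − x2 of Table 7.5; as before, t_{9,0.025} = 2.26 is taken from Table A4.

```python
import math

# Shrinkage differences y = x1 - x2 for fabrics A-J (Table 7.5)
y = [-0.3, 0.9, 0.7, 0.5, 0.8, -0.4, 0.9, 0.5, 0.8, 0.2]

n = len(y)
mean_y = sum(y) / n
s2_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)    # s_y^2
t = 2.26                                              # t_{9, 0.025}, Table A4
half = t * math.sqrt(s2_y / n)
lo, hi = mean_y - half, mean_y + half
print(round(mean_y, 2), round(lo, 2), round(hi, 2))
```

Note that only the differences enter the calculation; the fabric-to-fabric variation in overall shrinkage never appears, which is exactly the point of the matched-pairs design.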

Table 7.5

x1 = area shrinkage in unlocked press;
x2 = area shrinkage in locked press.

Fabric          A      B      C      D      E
x1             0.2    1.4    2.7    1.0    4.5
x2             0.5    0.5    2.0    0.5    3.7
y = x1−x2     −0.3    0.9    0.7    0.5    0.8

Fabric          F      G      H      I      J
x1             0.5    3.8    1.8    3.1    1.2
x2             0.9    2.9    1.3    2.3    1.0
y = x1−x2     −0.4    0.9    0.5    0.8    0.2

n = 10; Σy = 4.6; ȳ = Σy/n = 4.6/10 = 0.46%.
Σy² = 4.18; Σ(y−ȳ)² = Σy² − (Σy)²/n = 4.18 − (4.6)²/10 = 2.064;
s_y² = Σ(y−ȳ)²/(n−1) = 2.064/9 = 0.2293.

For 95% confidence limits, t_{9,0.025} = 2.26. The 95% confidence limits are

0.46 ± 2.26 √(0.2293/10) = 0.46 ± 0.34 = (0.12, 0.80).

7.8 Confidence Limits for σ² and for σ

In the last few sections, methods for calculating confidence limits related to mean values have been dealt with. While it is often very important to have a good estimate of the mean of a population, and to know how precise it is, there are situations in which we are just as concerned about the variation exhibited by a population. For example, a spinning process can be producing yarn of exactly the correct linear density, but the yarn will be quite useless if its regularity, i.e., the variation of linear density along its length, is poor. Since variation is most frequently measured by the variance or by its equivalent, the standard deviation, it is desirable to be able to estimate these quantities with some confidence.

The derivation of confidence limits for any population parameter always follows the same lines. We have seen that s² is an unbiased point estimate of the population variance σ². We therefore begin by considering the sampling distribution of s² in samples of size n, which for normal populations was shown to be related to the χ² distribution in Section 6.5. In fact, (n−1)s²/σ² has a sampling distribution that is identical with that of χ², with k = n−1 degrees of freedom. This distribution is shown in Fig. 7.3.

Just as in dealing with averages, two limits can be defined, which are such that the tail areas outside them are equal to α, as shown, and these limits can be found from Table A3 in the Appendix. The values in this table are such that the area to the right of a tabulated value is equal to the probability at the head of the column in which the value appears. Hence the upper limit in Fig. 7.3 is χ²_{n−1,α}, since the area to the right of it is equal to α. As for the lower limit, the area to its left, shown shaded, is α, so that the area to its right must be 1−α. Consequently, the lower limit is χ²_{n−1,1−α}.

Now suppose that a single sample is drawn from the population and that the variance calculated from this sample is s0². Then the probability that (n−1)s0²/σ² lies between the two limits defined above is equal to the unshaded area between them, i.e., to 1−2α. Hence

Pr{χ²_{n−1,1−α} < (n−1)s0²/σ² < χ²_{n−1,α}} = 1−2α,     (7.16)

and, if (n−1)s0²/σ² < χ²_{n−1,α}, then (n−1)s0²/χ²_{n−1,α} < σ²; similarly, the other inequality in (7.16) is equivalent to σ² < (n−1)s0²/χ²_{n−1,1−α}. Hence Equation (7.16) becomes

Pr{(n−1)s0²/χ²_{n−1,α} < σ² < (n−1)s0²/χ²_{n−1,1−α}} = 1−2α,

so the confidence limits for σ² are

(n−1)s0²/χ²_{n−1,α} and (n−1)s0²/χ²_{n−1,1−α}.     (7.17)

Taking square roots, the confidence limits for the standard deviation σ are therefore

s0 √((n−1)/χ²_{n−1,α}) and s0 √((n−1)/χ²_{n−1,1−α}).

A word of warning is necessary. These limits have been derived by assuming that the original population is normal. Departures from this assumption can lead to serious errors in the confidence limits for variances and standard deviations.

Example 7.8

In Example 7.2, the results of five determinations of percentage moisture content on a consignment of yarn were given. Their standard deviation was found to be 0.349%. It is required to find 95% confidence limits for the standard deviation of moisture content in the delivery.

In this case, the sample size n = 5 and s0 = 0.349. To find 95% confidence limits, we set 1−2α = 0.95, giving α = 0.025. We therefore find the values of χ²_{4,0.025} and χ²_{4,0.975} from Table A3, the latter being 0.484. The 95% confidence limits for σ² are therefore, from Expressions (7.17),

4 × 0.349²/χ²_{4,0.025} and 4 × 0.349²/0.484,

and the square roots of these give the limits for σ. ***

7.9 Confidence Limits for the Ratio of Two Variances

In Section 7.6, we compared the mean values of two populations by computing confidence limits for their difference. It turns out that, in order to compare the variabilities of the two populations, it is best to consider confidence limits for the ratio of their variances.

Let the two populations be normal, with variances σ1² and σ2², and suppose that independent random samples of sizes n1 and n2 have been chosen, leading to sample variances s1² and s2². If we imagine this sampling procedure being repeated many times, it can be shown that the quantity

(s1²/σ1²)/(s2²/σ2²)

will vary from sample to sample and will have a distribution called the F-distribution. There are infinitely many F-distributions, depending on the sample sizes involved, and the dependence is through two values for degrees of freedom, k1 and k2, where k1 = n1−1 and k2 = n2−1. A typical F-distribution is shown in Fig. 7.4.

Table A5 gives values of F_{k1,k2,α}, which are such that the area under the distribution to the right of this limit is equal to α, as shown in Fig. 7.4. Also shown in this diagram is the limit of a left-hand tail area equal to α. The area to the right of this limit is therefore 1−α, and the value of the limit is thus denoted by F_{k1,k2,1−α}. However, it can be shown that

F_{k1,k2,1−α} = 1/F_{k2,k1,α}

(note the reversal of k1 and k2); hence it is only necessary to tabulate values of F for right-hand tail areas.
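Returning for a moment to Example 7.8, the limits for σ can be evaluated in a few lines. In this sketch the two χ² percentage points are hard-coded: 0.484 is the value given in the text, and 11.14 is the standard tabulated value of χ²_{4,0.025}.

```python
import math

n, s0 = 5, 0.349     # Example 7.8: five moisture determinations, s0 = 0.349%
chi2_hi = 11.14      # chi^2_{4, 0.025}, standard tabulated value
chi2_lo = 0.484      # chi^2_{4, 0.975}, given in the text

var_lo = (n - 1) * s0 ** 2 / chi2_hi     # lower limit for sigma^2
var_hi = (n - 1) * s0 ** 2 / chi2_lo     # upper limit for sigma^2
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)
print(round(sd_lo, 2), round(sd_hi, 2))
```

The interval is strikingly wide (roughly 0.21% to 1.00%), which illustrates the warning above: with only n = 5 observations, a standard deviation is estimated very imprecisely.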

This sampling distribution having been obtained, the derivation of confidence limits for the ratio σ1²/σ2² parallels the previous arguments. If s1,0² and s2,0² are the sample variances in the single experiment actually performed, then the probability that (s1,0²/σ1²)/(s2,0²/σ2²) lies within the limits shown in Fig. 7.4 is

Pr{F_{k1,k2,1−α} < (s1,0²/σ1²)/(s2,0²/σ2²) < F_{k1,k2,α}} = 1−2α.

Rearranging the inequalities, we obtain the equivalent probability statement, from which the confidence limits for σ1²/σ2² are

s1,0²/(s2,0² F_{k1,k2,α}) and (s1,0²/s2,0²) F_{k2,k1,α}.     (7.19)

Again it must be emphasized that these limits apply only if the distributions are normal.

Example 7.9

A retailer buys garments of the same style from two manufacturers and suspects that the variation in the masses of the garments produced by the two makers is different. A sample of size 20 was therefore chosen from a batch of garments produced by the first manufacturer and weighed. The resulting sample variance was s1,0² = 25.0 grams². A sample of size 25 was chosen from a consignment sent by the second manufacturer, the sample variance being s2,0² = 14.1 grams².

We shall compute 95% confidence limits for the ratio σ1²/σ2². In this case, n1 = 20, n2 = 25, so that k1 = 19, k2 = 24, and we shall require F_{19,24,0.025} and F_{24,19,0.025}. Entering Table A5, we find that values of F for 19 degrees of freedom are not tabulated. However, rough interpolation in the table is sufficiently good, and we then find that F_{19,24,0.025} = 2.05 and F_{24,19,0.025} = 2.12. Thus, on substituting these values in Expressions (7.19), the 95% confidence limits for σ1²/σ2² are

25.0/(14.1 × 2.05) = 0.86 and (25.0/14.1) × 2.12 = 3.76.

We are therefore 95% confident that the 'true' or population ratio of variances lies between the values 0.86 and 3.76. Since this range includes the value 1.0, which would be attained if σ1² = σ2², the present data are insufficient to conclude that one manufacturer's product is more variable than the other's. ***

7.10 Confidence Limits for a Proportion

A situation that frequently arises is when the quantity to be estimated is a proportion, or a percentage. Examples are the proportion of defective articles produced by a manufacturing process and the proportion of black fibres in a large consignment of mixed black and white fibres. To estimate such parameters, a sample of size n would be selected at random (articles or fibres in the above examples) and the number of 'successes' (defectives, black fibres), x0, counted. A point estimate of the proportion of successes in the population is then

p0 = x0/n.

However, it is usually desirable to calculate limits between which we confidently expect the population or 'true' proportion p′ to lie.

Following the general procedure for finding confidence limits, we begin by considering the sampling distribution of the point estimate. Now the circumstances in which the sample was selected are exactly those in which the binomial distribution applies (see Section 5.4), i.e., selecting a member of the sample can be regarded as an independent trial, and the probability of a 'success' at any trial is constant and equal to the proportion p′ of successes in the population. Hence the distribution of the number of successes x in a sample of size n is binomial, with mean and standard deviation given by Equations (5.9) and (5.10), i.e.,

μx = np′     (7.21)

and

σx = √(np′(1−p′)).     (7.22)

Further, provided that n is large enough, we saw in Section 5.7 that the binomial is close to the normal distribution. Consequently, the distribution of x is as shown in Fig. 7.5.

Fig. 7.5 Distribution of the number of successes, x

As usual, two limits can be defined, which are such that the tail areas outside them are both equal to α. Hence the probability that the observed number of successes x0 lies between the limits is

Pr{np′ − uα√(np′(1−p′)) < x0 < np′ + uα√(np′(1−p′))} = 1−2α,

where uα is the standard normal deviate corresponding to a tail area α. Now the first inequality, by using Equations (7.21) and (7.22), is

p′ − uα√(p′(1−p′)/n) < x0/n,

and similarly for the second. Hence the above probability statement is equivalent to

Pr{x0/n − uα√(p′(1−p′)/n) < p′ < x0/n + uα√(p′(1−p′)/n)} = 1−2α,

which shows that the confidence limits for the population proportion p′ are

x0/n ± uα√(p′(1−p′)/n).

A difficulty remains, in that this expression contains the unknown population proportion p′. It is sufficiently accurate in most practical applications to replace it by the observed proportion p0 = x0/n, so that the confidence limits become

p0 ± uα√(p0(1−p0)/n).     (7.23)

It must be emphasized that these limits have been derived by assuming that the binomial distribution can be approximated to by the normal. Thus the sample size used must be sufficiently large; according to Section 5.7, we must have

n > 9 max(p0/(1−p0), (1−p0)/p0).     (7.24)

Example 7.10

A useful technique that is sometimes used to estimate the working efficiency of a machine (or an operative) is the ratio-delay or snap-reading method. The machine is observed on many randomly chosen occasions, and whether or not the machine is working at the instant of observation is recorded. The ratio of the number of occasions on which the machine was working to the total number of observations is then a measure of the machine efficiency. This method has the advantage that the machine's operative is not observed continuously and is therefore likely to work in his or her normal manner.

In one application, a machine was observed 1000 times and was found to be working on 824 occasions. In the notation of this section, we therefore have

p0 = 824/1000 = 0.824, or 82.4%.

However, it is always useful to know how precise an estimate is, so we proceed to calculate 95% confidence limits for the machine efficiency by using Expression (7.23). Before this is done, though, it is necessary to check that Condition (7.24) is satisfied, i.e., that the basic binomial distribution can be approximated to by the normal. Putting p0 = 0.824 in Condition (7.24) shows that the limits can be calculated by using Expression (7.23) if

n > 9 × 0.824/0.176 = 42.1.

Our sample size of 1000 is far greater than this; therefore Expression (7.23) can be used to calculate confidence limits.

To find 95% confidence limits, we put 1−2α = 0.95, or α = 0.025, and find that uα = 1.96. Expression (7.23) then gives for the 95% confidence limits

0.824 ± 1.96 √(0.824 × 0.176/1000).

7.10.1 How Many Observations are Needed?

Before embarking on an investigation like that described in Example 7.10, it is necessary to know how many observations should be made. How many are necessary depends on (i) the precision required in the estimate, and (ii) the likely value of p0. In many situations, there is some prior knowledge about the probable value of p0, and the precision required is a matter for the investigator to decide.
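Before continuing with the sample-size question, the interval of Example 7.10 can be reproduced in a few lines; the check of Condition (7.24) is included as an assertion.

```python
import math

x0, n = 824, 1000    # machine observed working 824 times out of 1000
p0 = x0 / n          # point estimate of machine efficiency
u = 1.96             # u_{0.025} for 95% limits

# Condition (7.24): the normal approximation to the binomial must hold
assert n > 9 * max(p0 / (1 - p0), (1 - p0) / p0)

half = u * math.sqrt(p0 * (1 - p0) / n)
lo, hi = p0 - half, p0 + half
print(round(lo, 3), round(hi, 3))
```

The machine efficiency is thus estimated to within about ±2.4 percentage points, roughly 80.0% to 84.8%.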

Suppose that, before the investigation of Example 7.10 was started, it was known from past experience that the machine efficiency was likely to be about 80% and that a confidence interval of ±2.5% would be sufficiently precise. Thus we have guessed that p0 = 0.80, and we require the 95% confidence limits to be p0 ± 0.025. Therefore, using Expression (7.23), we see that we require

1.96 √(0.8 × 0.2/n) < 0.025,

i.e.,

n > (1.96)² × 0.8 × 0.2/(0.025)².

It was therefore decided to make 1000 observations.

If there is no prior information about the possible value of p0, then a small preliminary experiment should be carried out to get a rough estimate of p0. The procedure for choosing the sample size is then exactly as described above.

7.11 Confidence Limits for the Difference between Two Proportions

Consider the following.

Example 7.11

The sewn joints joining the webbing components of a parachute harness must obviously have a strength greater than a specified safe minimum. A new method of sewing the joints had been developed with the object of reducing the number of joints that would fail a strength test. One hundred and twenty joints of the old type were tested, of which 11 failed. One hundred joints of the new type were also tested in the same way, and five failed. It is required to find 99% confidence limits for the difference in failure rate of the two types of joint.

This is an example of a general problem, which is to find confidence limits for the difference p1′−p2′, where p1′ and p2′ are the proportions of times a certain event occurs in two independent populations. Suppose the data available are the numbers of occurrences x1,0 and x2,0 in samples of sizes n1 and n2 drawn at random from the two populations. Thus, in the above example, x1,0 = 11, n1 = 120, x2,0 = 5, and n2 = 100.

Consequently, in order to find confidence limits for p1′−p2′, we need to consider the sampling distribution of the point estimates p1−p2, of which Equation (7.26) is the observed value in a particular experiment. In repeated sampling of the kind envisaged here, the number of occurrences of the event would be expected to vary according to a binomial distribution. Hence

μ_{p1} = p1′ and σ²_{p1} = p1′(1−p1′)/n1,

and similar results can be written down for p2. Therefore, using the results of Section 6.3, we have

μ_{p1−p2} = p1′ − p2′,

σ²_{p1−p2} = p1′(1−p1′)/n1 + p2′(1−p2′)/n2.

If the sample sizes are reasonably large, the binomial distributions can be approximated to by normal distributions, and then the sampling distribution of p1−p2 shown in Fig. 7.6 will tend to be normal. Following the standard procedure, two limits are defined as shown, which are such that the tail areas are both equal to α. Then the probability that the observed difference p1,0−p2,0 lies between the limits is 1−2α, i.e.,

Pr{(p1′−p2′) − uα σ_{p1−p2} < p1 − p2 < (p1′−p2′) + uα σ_{p1−p2}} = 1−2α.
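The text goes on to rearrange this statement into confidence limits; the arithmetic for Example 7.11 can be sketched now. The extract breaks off before the book's numerical answer, so the result below is simply the evaluation of the formulas above, taking the standard value u_{0.005} = 2.576 for 99% limits.

```python
import math

x1, n1 = 11, 120     # old joints: 11 failures in 120 tests
x2, n2 = 5, 100      # new joints: 5 failures in 100 tests
p1, p2 = x1 / n1, x2 / n2
u = 2.576            # u_{0.005} for 99% limits (standard normal deviate)

# standard deviation of the sampling distribution of p1 - p2,
# with the observed proportions substituted for p1', p2'
sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
lo, hi = diff - u * sd, diff + u * sd
print(round(diff, 3), round(lo, 3), round(hi, 3))
```

On these numbers the 99% interval straddles zero, so the observed reduction in failure rate, about 4.2 percentage points, could not on its own be claimed with 99% confidence.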


This probability statement can be rearranged, as in the last section, from which we deduce that the confidence limits for p1′ − p2′ are

(p10 − p20) ± uα √[p1′(1 − p1′)/n1 + p2′(1 − p2′)/n2].

This expression includes the unknown population values p1′ and p2′, which must be replaced by their sample estimates p10 and p20. The confidence limits are therefore finally written as

(p10 − p20) ± uα √[p10(1 − p10)/n1 + p20(1 − p20)/n2].

Example 7.11 (continued)
For Example 7.11, the data show p10 = 0.0917 and p20 = 0.05. Since it is required to find 99% confidence limits, we put 1 − 2α = 0.99, leading to α = 0.005. Table A2 then gives u0.005 = 2.58, and the confidence limits are

0.0917 − 0.05 ± 2.58 √[(0.0917 × 0.9083)/120 + (0.05 × 0.95)/100] = 0.042 ± 0.088.

There is therefore a very high probability (0.99) that the 'true' difference in failure rate between the two types of joint lies between −4.7% and 13.0%. Since this range includes zero, we should be inclined to feel that the modification had made little or no difference to the failure rate of the joints.

PROBLEMS FOR CHAPTER 7

1. A sample of 150 specimens was tested for carbon content. The mean of the test results was 4.41%, and their standard deviation was 0.18%. Calculate 95% and 99% confidence limits for the mean carbon content of the population.

2. In setting up a new pension scheme, a large company needs to know the average age of its employees. Records of 100 employees were selected at random, and the average age of the sample was found to be 36.2 years, with a standard deviation of 5.3 years. Calculate 95% confidence limits for the mean age of the company's employees.

3. One hundred pieces of a certain yarn were tested for strength, the mean value of the test results being 60 N and their standard deviation 3 N. Calculate 95% confidence limits for the 'true' mean strength of the yarn. If it was required to estimate the 'true' mean correct to ±½ N, how many tests should be performed?

4. To estimate the mean count of a delivery of yarn, five standard count tests were carried out, with the following results (tex). Calculate 95% confidence limits for the mean count. If it was required to estimate the mean count to within ±0.15 tex (with a 95% chance of being correct), how many tests would be needed?

5. Strength tests were carried out on a yarn, with the following results (strength in gf). Calculate 99% confidence limits for the mean strength of the yarn.

6. In an activity-sampling investigation, an operative is observed randomly, and, at the instant of observation, it is noted whether the operative is working or not. A particular operative was observed 100 times and was found to be working on 70 occasions. Calculate 95% confidence limits for the proportion of time he works. If it was required to estimate the proportion correct to an accuracy of ±0.02, how many observations should be made?

7. A random sample of 500 garments was chosen from a production line, and 25 of them were found to be 'seconds'. Calculate 98% confidence limits for the proportion of 'seconds' produced by the line.

8. Using the data of Question 5, calculate 95% confidence limits for the standard deviation of yarn strength.
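Returning to Example 7.11, the limits can be checked by machine. A minimal sketch (Python, standard library only) of the normal-approximation formula used in the text; the function name is illustrative:

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, u):
    """Confidence limits for p1' - p2' by the normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    half_width = u * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - half_width, d + half_width

# Example 7.11: 11 failures in 120 old joints, 5 in 100 new joints,
# 99% limits, so u_0.005 = 2.58.
lo, hi = diff_proportion_ci(11, 120, 5, 100, 2.58)
print(round(lo, 3), round(hi, 3))  # approximately -0.047 and 0.130
```

The printed limits agree with the hand calculation in the text.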

8.1 Introduction

In the previous chapter, one of the fundamental problems of statistical inference was considered, namely, that of estimating the values of the parameters defining a population distribution. The techniques discussed there are especially useful when there is no prior information or knowledge about the value of the parameter being estimated. However, there are many situations where such knowledge exists, either in the form of past experience or as a theory or hypothesis about the population values. In these cases, the primary interest may be whether past experience still applies, or whether the theory is a reasonable one; in other words, we are interested in possible departures from the status quo.

An example of this occurs when a modification is made to a manufacturing process. Suppose the process usually makes 4% defective articles (this is knowledge gained from past experience), and imagine that a change is made to the process with the object of reducing the proportion of defective articles produced. After the modification has been made, a random sample of 1000 articles is examined, and 34 of them are found to be defective, i.e., the proportion of defectives in the sample is only 3.4%. The question that now arises is 'Is this sufficient evidence to show that the modification has been successful?' Or, rather more precisely, 'Is this sufficient evidence to conclude that the population proportion of defectives is now less than 4%?'

As a second example, consider the following. A certain theory indicates that the average tenacity, Sy, of a continuous-filament yarn is related to the average tenacity, Sf, of the filaments and to the yarn twist by an equation of the type

Sy = Sf cos²α,

where α is the yarn-twist angle, whose value is determined by the amount of twist in the yarn. To test this theory, twelve yarns were spun, from three different types of filament at four different values of twist. After being spun, the yarns were tested, and for each yarn the ratio

z = Sy/(Sf cos²α)

was calculated. Now, if the theory is correct, the values of z would be expected to cluster round the value unity, i.e., we should expect the population mean value of z to be 1.00. However, the mean ratio for the yarns actually spun (which can be regarded as a sample from the population of yarns that could be spun) was 0.96. This is different from the expected value, and the question arises 'Is the difference sufficiently large for us to conclude that the theory is not correct?'

Questions like those posed in the above examples are answered by means of significance tests. In general, it is supposed that there exists a hypothetical or theoretical value for a population parameter, and it is required to accept the hypothesis, or to reject it, by using the evidence contained in a sample chosen from the population. Several such tests are available, each one appropriate to a particular situation, but the argument underlying them all is identical. We shall therefore consider the development of one test in considerable detail; the others can then be dealt with more briefly.

Before doing so, however, a word of warning is necessary. In the last few years, a tendency seems to have developed to use significance tests almost indiscriminately, often unnecessarily, and sometimes wrongly. The reason for this probably is that the tests are often quite easy to apply, and seem therefore to offer an attractively simple, apparently 'scientific' and admirably objective way of evaluating data. The truth is that, used improperly, they can be dangerous; they are not a substitute for clear thinking about the problem at hand, nor can they produce valid general conclusions from poor data. To be used effectively, their purpose, the means by which they achieve that purpose, and their limitations must be thoroughly understood.

8.2 Test for a Single Mean: Large Sample Available

We begin by stating a particular problem.

Example 8.1
A certain style of garment had been produced for some time, and the masses of the finished garments were known to have a mean of 260 g and a standard deviation of 8 g. A modification was made to the finishing procedure, and a sample of 40 garments finished by the new process had a mean of 263 g. Is this sufficient evidence to conclude that the new finishing procedure reduced the mean mass loss during finishing? ***

This is typical of the situations in which a significance test for a single mean is appropriate. In these cases, the experimenter has to decide between two possibilities: either the mean of a population has changed from a specified value μ0, or it has not changed. So far as Example 8.1 is concerned, either the population mean finished mass could have remained at 260 g (i.e., the new finishing process had no effect), and the larger sample mean of 263 g could have arisen by chance, or the new process really had increased the finished mass of the garments and the sample mean is a reflection of this.

8.2.1 Hypotheses

These two possibilities can be formally stated as hypotheses. The first (that the new process has had no effect) is called the null hypothesis and is denoted by H0. Thus, in general,

H0: μ = μ0.

The second possibility (that the new process has increased the mean finished mass) is called the alternative hypothesis, denoted by H1, i.e.,

H1: μ > μ0.

The evidence available for deciding between these hypotheses is contained in a random sample of size n, whose mean is x̄0. (In Example 8.1, n = 40 and x̄0 = 263 g.) For the moment, we shall suppose that the standard deviation of the population, σ, is unchanged by the modification.

8.2.2 A General Principle

The general principle for developing a test to decide which of H0 or H1 is the more acceptable is similar to that adopted in a British court of law, namely, that a defendant is assumed innocent until it is shown beyond reasonable doubt that he is not innocent and therefore guilty. Translating this into test-of-significance terms, the null hypothesis is assumed to be admissible until it is shown that the evidence against it, contained in the sample, could not reasonably have occurred by pure chance.

Consequently, we must seek a measure of the chance, or probability, of obtaining the kind of evidence supplied by the sample on the assumption that the null hypothesis is true. If this probability is high, we do not have enough evidence to reject (i.e., convict) the null hypothesis. On the other hand, if the probability is low, then reasonable doubt has been raised against the null hypothesis, which is therefore rejected and the alternative accepted.

It must be emphasized that this procedure can never prove conclusively that a null hypothesis is true. When a null hypothesis is not rejected, all that is being said is that there is not sufficient evidence to reject it. In a practical situation, however, we should for the time being (i.e., until more evidence is available) act as though it were true.

8.2.3 The Significance Level

The probability mentioned above, of getting the kind of evidence supplied by the sample when the null hypothesis is true, is called the significance level of the data, and it is upon its value that the interpretation of a significance test rests. In the present case, because we are dealing with averages, the significance level is found by considering the sampling distribution of the mean, which, it will be recalled, tells us how sample averages vary in repeated sampling. This distribution, as was seen in Section 6.4, tends to be normal, with mean μ0 (if H0 is true) and standard deviation σ/√n; it is shown in Fig. 8.1.

Fig. 8.1 The sampling distribution of x̄ when H0 is true

On this diagram, the sample mean x̄0 actually observed is marked. Now, the greater the difference between x̄0 and the hypothetical mean μ0, the greater would be our doubt about the truth of the null hypothesis. Therefore we define the significance level α to be the probability of getting a sample average as great as, or greater than, x̄0. This probability is represented by the shaded area in Fig. 8.1. To find the value of α, we use the methods of Section 5.6 and calculate

u = (x̄0 − μ0)/(σ/√n).

The corresponding value of α can then be found from Table A1.

Example 8.1 (continued)
For Example 8.1, we have

u = (263 − 260)/(8/√40) = 2.37.

Reference to Table A1 then shows that α = 0.0089. There is therefore a very low probability (less than 1%) of getting a sample average as big as 263 if the null hypothesis is true. Consequently, we should reject the null hypothesis and conclude that the new finishing process did indeed reduce the mean mass loss during finishing.

8.2.4 The Interpretation of a Significance Test

The null hypothesis for Example 8.1 was rejected because the significance level was low, i.e., there was a low probability (less than 1%) of getting a sample with a mean as high as 263 g if H0 were true. Since the significance level was less than 1%, the sample result is said to be 'significant at the 1% level'. In this example, there was little doubt that H0 should be rejected because the significance level was as low as this.

But suppose the sample mean had been x̄0 = 261.5, rather than 263. The calculation of u would then have been

u = (261.5 − 260)/(8/√40) = 1.19,

giving a significance level of α = 0.117 (11.7%). Thus, roughly one in every nine samples would have a mean as large as 261.5 when the null hypothesis was true. This is not therefore a very unusual event, and, being properly cautious (as a jury would tend to be), we should probably not regard such a sample result as sufficient evidence to throw reasonable doubt on the null hypothesis. We should therefore be inclined to accept H0, and the sample result would be said to be 'not significant'.

Since H0 was accepted when α = 11.7% and rejected when α = 1%, there must be a value of α which is a dividing line between acceptance and rejection. A convention has developed over many years to regard a significance level of 5% as the point at which the evidence against H0 becomes sufficient to reject the null hypothesis. However, it is best to regard this simply as a guide to the interpretation of a significance test and not as a strict rule. The only real difficulty occurs when the significance level is close to 5%; there is, after all, very little difference between a sample whose significance level is 6% and one whose level is 5%. In cases like this, it is best to get more evidence, in the form of a larger sample, before reaching a definite conclusion.
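The calculations for Example 8.1 and its 261.5 g variant can be reproduced in a few lines. A minimal sketch (Python, standard library only); the single-tail area is obtained from the normal distribution via `math.erf`, and the function names are illustrative:

```python
import math

def normal_upper_tail(u):
    """P(U > u) for a standard normal variate."""
    return 0.5 * (1.0 - math.erf(u / math.sqrt(2.0)))

def single_mean_test(xbar, mu0, sigma, n):
    """Return (u, significance level) for a one-sided test of H0: mu = mu0."""
    u = (xbar - mu0) / (sigma / math.sqrt(n))
    return u, normal_upper_tail(u)

u, alpha = single_mean_test(263, 260, 8, 40)
print(round(u, 2), round(alpha, 4))  # 2.37 and about 0.0089

u, alpha = single_mean_test(261.5, 260, 8, 40)
print(round(u, 2), round(alpha, 3))  # 1.19 and about 0.118
```

Working from the unrounded u values gives tail areas marginally different from the table look-ups in the text, which is why the second figure comes out as 0.118 rather than 0.117.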

8.2.5 Single-tail and Double-tail Tests

In Example 8.1, we were interested only in testing whether the new finishing procedure had been an improvement over the old one, i.e., whether its use had increased the mean finished mass of the garments, because this is why the new procedure had been introduced. (If there had been an indication, in the shape of a sample mean below 260, that the new procedure had not effected an improvement, there would have been no point in doing a significance test.) Tests of significance in such situations are called single-sided, or single-tail, tests, the latter name arising because the significance level is calculated by considering only the area in one tail of the sampling distribution shown in Fig. 8.1.

Now consider the following.

Example 8.2
The data of Table 2.3 are the results of 192 count tests on a large delivery of worsted yarn. Their mean and standard deviation were calculated in Table 3.3, where it was shown that x̄0 = 31.13 tex, s = 0.79 tex. The nominal linear density of the yarn in the delivery was 32 tex, and the question arises as to whether the mean of the delivery was different from the nominal.

In this case, we would be interested in detecting whether the population mean was either higher or lower than the nominal, i.e., we are concerned with deviations from the nominal in both directions. This has an effect on the alternative hypothesis. The null hypothesis is the same as before, namely, that the population mean is equal to the nominal or standard value μ0, i.e.,

H0: μ = μ0. (8.2a)

However, the alternative now is that the population mean is not equal to the standard, i.e.,

H1: μ ≠ μ0, (8.2b)

and this should be compared with the H1 for a single-tail test, given in Relation (8.1b). The situation specified by Relations (8.2a) and (8.2b) is dealt with by a two-sided or double-tail test of significance.

Example 8.2 (continued)
For Example 8.2, we have

H0: μ = 32, H1: μ ≠ 32.

The general procedure is exactly as before, i.e., we begin by considering the sampling distribution of means of samples of size 192 and assuming the null hypothesis is true. This distribution is shown in Fig. 8.2. Marked on this diagram is the observed sample mean x̄0 = 31.13. Reasoning as before, any sample mean less than 31.13, represented by the left-hand tail of the sampling distribution, would deviate even farther from the nominal than x̄0, and would thus be even stronger evidence against H0 than x̄0 is. In addition, because deviations above and below the nominal are now equally important, just as strong evidence against H0 (and hence in favour of H1) would be provided by sample means > 32.87 in the right-hand tail of the distribution, which differ from the nominal just as much as those in the left-hand tail. We therefore define the significance level α to be the probability of getting a sample mean that differs from the nominal by as much as, or more than, x̄0 does, and this is given by the sum of the two tail areas shown shaded in Fig. 8.2. Because the diagram is symmetrical about μ0, the two tail areas are equal, and we need calculate only one of them. To do this we evaluate

u = (x̄0 − μ0)/(σ/√n)

and then find the corresponding tail area from Table A1. However, the unknown population standard deviation σ appears in this equation. The only information we have concerning its value is contained in the sample estimate s = 0.79. If the sample size is greater than about 30, it is reasonable to replace σ by s; doing this, we find

u = (31.13 − 32.0)/(0.79/√192) = −15.3,

the minus sign merely indicating that we are dealing with a left-hand tail area. This value of u is so large that it is well outside the range of Table A1. We therefore conclude that, if the null hypothesis is true, the chance of getting a sample of size 192 with a mean of 31.13 is virtually zero. The evidence against the null hypothesis is thus overwhelming, and we have no hesitation in rejecting it and concluding that the mean count of the whole yarn consignment is not equal to the nominal 32 tex. ***

Fig. 8.2 Sampling distribution of x̄ in samples of size 192 when H0 is true

8.2.6 Statistical and Practical Significance

The conclusion reached in Example 8.2 provides an opportunity to discuss the distinction between statistical and practical significance. It was concluded that the mean count of the yarn consignment differed from the nominal 32 tex. But this is only half the story, for, having reached this conclusion, one would be interested to know how big the deviation from the nominal actually is. Our best point estimate of the difference is given by

x̄0 − μ0 = 31.13 − 32 = −0.87 tex,

and a difference of this magnitude may be of no practical importance whatsoever. Whether or not the difference is practically significant is not a statistical problem but a technological and/or commercial question, whose answer may depend, for example, on what the yarn is to be used for, how large the consignment is, and, perhaps, the price of the yarn.

A careful distinction must therefore be made between practical significance and statistical significance. Any departure from a null hypothesis, however small, can be detected and found statistically significant provided that a large enough sample is examined, but such extremely small differences would usually be quite insignificant from a practical point of view. Ideally, the sample size should be chosen in such a way that there is a high probability of detecting as statistically significant a departure from a null hypothesis that is practically important. This theme is carried on in the next section.
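The double-tail test of Example 8.2 can be verified in the same way; the significance level is simply twice the single-tail area. A short sketch (Python, standard library only; illustrative function names):

```python
import math

def normal_tail(u):
    """Upper-tail area P(U > u) of the standard normal distribution."""
    return 0.5 * (1.0 - math.erf(u / math.sqrt(2.0)))

def double_tail_test(xbar, mu0, s, n):
    """Two-sided test of H0: mu = mu0, with sigma replaced by s (large n)."""
    u = (xbar - mu0) / (s / math.sqrt(n))
    alpha = 2.0 * normal_tail(abs(u))  # sum of the two equal tail areas
    return u, alpha

u, alpha = double_tail_test(31.13, 32.0, 0.79, 192)
print(round(u, 1))  # -15.3; alpha is virtually zero, as the text concludes
```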

8.2.7 Choice of Sample Size

When conclusions are reached as a result of a significance test, there is a possibility that errors may be made. For example, a null hypothesis that is true can be rejected because a pessimistic sample happens to have been drawn. This is called an error of the first kind, and the probability of making such an error is equal to the significance level α. Thus, in Example 8.1, when we rejected the null hypothesis that the new finishing procedure had made no difference, the probability that this conclusion was wrong was α = 0.0089, very low indeed.

There is another error that can be made: a null hypothesis that is false can be accepted because an optimistic sample happens to have been examined. This is called an error of the second kind, and the probability β of making it will depend on how different from the null hypothesis the truth actually is.

As an example of this, suppose we are testing the null hypothesis

H0: μ = μ0

and using a sample of size n. (This is the situation of Example 8.1.) Suppose further that it has been decided to reject H0 whenever the significance level α falls below 5%. When this has been done, a critical value x̄c for the sample mean can be found which is such that:

(i) if the observed mean x̄0 is > x̄c, then H0 is rejected;
(ii) if the observed mean x̄0 is < x̄c, then H0 is accepted.

(This strict interpretation of the critical value contradicts the advice given earlier to regard the significance level of 5% merely as a guide and not as a strict rule. For the present purpose, however, it is convenient to adopt the precise rule.) Fig. 8.3(a) shows the sampling distribution of x̄ when H0 is true. The value of x̄c is such that the probability of rejecting H0 when it is true is 5%; consequently, the area to the right of x̄c is 0.05. The u value corresponding to this area, found from Table A2, is u0.05 = 1.64, so that

x̄c = 260 + 1.64 × 8/√n = 260 + 13.12/√n. (8.4)

Now imagine the null hypothesis is not true; in fact, let the true mean of the population be μ = μ1, where μ1 > μ0. In practice, of course, we would not know the null hypothesis was false and would continue to reject or accept H0 according to whether x̄0 > x̄c or x̄0 < x̄c. Fig. 8.3(b) shows the sampling distribution of x̄ when μ = μ1, from which it can be seen that there is a certain chance that the observed mean x̄0 will lie below x̄c, thus leading to a wrong conclusion. The probability of this happening is β, whose value is equal to the shaded area to the left of x̄c in Fig. 8.3(b). Thus, when the value of μ1 is known, the value of β can be calculated. The snag is that μ1 is not known (if it was, there would be no need to do a significance test!), so that, in general, the value of β cannot be found.

Fig. 8.3 Sampling distributions of the mean (a) when H0 is true, (b) when H1 is true

Despite this, the concept of an error of the second kind and the associated probability of making such an error are useful ideas in helping to decide what size of sample is needed in a particular experiment. It was stated at the end of the previous section that the sample size should be chosen so that there is a high probability that the experiment will detect a practically significant departure from a null hypothesis. We are now in a position to make this more precise.

We shall use the data of Example 8.1 to illustrate the approach. It will be recalled that using the original finishing procedure resulted in garments whose finished masses had a mean of 260 g and a standard deviation of 8 g. Now suppose the new process would be worth adopting permanently if the mean finished mass were increased to 262 g, i.e., that an increase of 2 g was considered technologically and commercially important. Now there is always a chance that, even if such an increase did occur, a single experiment would not detect it. So let us impose the requirement that the experiment must be designed so as to have a 90% chance of detecting a practically significant improvement. Putting this the other way round, we are willing to take a 10% chance of not detecting a practically significant increase in finished mass, i.e., of making an error of the second kind when μ = 262.
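The requirement just stated (a 5% first-kind risk, and a 10% second-kind risk when μ = 262) fixes the sample size; the text goes on to derive the general formula n = (uα + uβ)²σ²/(μ1 − μ0)². A minimal sketch of that calculation (Python, standard library only; `sample_size` is an illustrative name, and 1.64 and 1.28 are the table values used in the text):

```python
import math

def sample_size(mu0, mu1, sigma, u_alpha, u_beta):
    """Smallest n giving the required power against the alternative mu = mu1.

    Based on n = (u_alpha + u_beta)^2 * sigma^2 / (mu1 - mu0)^2.
    """
    n = (u_alpha + u_beta) ** 2 * sigma ** 2 / (mu1 - mu0) ** 2
    return math.ceil(n)  # round up to the next whole observation

# Example 8.1 data: mu0 = 260 g, practically significant mean 262 g,
# sigma = 8 g, u_0.05 = 1.64, u_0.10 = 1.28.
print(sample_size(260, 262, 8, 1.64, 1.28))  # 137
```

This reproduces the figure of 137 garments obtained by equating the two expressions for the critical value x̄c.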

8. Some Standard Significance Tests

Equation (8.4) gives the critical value x̄c for a sample of size n. The u value corresponding to a tail area of 0.10 is, from Table A2, u0.10 = 1.28. Therefore, using Fig. 8.3(b), we see that x̄c is also given by

x̄c = 262 − 1.28 × 8/√n = 262 − 10.24/√n. (8.5)

Equating the values of x̄c in Equations (8.4) and (8.5) gives

260 + 13.12/√n = 262 − 10.24/√n,

whence √n = 23.36/2 = 11.68 and n = 136.4. Hence a sample of at least 137 garments would be needed to fulfil the requirements laid down for the experiment.

The procedure can be generalized, and the sample size for this type of experiment calculated from

n = (uα + uβ)²σ²/(μ1 − μ0)².

8.3 Test for a Single Mean: Small Sample Available

The significance test described in Section 8.2 is appropriate either when the value of the population standard deviation σ is known from past experience (as in Example 8.1) or when the sample is large enough to provide a sufficiently good estimate, s, of σ (as in Example 8.2). The latter condition is usually satisfied when the sample size is greater than about 30.

When the sample size is smaller than this, the sample estimate s may be rather imprecise and, as in Section 7.4, allowance has to be made for the error that might be introduced into the calculation of α by using s instead of σ. The modification to the test is straightforward: one simply uses the table of t values instead of normal-distribution tables. Thus we find

t0 = (x̄0 − μ0)/(s/√n)

and compare this observed value with the tabulated values for k = n − 1 degrees of freedom. This test is strictly valid only if the basic distribution of x is normal, but it has been shown that quite large departures from normality can be tolerated before wrong conclusions are drawn.

Example 8.3
A finisher buys a certain chemical on the understanding that it should not contain more than 3% impurities. When a batch is delivered, five independent determinations of percentage impurity are made, with the following results. The mean of these results is 3.12%. Is this sufficient evidence to conclude that the batch contains too much impurity?

In this example, the null hypothesis is that the batch satisfies the specification, i.e., that the average impurity of the whole batch is not greater than 3%:

H0: μ ≤ 3.

The alternative is that the batch contains too much impurity:

H1: μ > 3.

The form of H0 is not quite the same as the null hypotheses in Section 8.2 and is not, in fact, precise enough to enable a significance test to be carried out. To remedy this, we proceed as though the null hypothesis were

H0: μ = 3,

which is precise, and similar to previous null hypotheses.

From the data, the following can be calculated:

t0 = (3.12 − 3.00)/(0.217/√5) = 1.24,

with k = n − 1 = 4 degrees of freedom. To find the significance level α, this value has to be compared with the t values in Table A4 for 4 degrees of freedom. From the table we find that α = 0.05 when t = 2.13. Since t0 is less than this, we conclude that α > 0.05 (i.e., > 5%). This is higher than the critical value of 5%, and hence the sample result is not significant. We therefore accept the null hypothesis and conclude that there is insufficient evidence to claim that the batch of chemical contains too much impurity. ***

In Example 8.3, the test of significance was single-sided, since we were concerned only to discover whether the batch contained too much impurity. If a double-tail test is being carried out, the calculation of t0 proceeds exactly as above. This is then compared with Table A4, and the probability value found from the tables is then doubled to give the significance level.
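The decision rule of Example 8.3 can be sketched as follows (Python, standard library only). The individual impurity determinations are not reproduced in the text, so only the summary statistics are used, and, since an exact t tail area is not needed, comparing t0 with the tabulated 5% point is enough:

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Observed t statistic for H0: mu = mu0."""
    return (xbar - mu0) / (s / math.sqrt(n))

T_CRIT_5PC_4DF = 2.13  # single-tail 5% point of t with 4 degrees of freedom

t0 = one_sample_t(3.12, 3.00, 0.217, 5)
print(round(t0, 2), t0 > T_CRIT_5PC_4DF)  # 1.24 False -> not significant
```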

~4 Exam~
It was explained in Section 8.1 that, according to
a certain theory of continuous-filament yarn hreakage, In a department within a factory, new operatives are
the ratio usually trained by Method A. A new training
procedure, Method B, has been proposed and to test
would I)e expec:ed to have a mean value equal to 1 ;00, how it compares with the standard method the
S being the yarn tenaclty, Sf the fl1ament tenaclty, following experiment was carried out. A new intake
a~d G the yarn twist angle. To test this theory, of 19 operatives was randomly divided into two
twelve yarns were spun, and the z values for the groups. The first group of 10 students was trained
individual yarns were as follows. by the standard method, the other group of 9 students
beir.g trained by the new method. After training, the
0.98, 1.05, 1.01, 0.93, 0.92, 0.88, operatives were given a test, and the marks
0.96, 0.95, 0.90, 0.94, 0.99, 1.02. (percentages) obtained were as follows.

The null hypothesis is that, on average, the Method A: 81, 83, 90, 74, 79, 82, 84, 78, 86, 89.
~heory correctly predicts the yarn tenacity, :.e., Method B: 81, 81, 84, 90, 86, 91, 88, 85, 88.

The situation is therefore tW1-sided, since we are We ,re interested to know whether these results
interested i~ detecti1g deviations from HO in both
directions, '.e., in whether the theory overestimates indicate chat the ne', method is an improvement. on
or underestimates yarn tenacity. The value of to is
given by the standard. **.

t0 = (0.96 − 1.00)√12 / 0.05 = −2.77

with k = n − 1 = 11 degrees of freedom. The minus sign arises because t0 happens to lie in the left-hand tail of the sampling distribution of t and can be ignored in entering Table A4 to find the significance level.

From this table, we find that t = 2.72 corresponds to a single-tail area of 0.01, and therefore to a double-tail area of 0.02. Our value of t0 is just greater (in absolute magnitude) than 2.72; hence the significance level for the experiment is just under 0.02, or 2%. We emphasize again what this means; it is the probability of getting a mean Z ratio that deviates from 1.00 by as much as 0.04 if H0 is true. Since it is quite a low probability, the experiment throws reasonable doubt on the null hypothesis, which is therefore rejected. We conclude that, since z̄0 < 1.00, the theory tends to underestimate yarn tenacity. As always, however, consideration must be given to the actual magnitude of the deviation from the null hypothesis. In this example, the average error of prediction is about 4%, and errors of this magnitude may well be acceptable in many practical applications. In other words, the deviation may not be practically significant. ***

This is a specific example of a general problem, which may be stated as follows. Suppose two populations have unknown means μ1 and μ2, and that we wish to test the null hypothesis that their difference is equal to a specified value μs, i.e.,

H0: μ1 − μ2 = μs

In most practical situations, we wish to test whether or not the means are equal, in which case μs = 0, but any other value could be specified. The alternative hypothesis is either the single-sided

H1: μ1 − μ2 > μs (or H1: μ1 − μ2 < μs)

or the two-sided

H1: μ1 − μ2 ≠ μs

So far as Example 8.5 is concerned, μ1 could be taken as the mean of the population of operatives trained by Method A, μ2 being the mean of the population trained by Method B. We are testing the null hypothesis that the new method is no improvement over the old, i.e.,

H0: μ1 − μ2 = 0

Suppose now that independent samples of size n1 and n2 are randomly chosen from the populations and that their means are x̄1 and x̄2. Then x̄10 − x̄20 is a point estimate of the difference μ1 − μ2. Proceeding as usual, we consider the sampling distribution of the point estimate when the null hypothesis is true. This sampling distribution was given in Section 7.6 and is shown again in Fig. 8.4.

8. Some Standard Significance Tests

Fig. 8.4 The sampling distribution of x̄1 − x̄2 when H0 is true

The observed difference x̄10 − x̄20 is also shown on this diagram, and the significance level is given by the tail area outside this value. The actual calculation of α now depends on the size of the samples available.

8.4.1 The Case of Large Samples

Since the above sampling distribution tends to be normal, the value of α can be found by computing

U0 = (x̄10 − x̄20 − μs) / √(σ1²/n1 + σ2²/n2)   (8.10)

and comparing the resulting value with normal-distribution tables. Unfortunately, this calculation requires a knowledge of the unknown population variances σ1² and σ2². If the samples are reasonably large (say n1 and n2 greater than about 30), σ1² and σ2² may be replaced by their sample estimates s1² and s2². Equation (8.10) then becomes

U0 = (x̄10 − x̄20 − μs) / √(s1²/n1 + s2²/n2)

When U0 has been obtained by using this equation, the value of α can be found from Table A1.

8.4.2 The Case of Small Samples

Much the more common situation arises, however, when the sample sizes are small, i.e., fewer than about 30. Simply replacing σ1² and σ2² by estimates calculated from such small samples could lead to considerable errors in the significance level. In order to proceed in these circumstances, it is necessary to make the same assumptions as were made when we were dealing with the corresponding confidence limits in Section 7.6. These were that the two populations are normal and that their variances are equal. However, quite large departures from these conditions can be tolerated before drastically wrong conclusions are drawn from the data.

Since the population variances are assumed equal, an estimate of this common variance is given by Equation (7.12), i.e.,

s² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)   (8.11)

and this is used in Equation (8.10) to replace both σ1² and σ2². To compensate for the possible error introduced by using an estimate calculated from small samples, the normal variate U is replaced by t; thus we calculate

t0 = (x̄10 − x̄20 − μs) / [s√(1/n1 + 1/n2)]

with k = n1 + n2 − 2 degrees of freedom.

Example 8.5 (continued)

Denoting the marks obtained by operatives trained by Methods A and B by x1 and x2, respectively, we have from the data

s² = (9 × 24.49 + 8 × 13.00) / (10 + 9 − 2) = 19.08

so that s = 4.37. This yields

t0 = (82.6 − 86.0 − 0) / [4.37√(1/10 + 1/9)] = −1.69

with k = 10 + 9 − 2 = 17 degrees of freedom. The minus sign merely indicates that we are dealing with a left-hand tail area and can be ignored in comparing t0 with the values in Table A4. From this table we find that, for 17 degrees of freedom, t = 1.73 when α = 0.05 (5%). Our value is close to this, so the experiment significance level is very close to the critical value of 5%. Before reaching a definite conclusion, therefore, it would be advisable to gather more evidence by training more new operatives by the new method.
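For readers who wish to check the arithmetic by computer, the pooled small-sample test can be sketched in a few lines of Python, here using only the summary figures quoted for Example 8.5. The function and its name are illustrative, not part of the original text.

```python
import math

def pooled_two_sample_t(xbar1, xbar2, var1, var2, n1, n2, mu_s=0.0):
    """t0 for H0: mu1 - mu2 = mu_s, using the pooled variance of Equation (8.11)."""
    s2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t0 = (xbar1 - xbar2 - mu_s) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t0, n1 + n2 - 2  # t statistic and degrees of freedom

# Example 8.5: Method A (n = 10, mean 82.6, variance 24.49),
#              Method B (n = 9,  mean 86.0, variance 13.00)
t0, k = pooled_two_sample_t(82.6, 86.0, 24.49, 13.00, 10, 9)
print(round(t0, 2), k)  # -1.69 17
```

The magnitude 1.69 is then compared with the Table A4 entry for 17 degrees of freedom, exactly as in the worked example.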

8.5 Matched Samples

The analysis described in the last section is appropriate when the two samples available for the test are chosen independently from the populations whose means are being compared. In some investigations, however, it is advisable to plan the experiment so that the samples are matched pairwise, as in the following example, and then a different method of analysis is needed.

Example 8.6

An experiment was carried out to compare two methods of extracting grease from wool. To test the methods under a wide variety of conditions, ten different wools, which were expected to have widely varying amounts of grease in them, were chosen as experimental material. A specimen was chosen at random from each wool batch, and this was divided into two equal parts. The parts were then randomly assigned to the methods of extraction. The amounts of grease extracted (expressed as percentages of the original weight) are shown in Table 8.1.

Table 8.1 Percentage of grease extracted by Methods I and II from each of the ten wool types A–K, with the differences d between the two methods

The null hypothesis in this case is that on average the two methods are equally effective in removing grease from wool, while the alternative is that they are not equally effective. Thus

H0: the two methods are equally effective,
H1: the two methods are not equally effective.

The test is two-sided because, in the absence of prior knowledge, we do not know which method, if any, would be the more effective. This hypothesis could be tested by using the method of Section 8.4, but such an analysis would tend to be insensitive, since the variance s² calculated from Equation (8.11) would reflect the variation in grease content among the wools that was deliberately introduced into the experiment to make it more general. The primary interest is in the differences in grease content produced by the two methods; these differences are shown in Table 8.1. The null hypothesis that the two methods are equally effective is equivalent to

H0: μd = 0

where μd is the population mean difference; the alternative hypothesis is

H1: μd ≠ 0

The appropriate test is therefore one for a single mean, and the method of Section 8.3 applies. Letting d denote the individual differences, we find that

Σ(d − d̄)² = Σd² − (Σd)²/n = 3.52 − (3.8)²/10 = 2.076

so that

sd² = Σ(d − d̄)²/(n − 1) = 2.076/9 = 0.231, i.e., sd = 0.48

and hence

t0 = (d̄0 − 0)√n / sd = 0.38√10 / 0.48 = 2.50

with k = n − 1 = 9 degrees of freedom.

Comparing this value with Table A4, we see that it corresponds to a single-tail area between 0.025 and 0.01. Since the test in this example is two-sided, these probabilities need to be doubled to give the significance level α. Thus 0.02 < α < 0.05. This significance level is below the critical value of 0.05 (5%); the null hypothesis is therefore rejected, and we conclude that the two methods of extraction are not equally effective. In fact, Method I on average removed 0.38% more grease than Method II. ***
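Because the matched-pairs analysis reduces to a single-mean t test on the differences, it too is easy to verify numerically. A Python sketch using only the sums Σd = 3.8 and Σd² = 3.52 from Example 8.6 (the helper name is ours):

```python
import math

def paired_t_from_sums(sum_d, sum_sq_d, n):
    """t0 for H0: mu_d = 0, computed from sum(d) and sum(d^2)."""
    dbar = sum_d / n
    ss = sum_sq_d - sum_d ** 2 / n      # sum of (d - dbar)^2
    s_d = math.sqrt(ss / (n - 1))       # standard deviation of the differences
    return dbar * math.sqrt(n) / s_d, n - 1

# Example 8.6: n = 10 wools, sum(d) = 3.8, sum(d^2) = 3.52
t0, k = paired_t_from_sums(3.8, 3.52, 10)
print(round(t0, 2), k)  # 2.5 9
```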

Example 8.7

Of the two methods of grease extraction mentioned in Section 8.5, Method II was a standard technique, and experience in using it showed that the standard deviation of repeat determinations on the same wool sample was 0.41%. Method I was a newly developed technique, and, when it was used to make ten determinations of the grease content of a wool specimen, the following results were obtained (% grease content):

14.?, 14.5, 14.3, 14.0, 14.3, 14.6, 14.4, 14.8, 14.5, 14.1

It is required to test whether these data indicate that the variability of the new method is different from that of the old.

In this example, the null hypothesis is that the new method has the same variability as the old, i.e.,

H0: σ² = (0.41)²

The two-sided alternative

H1: σ² ≠ (0.41)²

is taken because, when the new method was invented, it was not known whether it would be less variable or more variable than the old. If the objective of producing the new method had been specifically to reduce the variability, then a one-sided alternative would have been appropriate.

From the data, we find that s0² = 0.04622 and therefore

χ0² = (n − 1)s0²/σ0² = 9 × 0.04622 / (0.41)² = 2.47

with k = 10 − 1 = 9 degrees of freedom.

Because s0² is smaller than σ0², our value of χ0² lies in the left-hand tail of the χ² distribution. We therefore find the probability of getting a value of χ² less than that observed in the experiment. (If s0² had been greater than σ0², we should have found the probability of getting χ² ≥ χ0².)

From Table A3, we see that, for 9 degrees of freedom, χ² = 2.70 corresponds to a probability of 0.975. However, this is the probability that χ² is greater than 2.7; therefore the probability that χ² < 2.7 is 1 − 0.975 = 0.025. Our value of χ0² is just less than 2.7, so that

Pr(χ² < χ0²) < 0.025

Because we are dealing with a two-sided alternative, this probability must be doubled to give the significance level α, giving

α < 0.05

Our result is therefore significant at the 5% level, and the null hypothesis is rejected. We conclude that the variability exhibited by the two methods is different. In fact, since s0 < σ0, the indication is that the new method is probably less variable than the old. ***

8.7 Test for the Difference between Two Variances

Another frequently occurring problem arises when we wish to compare the variances, σ1² and σ2², of two populations, i.e., to test the null hypothesis

H0: σ1² = σ2²

This can be done provided that the populations are normal and that the samples drawn from the two populations are independent. The method of test is based on the variance ratio or F-distribution introduced in Section 6.9.

It will be recalled that, if the sample sizes are n1 and n2 and the sample variances are s1² and s2², then

(s1²/σ1²) / (s2²/σ2²)

has the F-distribution with k1 = n1 − 1 and k2 = n2 − 1 degrees of freedom. If H0 is true, then this reduces to

F0 = s1²/s2²

and this value is compared with those in Table A5 to find the significance level of the experiment.

Example 8.8

The Uster Evenness Tester can be used to count the number of neps in 1000 m of yarn, and the variation in this count along the yarn can be an important factor in controlling the spinning process. To compare the performance of two spinning machines, the nep counts of several randomly chosen 1000-m lengths of yarn were found and gave the following results.

The number of neps per 1000 m of yarn would be expected to vary according to a Poisson distribution if they were occurring at random. However, since the number will usually be fairly large, the Poisson distribution can be approximated to by a normal distribution, as mentioned in Section 5.8, so that the condition following Equation (8.14) above is satisfied. In this example, the alternative hypothesis is two-sided, i.e.,

H1: σ1² ≠ σ2²

because, before the data were obtained, it was not known which of the two machines would exhibit the greater variability.

To facilitate the test, F0 is always computed so that its value is greater than unity, i.e.,

F0 = (larger sample variance) / (smaller sample variance)

Comparing the calculated F0, with k1 = n1 − 1 = 9 degrees of freedom, with Table A5 shows a probability of 0.05. Doubling this to get the significance level (because we are dealing with a two-sided test), we find

α = 0.10 (10%)

This is greater than the critical value of 5%, and the data therefore do not discredit the null hypothesis. The experiment does not provide enough evidence to conclude that one machine produces a more variable nep count than the other.
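Both variance tests just described reduce to one-line computations. A minimal Python sketch follows; the function names are ours, and the variances passed to the F-ratio call are made-up values purely for illustration, since the nep-count data of Example 8.8 are not reproduced here.

```python
def chi2_variance_stat(s2, sigma0_sq, n):
    """chi0^2 = (n - 1) * s0^2 / sigma0^2 for H0: sigma^2 = sigma0^2 (normal data assumed)."""
    return (n - 1) * s2 / sigma0_sq, n - 1

def f_ratio(var1, var2):
    """F0 arranged to exceed unity, as in Section 8.7."""
    return max(var1, var2) / min(var1, var2)

# Example 8.7: n = 10 determinations, s0^2 = 0.04622, standard sigma0 = 0.41
chi0, k = chi2_variance_stat(0.04622, 0.41 ** 2, 10)
print(round(chi0, 2), k)  # 2.47 9

# Hypothetical sample variances for two machines (illustrative values only)
print(round(f_ratio(38.2, 21.5), 2))  # 1.78
```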
8.8 Test for a Single Proportion

Suppose we wish to test the null hypothesis that the proportion p of times that an event occurs in independent trials is equal to a specified value p0, i.e., to test

H0: p = p0

Suppose a sample of size n has been examined and x0 occurrences of the event were observed. If the null hypothesis is true, the number of occurrences of the event in n independent trials would be expected to vary according to a binomial distribution with mean np0 and variance np0(1 − p0), as explained in Section 5.4. Further, if the sample size is large enough, we know (Section 5.8) that the binomial can be approximated to by a normal distribution with the same mean and variance. Therefore, under these conditions, the sampling distribution of the number of occurrences will be as shown in Fig. 8.5.

Fig. 8.5 Sampling distribution of the number of occurrences in samples of size n when H0 is true

The significance level for the experiment is the probability of getting x0 or more occurrences when the null hypothesis is true, and this probability is equal to the tail area shown shaded in Fig. 8.5. To find this area we calculate

U0 = (x0 − np0) / √(np0(1 − p0))   (8.15)

and refer this to Table A1.

Example 8.9

During the process of winding yarn onto bobbins, the weight of yarn actually wound will vary slightly from bobbin to bobbin. If the weight falls below a certain level, the bobbin is counted as a 'defective'. When working normally, the process is known from past experience to produce, on average, 3% defectives. As part of a quality-control check, 400 bobbins were weighed, and 20 were found to be underweight. Is this sufficient evidence to conclude that the process is behaving worse than usual?

In this example, n = 400, which is certainly large enough for the normal approximation to apply, p0 = 0.03 and x0 = 20. Therefore Equation (8.15) gives

U0 = (20 − 400 × 0.03) / √(400 × 0.03 × 0.97) = 2.34

Entering Table A1 with this value of U, we find that the significance level is

α = 0.0096

or about 1%. This is a very low probability, indicating that it is very unlikely that we should get as many as 20 defectives in 400 bobbins if the winding process were behaving normally. We therefore conclude that the process has deteriorated for some reason. ***

The test in Example 8.9 was single-sided. If a double-tail test is appropriate, the test procedure is as described above, except that the probability obtained from Table A1 must be doubled to give the significance level α.
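The normal-approximation test for a proportion, and the Table A1 lookup, can both be reproduced in Python; here the tail area is computed from the complementary error function rather than read from the table. The function names are ours.

```python
import math

def proportion_u(x0, n, p0):
    """U0 of Equation (8.15) for H0: p = p0."""
    return (x0 - n * p0) / math.sqrt(n * p0 * (1 - p0))

def normal_tail(u):
    """alpha = Pr(U > u) for a standard normal variate, as tabulated in Table A1."""
    return 0.5 * math.erfc(u / math.sqrt(2))

# Example 8.9: 20 underweight bobbins out of 400, p0 = 0.03
u0 = proportion_u(20, 400, 0.03)
alpha = normal_tail(u0)
print(round(u0, 2))  # 2.34
print(round(alpha, 3))  # about 0.01 (1%)
```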

Problems for Chapter 8

1. It is specified that the impurity of a chemical must not exceed 1.35%. On a certain batch of the chemical, five determinations of percentage impurity were made:

Did the batch contain significantly more impurity than the specified level?

2. It is suspected that a delivery of yarn has a regain greater than the standard of 18.?%. To test this, eight regain tests were carried out, with the following results:

3. Two spinning machines were nominally spinning the same count of yarn. Five bobbins were chosen at random from each frame and the linear density (in tex) of each was found:

Frame 1: 30.0, 30.3, 30.5, 30.8, 31.0
Frame 2: 30.5, 30.8, 31.0, 31.2, 31.3

On this evidence, were the frames spinning significantly different counts?

4. The following are the results of extension tests carried out on two types of yarn (percentage extension at break).

Yarn A: 14.1, 14.7, 15.1, 14.3, 15.6, 14.8
Yarn B: 16.9, 16.3, 15.9, 15.7, 15.7

Do these results suggest that one yarn is significantly more extensible than the other?

5. Pieces of five different fabrics were each divided into two parts, and each part was given one of two shrink-resist treatments. Tests for percentage area shrinkage were then carried out with the following results.

[Table: percentage area shrinkage of each of the five fabrics under Treatment A and Treatment B]

Was one treatment significantly different from the other?

6. A batch of fibres should nominally be a mixture of black and white fibres in the ratio 2:1. A sample of 300 fibres was chosen at random from the batch and found to contain 220 black fibres. Is this result consistent with the hypothesis that the batch contains the correct mix of black and white fibres?

7. The expected performance of a group of machines is that they should operate for 80% of the available time, the remaining 20% being scheduled for maintenance and setting-up. It is believed that one machine of the group is not achieving this target. The machine was observed at 500 random instants of time and was found to be working on 370 occasions. Do these results suggest that the machine was running significantly below the target efficiency?

8. A manufacturer produces articles and knows that usually not more than 5% of them will have a mass of less than 100 g. When a large batch of articles was ready for delivery to a customer, 200 articles were selected at random and 16 of them had a mass of less than 100 g. Does this indicate that the batch was not of the usual standard?

ANSWERS TO PROBLEMS

Chapter 3

1. 5.95 mm; 0.31 mm; 0.61 mm
2. 26.71 mm; 0.17 tex
3. x̄ = 41.2 tex; s = 0.34
4. x̄ = 607 g; 1.9
5. 1.87; 0.24 cm
6. x̄ = 3.7; 0.82 ppm; 0.26 sec
7. 4.73 cm; 1.40
8. x̄ = 8.0 ppm
9. 15.75 sec
10. x̄ = 1.96

Chapter 4

1. 0.273
2. (a) 0.815; (b) 0.170; (c) 0.015
3. 0.?6%
4. (i) 1/495; (ii) 0.044; (iii) 0.211; (iv) 0.98
5. (a) 1/12, 9/24, 5/12, 1/8; (b) 2

Chapter 5

2. (a) 2.5; (b) 0.72; (c) 0.038
3. (a) 0.161; (b) 2; (c) Pr(x > 6) = 0.02
4. (a) 0.45; (b) 0.19; (c) 0.81; (d) 0.018
5. £251.52 per box
6. (a) No difference between A and B; (b) A better than B
7. 227 g; 0.77
8. 178 g; σ = 17 g
9. (a) 914 doz; (b) 0.115
10. x̄ = 305

Chapter 7

1. 4.4 ± 0.029; 4.4 ± 0.038
2. 36.2 ± 1.1
3. 60 ± 0.6; 144
4. 41.2 ± 0.2; n = 10
5. 605.9 ± 10.7 g
6. 0.7 ± 0.09; 2100
7. 0.052 ± 0.025
8. (6.4, 18.3)

Chapter 8

1. No
2. Yes
3. No
4. Yes
5. No
6. No
7. Yes
8. Yes

Table A1: Areas in the Tail of the Standard Normal Distribution

The entries in this table are values of

α = Pr(u > U)

for the given values of U

U     α       U     α       U     α       U     α       U     α
0.00  0.5000  0.60  0.2743  1.20  0.1151  1.80  0.0359  2.40  0.0082
0.02  0.4920  0.62  0.2676  1.22  0.1112  1.82  0.0344  2.42  0.0078
0.04  0.4840  0.64  0.2611  1.24  0.1075  1.84  0.0329  2.44  0.0073
0.06  0.4761  0.66  0.2546  1.26  0.1038  1.86  0.0314  2.46  0.0070
0.08  0.4681  0.68  0.2483  1.28  0.1003  1.88  0.0301  2.48  0.0066
0.10  0.4602  0.70  0.2420  1.30  0.0968  1.90  0.0287  2.50  0.0062
0.12  0.4522  0.72  0.2358  1.32  0.0934  1.92  0.0274  2.60  0.0047
0.14  0.4443  0.74  0.2296  1.34  0.0901  1.94  0.0262  2.70  0.0035
0.16  0.4364  0.76  0.2236  1.36  0.0869  1.96  0.0250  2.80  0.0026
0.18  0.4286  0.78  0.2177  1.38  0.0838  1.98  0.0239  2.90  0.0019
0.20  0.4207  0.80  0.2119  1.40  0.0808  2.00  0.0228  3.00  0.0014
0.22  0.4129  0.82  0.2061  1.42  0.0778  2.02  0.0217
0.24  0.4052  0.84  0.2005  1.44  0.0749  2.04  0.0207
0.26  0.3974  0.86  0.1949  1.46  0.0721  2.06  0.0197
0.28  0.3897  0.88  0.1894  1.48  0.0694  2.08  0.0188
0.30  0.3821  0.90  0.1841  1.50  0.0668  2.10  0.0179
0.32  0.3745  0.92  0.1788  1.52  0.0643  2.12  0.0170
0.34  0.3669  0.94  0.1736  1.54  0.0618  2.14  0.0162
0.36  0.3594  0.96  0.1685  1.56  0.0594  2.16  0.0154
0.38  0.3520  0.98  0.1635  1.58  0.0571  2.18  0.0146
0.40  0.3446  1.00  0.1587  1.60  0.0548  2.20  0.0139
0.42  0.3372  1.02  0.1539  1.62  0.0526  2.22  0.0132
0.44  0.3300  1.04  0.1492  1.64  0.0505  2.24  0.0126
0.46  0.3228  1.06  0.1446  1.66  0.0485  2.26  0.0119
0.48  0.3156  1.08  0.1401  1.68  0.0465  2.28  0.0113
0.50  0.3085  1.10  0.1357  1.70  0.0446  2.30  0.0107
0.52  0.3015  1.12  0.1314  1.72  0.0427  2.32  0.0102
0.54  0.2946  1.14  0.1271  1.74  0.0409  2.34  0.0096
0.56  0.2877  1.16  0.1230  1.76  0.0392  2.36  0.0091
0.58  0.2810  1.18  0.1190  1.78  0.0375  2.38  0.0086
0.60  0.2743  1.20  0.1151  1.80  0.0359  2.40  0.0082

The tables in these Appendices have been extracted from 'Biometrika Tables for Statisticians', and are reproduced by permission of the Biometrika Trustees.


Table A2: Probability Points of the Standard Normal Distribution

The entries in this table are values of U corresponding to the given values of α

α      U        α      U        α       U
0.500  0.0000   0.034  1.8250   0.015   2.1701
0.450  0.1257   0.032  1.8522   0.014   2.1973
0.400  0.2533   0.030  1.8808   0.013   2.2262
0.350  0.3853   0.029  1.8957   0.012   2.2571
0.300  0.5244   0.028  1.9110   0.011   2.2904
0.250  0.6745   0.027  1.9268   0.010   2.3263
0.200  0.8416   0.026  1.9431   0.009   2.3656
0.150  1.0364   0.025  1.9600   0.008   2.4089
0.100  1.2816   0.024  1.9774   0.007   2.4573
0.050  1.6449   0.023  1.9954   0.006   2.5121
0.048  1.6646   0.022  2.0141   0.005   2.5758
0.046  1.6849   0.021  2.0335   0.004   2.6521
0.044  1.7060   0.020  2.0537   0.003   2.7478
0.042  1.7279   0.019  2.0749   0.002   2.8782
0.040  1.7507   0.018  2.0969   0.001   3.0902
0.038  1.7744   0.017  2.1201   0.0005  3.2905
0.036  1.7991   0.016  2.1444
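The entries in Tables A1 and A2 can be regenerated from the complementary error function; a bisection search then inverts the tail area to give the probability points. A short Python sketch, assuming only the standard library (the iteration count and bracketing interval are arbitrary choices):

```python
import math

def normal_tail(u):
    """alpha = Pr(U > u), the quantity tabulated in Table A1."""
    return 0.5 * math.erfc(u / math.sqrt(2))

def normal_point(alpha, lo=0.0, hi=10.0):
    """U such that Pr(U > U) = alpha, the quantity tabulated in Table A2 (bisection)."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if normal_tail(mid) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(normal_tail(1.96), 4))    # 0.025
print(round(normal_point(0.050), 4))  # 1.6449
```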

Table A3: Probability Points of the χ² Distribution

The entries in this table are the values of χ²(k, α) corresponding to the given values of α

                          α
k     0.995  0.975  0.950  0.050  0.025  0.005
1     0.000  0.001  0.004  3.841  5.024  7.879
2     0.010  0.051  0.103  5.991  7.378  10.60
3     0.072  0.216  0.352  7.815  9.348  12.84
4     0.207  0.484  0.711  9.488  11.14  14.86
5     0.412  0.831  1.145  11.07  12.83  16.75
6     0.676  1.237  1.635  12.59  14.45  18.55
7     0.989  1.690  2.167  14.07  16.01  20.28
8     1.344  2.180  2.733  15.51  17.53  21.96
9     1.735  2.700  3.325  16.92  19.02  23.59
10    2.156  3.247  3.940  18.31  20.48  25.19
12    3.074  4.404  5.226  21.03  23.34  28.30
14    4.075  5.629  6.571  23.68  26.12  31.32
16    5.142  6.908  7.962  26.30  28.85  34.27
18    6.265  8.231  9.390  28.87  31.53  37.16
20    7.434  9.591  10.85  31.41  34.17  40.00
25    10.52  13.12  14.61  37.65  40.65  46.93
30    13.79  16.79  18.49  43.77  46.98  53.67
40    20.71  24.43  26.51  55.76  59.34  66.77
50    27.99  32.36  34.76  67.50  71.42  79.49

Table A4: Probability Points of the t Distribution

The entries in this table are values of t(k, α) corresponding to the given values of α

             α
k     0.05  0.025  0.01   0.005
1     6.31  12.70  31.80  63.70
2     2.92  4.30   6.97   9.93
3     2.35  3.18   4.54   5.84
4     2.13  2.78   3.75   4.60
5     2.02  2.57   3.37   4.03
6     1.94  2.45   3.14   3.71
7     1.90  2.37   3.00   3.50
8     1.86  2.31   2.90   3.36
9     1.83  2.26   2.82   3.25
10    1.81  2.23   2.76   3.17
12    1.78  2.18   2.68   3.06
14    1.76  2.15   2.62   2.98
16    1.75  2.12   2.58   2.92
18    1.73  2.10   2.55   2.88
20    1.73  2.09   2.53   2.85
22    1.72  2.07   2.51   2.82
24    1.71  2.06   2.49   2.80
26    1.71  2.06   2.48   2.78
28    1.70  2.05   2.47   2.76
30    1.70  2.04   2.46   2.75
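The tail areas behind Table A4 can be checked numerically by integrating the t density; the sketch below uses Simpson's rule and the standard-library gamma function. The truncation point and step count are arbitrary choices, not part of the table's construction.

```python
import math

def t_density(x, k):
    """Density of Student's t distribution with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1 + x * x / k) ** (-(k + 1) / 2)

def t_tail(t, k, upper=60.0, steps=20000):
    """Pr(T > t) by Simpson's rule on [t, upper]; steps must be even."""
    h = (upper - t) / steps
    total = t_density(t, k) + t_density(upper, k)
    for i in range(1, steps):
        total += t_density(t + i * h, k) * (4 if i % 2 else 2)
    return total * h / 3

# Table A4 gives t = 2.26 for k = 9 and alpha = 0.025
print(round(t_tail(2.26, 9), 3))  # 0.025
```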

Table A5: Probability Points of the F Distribution

The entries in this table are values of F(k1, k2, α) corresponding to the given values of α

α = 0.05

k2\k1  1     2     3     4     5     6     7     8     9     10    20    ∞
1      161   200   216   225   230   234   237   239   241   242   248   254
2      18.5  19.0  19.2  19.3  19.3  19.3  19.4  19.4  19.4  19.4  19.5  19.5
3      10.1  9.6   9.3   9.1   9.0   8.9   8.9   8.9   8.8   8.8   8.7   8.5
4      7.7   6.9   6.6   6.4   6.3   6.2   6.1   6.0   6.0   6.0   5.8   5.6
5      6.6   5.8   5.4   5.2   5.1   5.0   4.9   4.8   4.8   4.7   4.6   4.4
6      6.0   5.1   4.8   4.5   4.4   4.3   4.2   4.2   4.1   4.1   3.9   3.7
7      5.6   4.7   4.4   4.1   4.0   3.9   3.8   3.7   3.7   3.6   3.4   3.2
8      5.3   4.5   4.1   3.8   3.7   3.6   3.5   3.4   3.4   3.4   3.2   2.9
9      5.1   4.3   3.9   3.6   3.5   3.4   3.3   3.2   3.2   3.1   2.9   2.7
10     5.0   4.1   3.7   3.5   3.3   3.2   3.1   3.1   3.0   3.0   2.8   2.5
15     4.5   3.7   3.3   3.1   2.9   2.8   2.7   2.6   2.6   2.5   2.3   2.1
20     4.4   3.5   3.1   2.9   2.7   2.6   2.5   2.5   2.4   2.4   2.1   1.8
∞      3.8   3.0   2.6   2.4   2.2   2.1   2.0   1.9   1.9   1.8   1.6   1.0

Table A5 (continued)

α = 0.025

k2\k1  1     2     3     4     5     6     7     8     9     10    20    ∞
1      648   800   864   900   922   937   948   957   963   969   993   1018
2      38.5  39.0  39.2  39.3  39.3  39.3  39.4  39.4  39.4  39.4  39.5  39.5
3      17.4  16.0  15.4  15.1  14.9  14.7  14.6  14.5  14.5  14.4  14.2  13.9
4      12.2  10.7  10.0  9.6   9.4   9.2   9.1   9.0   8.9   8.8   8.6   8.3
5      10.0  8.4   7.8   7.4   7.2   7.0   6.9   6.8   6.7   6.6   6.3   6.0
6      8.8   7.3   6.6   6.2   6.0   5.8   5.7   5.6   5.5   5.5   5.2   4.9
7      8.1   6.5   5.9   5.5   5.3   5.1   5.0   4.9   4.8   4.8   4.5   4.1
8      7.6   6.1   5.4   5.1   4.8   4.7   4.5   4.4   4.4   4.3   4.0   3.7
9      7.2   5.7   5.1   4.7   4.5   4.3   4.2   4.1   4.0   4.0   3.7   3.3
10     6.9   5.5   4.8   4.5   4.2   4.1   4.0   3.9   3.8   3.7   3.4   3.1
15     6.2   4.8   4.2   3.8   3.6   3.4   3.3   3.2   3.1   3.1   2.8   2.4
20     5.9   4.5   3.9   3.5   3.3   3.1   3.0   2.9   2.8   2.8   2.5   2.1
∞      5.0   3.7   3.1   2.8   2.6   2.4   2.3   2.2   2.1   2.1   1.7   1.0

Table A5 (continued)

α = 0.01

k2\k1  1     2     3     4     5     6     7     8     9     10    20    ∞
1      4052  5000  5403  5625  5764  5859  5928  5982  6022  6056  6209  6366
2      98.5  99.0  99.2  99.3  99.3  99.3  99.4  99.4  99.4  99.4  99.5  99.5
3      34.1  30.8  29.5  28.7  28.2  27.9  27.7  27.5  27.4  27.2  26.7  26.1
4      21.2  18.0  16.7  16.0  15.5  15.2  15.0  14.8  14.7  14.6  14.0  13.5
5      16.3  13.3  12.1  11.4  11.0  10.7  10.5  10.3  10.2  10.1  9.6   9.0
6      13.8  10.9  9.8   9.2   8.8   8.5   8.3   8.1   8.0   7.9   7.4   6.9
7      12.3  9.6   8.5   7.9   7.5   7.2   7.0   6.8   6.7   6.6   6.2   5.7
8      11.3  8.7   7.6   7.0   6.6   6.4   6.2   6.0   5.9   5.8   5.4   4.9
9      10.6  8.0   7.0   6.4   6.1   5.8   5.6   5.5   5.4   5.3   4.8   4.3
10     10.0  7.6   6.6   6.0   5.6   5.4   5.2   5.1   4.9   4.8   4.4   3.9
15     8.7   6.4   5.4   4.9   4.6   4.3   4.1   4.0   3.9   3.8   3.4   2.9
20     8.1   5.9   4.9   4.4   4.1   3.9   3.7   3.6   3.5   3.4   2.9   2.4
∞      6.6   4.6   3.8   3.3   3.0   2.8   2.6   2.5   2.4   2.3   1.9   1.0



The Textile Institute
Manual of Textile Technology
Quality Control and Assessment Series

ISBN 0 900739 52 5

Practical Statistics for the Textile Industry: Part II

G A V Leaf MSc DSc CText FTI FSS

Foreword by Professor C S Whewell BSc PhD CChem FRSC CText FTI(Hon) CCol FSDC

The Textile Institute
10 Blackfriars Street
Manchester M3 5DR
England

C S Whewell BSc PhD CChem FRSC CText FTI(Hon) CCol FSDC - Coordinator
P W Harrison BSc CText FTI MIInfSc
G H Crawshaw BSc PhD CChem FRSC CText FTI

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the copyright owners.

Foreword

The thorough treatment of quality-control methods given in previous issues in this series of publications emphasizes the importance of statistical techniques in assessing the validity of conclusions drawn from technological data. It is, however, only too easy to regard statistics as a collection of formulae used to satisfy the academically oriented practitioners. Nothing could be further from the truth: the wrong use of statistical techniques can be seriously misleading, and to minimize this possibility, the fundamental principles underlying statistics should be clearly understood. It was to provide this background knowledge that Dr. G.A.V. Leaf, a distinguished statistician with a wide experience of using statistical methods, was invited to write the present work. No previous knowledge of the subject is required, and it is amply illustrated by examples taken from textile processing. It is a useful addition to other well-established works, and students, research workers, technologists, and consumers will find it useful. For those new to the subject, it will be a fascinating introduction, while others, who have perhaps become over-familiar with statistical techniques, will be reminded of the important fundamentals of the subject. The carefully selected examples given in the text might even illustrate new applications of the discipline. The publication deserves a wide and diverse readership.

Preface

The methods of statistical analysis described in this monograph build on the fundamental ideas developed in Part I. The chapter numbers therefore continue the sequence started in the earlier volume. When reference is made to a section in Part I, it is done in the form (I; 3.7), meaning Section 3.7 of Part I.

The general approach is that adopted previously, namely, that the basic argument is explained without too much mathematical detail and is then illustrated by examples drawn from textiles. Such examples, if they are to be realistic, must necessarily be taken from the author's experience and will therefore tend to reflect his interests. However, I hope they are sufficiently varied to appeal to readers with different backgrounds.

I should like to take this opportunity of expressing my thanks to Mr. P.W. Harrison of the Textile Institute and to Mr. K. Douglas and his staff at Zellweger Uster for their very considerable and painstaking efforts, which were necessary to bring these books into being.

The editor would like to acknowledge the assistance given to him in the preparation of the author's manuscript by Mrs. R. Ehrensperger and Mrs. F. Grebien who wrote the text, Miss V. Hoffmann who prepared the figures and Mr. E. Hirzel who made the layout.

9. Analysis of Discrete and Ranking Data
9.1 Introduction
9.2 An Application of the χ² Distribution
9.3 The General Procedure
9.4 Contingency Tables
9.5 The 2 × 2 Contingency Table
9.6 Subjective Tests: Ranks
9.7 Rank Correlation
9.8 Significance Test for R
9.9 Tied Ranks
9.10 The Coefficient of Concordance
9.11 Test of Significance for W
9.12 Coefficient of Concordance when Tied Ranks are Allowed

10.1 Quality and Quality Control
10.2 Sampling Inspection
10.3 Acceptance Sampling: Attributes and Variables
10.4 The Operating Characteristic
10.5 The Choice of n and c
10.6 The Military Standard 105D Sampling Scheme
10.7 Rectifying Inspection
10.8 The Dodge-Romig Tables
10.9 Acceptance Sampling by Variables: Assurance about a Minimum (or Maximum) Value
10.10 Acceptance Sampling by Variables: Assurance about the Mean Value
10.10.1 The Producer's-risk Condition (10.18)
10.10.2 The Producer's-risk Condition (10.19)

11.1 Random and Assignable Variation
11.2 The General Principle of Control Charts
11.3 Action and Warning Limits
11.4 The Interpretation of Control Charts
11.5 Control Charts for Defectives
11.6 Control Charts for Defects
11.7 Control Charts for Averages and Ranges
11.7.1 Control Charts for Averages
11.7.2 Control Chart for Ranges
11.7.3 Discussion of the Charts
11.8 Average Run Length
11.9 Choice of Sample Size
11.10 Tolerances: Modified Control Limits for Averages
11.10.1 Modified Control Limits for Averages
11.10.2 When to Use Modified Control Limits
11.11 Cusum Charts
11.11.1 The V Mask
11.11.2 Advantages and Disadvantages of Cusum Charts

12. The Design of Experiments
12.1 Random Variation in Experiments
12.1.1 Randomization
12.2 A Simple Comparative Experiment
12.2.1 A Model of the Data
12.2.2 The Test of Significance
12.2.3 The ANOVA Table
12.2.4 Multiple Comparisons
12.2.4.1 Case (a) Comparison with a Control
12.2.4.2 Case (b) Global Comparisons: Tukey's Procedure
12.3 Randomized-block Experiments
12.3.1 The Experimental Model
12.3.2 The ANOVA Table
12.3.3 The Tests of Significance
12.3.3.1 (a) Differences among Treatments
12.3.3.2 (b) Differences among Blocks
12.3.4 Discussion of the Treatment Means in Example 12.2
12.4 Two-way Classification with Replications
12.4.1 The Model of the Data
12.4.2 The ANOVA Table
12.4.3 The Tests of Significance
12.4.4 Further Discussion of Example 12.3
12.5 2ⁿ Factorial Designs
12.5.1 The +, − Notation
12.5.2 Yates's Algorithm
12.5.3 Fractional Replication
12.6 Estimation of Variances of Random Effects
12.6.1 The General Hierarchical Classification
12.6.2 Standard Error of a Delivery Mean
12.6.3 The Economics of Routine Testing

13.1 Relations between Variables
13.2 Fitting a Straight Line
13.3 Variation about the Regression Line
13.4 Confidence Limits
13.5 Analysis of Variance of Regression
13.6 The Correlation Coefficient
13.6.1 The Interpretation of r
13.6.2 The Significance of r
13.7 Regression through the Origin
13.8 Multiple Regression
13.8.1 The Case of Two Independent Variables
13.8.2 The Normal Equations
13.8.3 Analysis of Variance
13.8.4 Confidence Limits

SOLUTIONS TO PROBLEMS
TABLE A1
TABLE A2
TABLE A3
TABLE A4
TABLE A5
TABLE A6

9. Analysis of Diskrctc and RJl1king Data

9. Ana~ysisof Diskrete and Ranl<ing Dat?!

In Chapter 1 of Part I, it was pointed out that numerical data are generated by the processes of counting, measuring against a continuous scale, or ranking. Most of the methods of analysis described in this book are concerned with measured variables. In this chapter, we consider some of the techniques appropriate for the analysis of other kinds of data.

One of the most widely used analytical methods for dealing with frequency data, i.e., data obtained by counting, is based on the χ² distribution. To see how this arises, consider the following example.

Over a period of a month, the numbers of stoppages on a set of five similar machines performing similar tasks were counted, with the results shown in Table 9.1.

Examination of these data suggests that machines 2 and 3 apparently stop more frequently than the other machines. The question that arises is whether the data really imply that there are significant differences in the stoppage rates of the five machines or whether the differences in the observed stoppage rates can be explained purely by chance.

Questions of this kind are answered by tests of significance. The appropriate null hypothesis in this case is

Ho: there are no differences in stoppage rates among the machines;

and the alternative hypothesis is

H1: there are differences in stoppage rate among the machines.

We begin, as always, by assuming the null hypothesis is true. If it is, we should expect that, on average, each machine would stop an equal number of times, i.e., the total of 149 stoppages would be equally divided among the five machines. Thus, each machine would have been expected to stop on 149/5 = 29.8 occasions. What is now of primary interest is how the observed frequencies of stoppages shown in Table 9.1 deviate from the expected frequencies just calculated. These deviations are shown in Table 9.2; obviously, the greater the deviations, the less credence we would give to the null hypothesis. We therefore need to develop a method for concisely summarizing the magnitudes of the deviations that will form the basis of a test of significance.

To do this, we appeal to some of the statistical theory described in Chapters 5 and 6. Imagine any one of the machines being observed over many months, the number of stoppages in each month being counted. These numbers would vary from month to month and, from (I; 5.5), we would expect that, if the stoppages were occurring at random, this variation could be described by a Poisson distribution. Consequently, the frequencies fo actually observed during the single month of the experiment can be thought of as values of a Poisson variable. Our best estimate of the mean of this distribution (if Ho is true) is the expected frequency fe, and, because for a Poisson variable the standard deviation is equal to the square root of the mean, an estimate of its standard deviation is √fe, i.e.,

    μ = fe ,   σ = √fe .                                    (9.1)
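Equation (9.1) is easy to check empirically. The sketch below is not from the text: the random seed, the sample size, and the Knuth sampling routine are illustrative choices. It simulates many months of stoppages for one machine with mean 29.8 and confirms that the standard deviation of the counts is close to the square root of the mean.

```python
# Empirical check of Equation (9.1): for Poisson counts, the standard
# deviation equals the square root of the mean. Standard library only.
import random
from math import exp, sqrt

random.seed(1)

def poisson(mean):
    """Draw one Poisson variate (Knuth's multiplication method)."""
    limit = exp(-mean)
    k, prod = 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

fe = 29.8  # expected stoppages per month under Ho
months = [poisson(fe) for _ in range(20000)]

mean = sum(months) / len(months)
var = sum((x - mean) ** 2 for x in months) / len(months)

print(round(mean, 1))       # close to 29.8
print(round(sqrt(var), 1))  # close to sqrt(29.8) = 5.46
```

Knuth's method multiplies uniform variates until the product falls below e^(-mean); it is slow for large means but adequate for a quick check without external libraries.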

Machine No.              1     2     3     4     5    Total
Number of stoppages     27    36    38    26    22      149

Table 9.2 Observed and Expected Frequencies for Example 9.1

Machine No.                    1       2       3       4       5    Totals
Observed frequencies fo       27      36      38      26      22      149
Expected frequencies fe     29.8    29.8    29.8    29.8    29.8    149.0
Deviations (fo - fe)        -2.8     6.2     8.2    -3.8    -7.8      0.0
u² = (fo - fe)²/fe         0.263   1.290   2.256   0.485   2.042    6.336 = χ²o
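The entries in the last two rows of Table 9.2, and the resulting significance level, can be reproduced in a few lines. This is a sketch: the closed-form tail probability used below is a convenience that is exact only for 4 degrees of freedom, where the text instead reads the value from Table A3.

```python
# Reproducing Table 9.2: expected frequencies under Ho, deviations,
# and the chi-square statistic of Formula (9.2).
from math import exp

observed = [27, 36, 38, 26, 22]             # stoppages per machine (Table 9.1)
total = sum(observed)                       # 149
fe = total / len(observed)                  # 29.8 stoppages per machine under Ho

deviations = [fo - fe for fo in observed]   # -2.8, 6.2, 8.2, -3.8, -7.8; sum to 0
chi2 = sum(d * d / fe for d in deviations)  # sum of (fo - fe)^2 / fe

# With 4 degrees of freedom, the chi-square tail probability has the
# closed form P(X > x) = exp(-x/2) * (1 + x/2).
p = exp(-chi2 / 2) * (1 + chi2 / 2)

print(round(chi2, 3))  # 6.336
print(round(p, 2))     # 0.18 -- above 0.05, so Ho is not rejected
```

Note how the deviations sum to zero by construction, which is exactly why only four of them are independent and the test uses 4 degrees of freedom.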

Further, we saw in (I; 5.5) that, provided that the mean is large enough, the Normal distribution can be used as an approximation to the Poisson distribution. Thus, under these circumstances, the observed frequencies fo can be regarded as values of a Normal variable with mean and standard deviation given by Equations (9.1). Using the linear transformation (I; 5.6, Equation 5.16), we therefore see that the quantities

    u = (fo - fe)/√fe

are values of a standard normal variable with zero mean and unit standard deviation, and values of

    u² = (fo - fe)²/fe

are values of the square of such a variable.

Now in (I; 6.5) the sum of squares of independent standard normal variables was seen to have a χ² distribution. Hence the sum of the entries in the last row of Table 9.2 will tend to have such a distribution, and this fact forms the basis of the test of significance. Before we can proceed, however, we need to know how many degrees of freedom are associated with our value of χ². It was emphasized above that χ² is the sum of squares of independent standard normal variables. The values in this experiment are not, in fact, all independent. The reason for this is that, in calculating the expected frequencies, we made sure that their sum was equal to the total observed stoppages. Consequently, the deviations (fo - fe) always add to zero. Knowing this, we find that only four of the deviations are independent; for, if four of them are known, the fifth is determined, since the sum of the deviations must be zero.

For this experiment, therefore, the observed value of χ² is

    χ²o = 6.336

with 4 degrees of freedom, and this can be used to test the null hypothesis.

The situation is shown in Fig. 9.1. If the null hypothesis is true and if the experiment were repeated many times, the values of χ² thus obtained would vary according to the distribution shown. The value of χ² observed in the experiment actually carried out was χ²o = 6.336, and this is also indicated on the diagram. Now, large deviations from the expected frequencies are reflected in large values of χ²o; hence the significance level of the experiment is the probability of finding χ² values > χ²o, i.e., P(χ² > χ²o), as shown in Fig. 9.1. Table A3 then shows, for 4 degrees of freedom, that this probability is well above 0.05 (it is, in fact, about 0.18). Using the usual criterion that significance levels greater than 0.05 (5%) are not enough to reject the null hypothesis, we conclude that the evidence contained in the data is not sufficient to indicate real differences in stopping rate among the machines. ***

Although the procedure for calculating χ²o has been developed with reference to a specific example, the technique is, in fact, quite general and can be used for analysing data arising in a number of apparently different situations. What characterizes them all, however, is that a set of expected frequencies fe can be calculated, based on the assumption that a certain null hypothesis is true, and these have to be compared with some observed frequencies fo to provide a test of the hypothesis. Thus we always calculate

    χ² = Σ (fo - fe)²/fe                                    (9.2)

and compare the result with values of χ² tabulated in Table A3. Some further examples will make the procedure clear.

Before discussing them, however, it must be noted that Formula (9.2) was derived by using the assumption that the Poisson distribution can be approximated by the normal distribution. This requires that the expected frequencies should be reasonably large. A general rule is that all the expected frequencies should be greater than 5, though there is some evidence to suggest that in some applications this is rather conservative.

So far as the number of degrees of freedom is concerned, the only general rule that can be stated is that it is equal to the number of independent deviations, all totals being regarded as fixed. Again, some further examples will clarify this.

A company has four factories. Over a period of time, the numbers of absences from work were recorded at each factory with the results shown in Table 9.3. The numbers of absences per employee for each factory are shown in the last row of the table.

Factory                                 A        B        C        D    Totals

Average number of employees
during period                         212      451      109      298     1070

Number of absences
during period                          41       72       27       34      174

Absences per employee              0.1934   0.1596   0.2477   0.1141   0.1626
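Although the worked solution lies beyond this excerpt, the same χ² procedure sketches out as follows for Table 9.3. The form of the expected frequencies used here (the overall absence rate times each factory's number of employees) is an assumption for illustration, not taken from the text.

```python
# Chi-square test sketch for Table 9.3. Assumption: under Ho "the absence
# rate is the same at every factory", the expected absences fe at each
# factory are proportional to its number of employees.
employees = {"A": 212, "B": 451, "C": 109, "D": 298}
absences  = {"A": 41,  "B": 72,  "C": 27,  "D": 34}

rate = sum(absences.values()) / sum(employees.values())  # 174/1070 overall

chi2 = 0.0
for f in employees:
    fe = rate * employees[f]   # expected absences at factory f
    fo = absences[f]           # observed absences at factory f
    chi2 += (fo - fe) ** 2 / fe

# The four deviations sum to zero (totals fixed), so there are
# 3 degrees of freedom; compare chi2 with Table A3.
print(round(chi2, 2))  # 10.43
```

With 3 degrees of freedom, the 5% point of χ² is 7.81, so a value near 10.4 would indicate real differences in absence rates among the factories.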

Note: The symbol *** is used throughout the text to denote the end of an example. In some instances, the example is continued later.

