ANSWERS TO SELECTED EXERCISES 579
(f) There is no consistent trend across months 3, 6, 9, and 12 in the log odds
of positive cultures for either treatment group, so the model with post
versus baseline is preferred. Group A log odds across months: –1.40,
–1.33, –1.22, –1.40, –1.12. Group B log odds across months: –0.74,
–0.86, –0.77, –0.71, –0.80.
(g) The treatment by post interaction can be removed, as can oral candidiasis at
baseline and vaginal candidiasis at baseline. Post is retained as important
to the design of the trial (it indicates visits post‐randomization).
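The log odds listed in part (f) can be easier to compare on the probability scale. The following is a minimal sketch that applies the inverse logit to the values quoted above; the helper name `inv_logit` is an illustrative assumption and is not taken from the text.

```python
import math

def inv_logit(log_odds: float) -> float:
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Log odds of a positive culture, as listed in part (f).
group_a = [-1.40, -1.33, -1.22, -1.40, -1.12]
group_b = [-0.74, -0.86, -0.77, -0.71, -0.80]

prob_a = [round(inv_logit(x), 3) for x in group_a]
prob_b = [round(inv_logit(x), 3) for x in group_b]
print(prob_a)
print(prob_b)
```

On this scale each group stays within a narrow band of probabilities, which is in line with the absence of a consistent trend noted in (f).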
12.6 (a) and (b) Sample variability and skewness in CFU are not quite constant
across groups; the Placebo group has consistently slightly lower variability
and the No Drug group has consistently slightly higher skewness.
Within group, the variability and skewness are approximately constant
across weeks. The Placebo group has consistently higher means and medians,
although, especially at baseline, they are not much different from those of
the two other groups.
(g) For sex, z = 2.27 and p = 0.13, so we drop it from the model. For DMFT
teeth, z = –0.02 and p = 0.89, so we drop it from the model.
(h) Both terms in the treatment by smoker by weeks interaction are
strongly statistically nonsignificant (z = –0.02 and p = 0.89 for whether
smoker by weeks differs for drug vs. no drug, z = –0.59 and p = 0.44
for whether smoker by weeks differs for drug vs. placebo), so we drop
the three-way interaction from the model. The weeks by smoker interaction
is also strongly nonsignificant (z = 0.11 and p = 0.74), so we
drop it as well. These model comparisons could instead be done using
QIC; QIC is given automatically in SAS using PROC GENMOD and
can be called in R using the QIC function in the MESS package.
(i) For a final model that includes main effects for treatment, weeks, and
smoker, and interactions for treatment by weeks and treatment by
smoker, examining treatment group differences overall (averaged over
weeks and over smoking blocks) may be misleading. At week 3 in the
non-smoker block, using Tukey adjustment for multiple comparisons,
drug was no different from no drug (difference = –0.06, z = –0.29, p =
0.96) but strongly significantly different from placebo (difference =
–0.55, z = –3.14, p = 0.0048), while no drug vs. placebo had a similar
effect size (difference = –0.48) but a larger significance level (z = –2.01,
p = 0.11). At week 3 in the smoker block, some effect sizes were smaller;
using Tukey adjustment for multiple comparisons, drug was no different
from no drug (difference = 0.29, z = 0.85, p = 0.67) and no different from
placebo (difference = 0.18, z = 0.59, p = 0.83), while no drug was no
different from placebo (difference = –0.11, z = –0.36, p = 0.93).
Chapter 13
13.8 Log‐rank test: p = 0.0896; generalized Wilcoxon test: p = 0.1590.
13.10 95% confidence interval for odds ratio: (1.997; 13.542); McNemar’s
chi‐square: χ² = 14.226; p value = 0.00016.
13.11 McNemar’s chi‐square: χ² = 0.077; p value = 0.78140.
13.12 95% confidence interval for odds ratio: (1.126; 5.309); McNemar’s
chi‐square: χ² = 5.452; p value = 0.02122.
13.13 For men: McNemar’s chi‐square, χ² = 13.394; p value = 0.00025. For
women: McNemar’s chi‐square, χ² = 0.439; p value = 0.50761.
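A chi‐square statistic with 1 degree of freedom can be turned into a p value with the standard library alone. The sketch below assumes the McNemar statistics above are referred to a chi‐square distribution with 1 degree of freedom; the function name `chi2_1df_pvalue` is illustrative.

```python
import math

def chi2_1df_pvalue(chi2: float) -> float:
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    # With 1 df, P(X > x) equals erfc(sqrt(x / 2)), because X is the
    # square of a standard normal variable.
    return math.erfc(math.sqrt(chi2 / 2.0))

for stat in (14.226, 0.077, 13.394, 0.439):
    print(stat, round(chi2_1df_pvalue(stat), 5))
```

The same function works for any statistic that is the square of a z statistic, since the two tests are then equivalent.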
Chapter 14
14.2 $[0.9^3 + 3(0.1)(0.9)^2(0.9)^3]\{0.2^3 + 3(0.2)^2(0.8) + 3(0.2)(0.8)^2[1 - (0.8)^3]\} = 0.264$.
14.3 At each new dose level, enroll three patients; if no patient has DLT, the trial
continues with a new cohort at the next higher dose; if two or three experience
DLT, the trial is stopped. If one experiences DLT, a new cohort of two
patients is enrolled at the same dose, escalating to the next‐higher dose only if no
DLT is observed. The new design makes escalation slightly easier; the resulting
MTD would have a slightly higher expected toxicity rate.
14.4 $[0.6^3 + 3(0.4)(0.6)^2(0.6)^3]\{0.5^3 + 3(0.5)^3 + 3(0.5)^3[1 - (0.5)^3]\} = 0.256$.
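The products in 14.2 and 14.4 can be checked numerically. The sketch below reads the bracketed factor as the chance of escalating past the first dose and the braced factor as the chance of not escalating past the second dose in a standard 3+3 cohort scheme. That reading, and the function names, are assumptions, since the exercise statements are not reproduced here.

```python
def p_escalate(p: float) -> float:
    """Chance of escalating past a dose whose DLT rate is p."""
    q = 1.0 - p
    # 0 of 3 patients with DLT, or exactly 1 of 3 followed by 0 of 3
    # in the additional cohort at the same dose.
    return q**3 + 3 * p * q**2 * q**3

def p_stop(p: float) -> float:
    """Chance of not escalating past a dose whose DLT rate is p."""
    q = 1.0 - p
    # 3 DLTs, 2 DLTs, or 1 DLT followed by at least one DLT among 3 more.
    return p**3 + 3 * p**2 * q + 3 * p * q**2 * (1 - q**3)

ans_14_2 = p_escalate(0.1) * p_stop(0.2)
ans_14_4 = p_escalate(0.4) * p_stop(0.5)
print(round(ans_14_2, 3), round(ans_14_4, 3))
```

The two functions are complements of each other at any dose, which gives a quick sanity check on the terms.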
14.5 $z_{1-\beta} = 0.364$, corresponding to a power of 64%.
14.6 $z_{1-\beta} = 0.927$, corresponding to a power of 82%.
14.7 $z_{1-\beta} = 0.690$, corresponding to a power of 75%.
14.8 $z_{1-\beta} = 0.551$, corresponding to a power of 71%.
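The step from the z value to the stated power in 14.5 to 14.8 is a lookup of the standard normal cumulative distribution function. A minimal sketch using Python's standard library follows; the function name `power_from_z` is illustrative.

```python
from statistics import NormalDist

def power_from_z(z: float) -> float:
    """Power implied by a standard normal quantile z."""
    return NormalDist().cdf(z)

for z in (0.364, 0.927, 0.690, 0.551):
    print(z, round(power_from_z(z), 2))
```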
14.9 d = 42 events and we need
$N = \dfrac{2(42)}{2 - 0.5 - 0.794} \approx 120$ subjects,
or 60 subjects in each group.
14.10 $n = \dfrac{(1.96)^2(0.9)(0.1)}{(0.05)^2} \approx 139$ subjects.
If we do not use the 90% figure, we would need
$n_{\max} = \dfrac{(1.96)^2(0.25)}{(0.05)^2} \approx 385$ subjects.
14.11 $n_{\max} = \dfrac{(1.96)^2(0.25)}{(0.01)^2} \approx 99$ subjects.
14.12 $n_{\max} = \dfrac{(1.96)^2(0.25)}{(0.15)^2} \approx 43$ subjects.
14.13 (a) With 95% confidence, we need
$n_{\max} = \dfrac{(1.96)^2(0.25)}{(0.01)^2} = 9604$ subjects.
With 99% confidence, we need
$n_{\max} = \dfrac{(2.58)^2(0.25)}{(0.01)^2} = 16{,}641$ subjects.
(b) With 95% confidence, we need
$n_{\max} = \dfrac{(1.96)^2(0.08)(0.92)}{(0.01)^2} \approx 2827$ subjects.
With 99% confidence, we need
$n_{\max} = \dfrac{(2.58)^2(0.08)(0.92)}{(0.01)^2} \approx 4900$ subjects.
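The sample sizes for estimating a proportion in answers such as 14.10, 14.12 and 14.13(a) all follow one pattern: a squared normal quantile times p(1 − p), divided by the squared margin of error. The sketch below assumes that pattern and rounds up to the next whole subject; the function name, parameter names and small rounding tolerance are illustrative choices.

```python
import math

def n_for_proportion(z: float, d: float, p: float = 0.5) -> int:
    """Subjects needed to estimate a proportion p to within margin d."""
    n = z**2 * p * (1.0 - p) / d**2
    # Subtract a tiny tolerance so floating-point noise on a value that
    # is exactly an integer does not push the result up by one.
    return math.ceil(n - 1e-9)

print(n_for_proportion(1.96, 0.05, p=0.9))  # 14.10, using the 90% figure
print(n_for_proportion(1.96, 0.05))         # 14.10, conservative p = 0.5
print(n_for_proportion(2.58, 0.01))         # 14.13(a), 99% confidence
```

Leaving `p` at its default of 0.5 gives the conservative maximum, since p(1 − p) is largest there.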
14.14 $n = \dfrac{(1.96)^2(1)}{(0.5)^2} \approx 16$ subjects.
14.15 $N = \dfrac{4(1.96)^2(400)}{(10)^2} \approx 62$ or 31 per group.
14.16 $n(e) = \dfrac{\ln 0.9 - \ln 0.05 + e(\ln 0.8 - \ln 0.95 - \ln 0.2 + \ln 0.05)}{\ln 0.8 - \ln 0.95}$.
$n(1) = -8$, $n(2) = 2$, $n(3) = 11$, $n(4) = 20$, $n(5) = 29$, and so on.
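The boundary values in 14.16 can be reproduced numerically. The sketch below assumes the boundary has the form n(e) = [ln 0.9 − ln 0.05 + e(ln 0.8 − ln 0.95 − ln 0.2 + ln 0.05)] / (ln 0.8 − ln 0.95), with the signs chosen so that the listed values come out. It also assumes the listed integers are the raw values rounded away from zero. Both are inferences from the printed numbers, and the function names are illustrative.

```python
import math

def n_boundary(e: int) -> float:
    """Raw (unrounded) boundary value n(e) for e events."""
    numerator = (math.log(0.9) - math.log(0.05)
                 + e * (math.log(0.8) - math.log(0.95)
                        - math.log(0.2) + math.log(0.05)))
    denominator = math.log(0.8) - math.log(0.95)
    return numerator / denominator

def round_away_from_zero(x: float) -> int:
    """Round to the next integer of larger magnitude (assumed convention)."""
    return int(math.copysign(math.ceil(abs(x)), x))

boundaries = [round_away_from_zero(n_boundary(e)) for e in range(1, 6)]
print(boundaries)
```

Successive raw values differ by a constant step of about 9.07, which is why the listed integers rise by 9 or 10 each time.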
14.17 $N = \dfrac{4(1.96 + 1.28)^2(2.28)^2}{(1)^2} \approx 220$ or 110 per group.
14.18 $N = \dfrac{4(1.96 + 1.65)^2(0.97)^2}{(1)^2} \approx 50$ or 25 per group.
14.19 $N = \dfrac{4(1.96 + 1.28)^2(10.3)^2}{(3)^2} \approx 496$ or 248 per group.
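The totals in 14.17 to 14.19 can be checked with a short function for comparing two means. The sketch assumes the form N = 4(z_alpha + z_beta)² σ² / d² and assumes the total is rounded up to an even number so that it splits equally between two groups. The rounding rule is an inference from the printed totals, and the names are illustrative.

```python
import math

def total_n_means(z_alpha: float, z_beta: float, sigma: float, d: float) -> int:
    """Total subjects for a two-group comparison of means."""
    n = 4 * (z_alpha + z_beta) ** 2 * sigma**2 / d**2
    n_up = math.ceil(n - 1e-9)   # next whole subject
    return n_up + (n_up % 2)     # assumed: round up to an even total

print(total_n_means(1.96, 1.28, 2.28, 1))   # 14.17
print(total_n_means(1.96, 1.65, 0.97, 1))   # 14.18
print(total_n_means(1.96, 1.28, 10.3, 3))   # 14.19
```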
14.20 $N = \dfrac{4(2.58 + 1.28)^2(0.075)(0.925)}{(0.05)^2} \approx 1654$ or 827 per group.
14.21 $N = \dfrac{4(2.58 + 1.28)^2(0.12)(0.88)}{(0.1)^2} \approx 630$ or 315 per group.
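The per-group figures in 14.20 and 14.21 (827 and 315) can be reproduced with the analogous function for two proportions. The sketch assumes the form N = 4(z_alpha + z_beta)² p̄(1 − p̄) / d², where p̄ is read as the average of the two group proportions and d as their difference. These readings and the names are assumptions.

```python
import math

def total_n_props(z_alpha: float, z_beta: float, p_bar: float, d: float) -> int:
    """Total subjects for a two-group comparison of proportions."""
    n = 4 * (z_alpha + z_beta) ** 2 * p_bar * (1.0 - p_bar) / d**2
    n_up = math.ceil(n - 1e-9)   # next whole subject
    return n_up + (n_up % 2)     # assumed: even total for equal groups

for p_bar, d in ((0.075, 0.05), (0.12, 0.1)):
    total = total_n_props(2.58, 1.28, p_bar, d)
    print(total, total // 2)
```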
14.22 $N = \dfrac{4(1.96 + 0.84)^2(0.275)(0.725)}{(0.15)^2} = 70$ or 35 per group.
14.23 $N = \dfrac{4(1.96 + 0.84)^2(0.3)(0.7)}{(0.2)^2} = 42$ or 21 per group.
14.24 d = 0.196, almost 20%.
14.25 $\dfrac{\ln 0.6}{\ln 0.7} = 1.432$.
$d = (1.96 + 0.84)^2 \left[\dfrac{1 + 1.432}{1 - 1.432}\right]^2 \approx 249$ events.
$N = \dfrac{2(249)}{2 - 0.6 - 0.7} \approx 710$ subjects or 355 per group.
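The two-step calculation in 14.25 (events first, then subjects) can be sketched as follows. The code assumes the forms d = (z_alpha + z_beta)² [(1 + r)/(1 − r)]² with r the ratio of log survival rates, and N = 2d / (2 − s1 − s2). It also assumes the unrounded d is carried into the second step, which is what reproduces 710 here. The function names are illustrative.

```python
import math

def events_needed(s1: float, s2: float,
                  z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Unrounded number of events for survival rates s1 and s2."""
    ratio = math.log(s1) / math.log(s2)
    return (z_alpha + z_beta) ** 2 * ((1 + ratio) / (1 - ratio)) ** 2

def subjects_needed(s1: float, s2: float) -> int:
    """Total subjects, carrying the unrounded event count forward."""
    d = events_needed(s1, s2)
    return math.ceil(2 * d / (2 - s1 - s2))

print(math.ceil(events_needed(0.6, 0.7)), subjects_needed(0.6, 0.7))
```

Applied to survival rates of 0.4 and 0.5, the same functions give 408 events and 742 subjects.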
14.26 $\dfrac{\ln 0.4}{\ln 0.5} = 1.322$.
$d = (1.96 + 0.84)^2 \left[\dfrac{1 + 1.322}{1 - 1.322}\right]^2 \approx 408$ events.
$N = \dfrac{2(408)}{2 - 0.4 - 0.5} \approx 742$ subjects or 371 per group.
14.27 $\dfrac{\ln 0.4}{\ln 0.6} = 1.794$.
$d = (1.96 + 0.84)^2 \left[\dfrac{1 + 1.794}{1 - 1.794}\right]^2 \approx 98$ events.
$N = \dfrac{2(98)}{2 - 0.4 - 0.6}$; 98 subjects or 49 per group.
14.28 π1 = 0.18: (a) N = 590, 245 cases and 245 controls; (b) N = 960, 192 cases
and 768 controls; (c) m = 66 discordant pairs and M = 271 case–control pairs.
14.29 π1 = 0.57: (a) N = 364, 182 cases and 182 controls; (b) N = 480, 120 cases
and 360 controls; (c) m = 81 discordant pairs and M = 158 case–control pairs.
14.30 $N = \dfrac{4}{(\ln 1.5)^2}(1.96 + 0.84)^2 \approx 192$; 96 cases and 96 controls.
Index
addition rule, 106, 126
adjacent values, 76
adjusted rate, 13, 15
agreement, 112
AIC. see Akaike Information Criterion
Akaike Information Criterion, 418
alpha, 198
analysis of variance (ANOVA), 253, 273
analysis of variance (ANOVA) table, 254, 276, 310
antibody response, 316
antilog. see exponentiation
area under the density curve, 117, 124
average. see mean
bar chart, 7
baseline hazard, 451, 456, 460
Bayesian Information Criterion, 418
Bayes’ theorem, 111
Bernoulli distribution, 354
  mean, 359
  variance, 360
better treatment trials, 505
BIC. see Bayesian Information Criterion
binary characteristic, 2
binary data. see variable, binary
binomial distribution, 126, 132
  mean, 128
  variance, 128
binomial probability, 126
bioassay, 330
blinded study
  double, 496
  triple, 496
block, 273, 280
  complete, 281
  fixed, 281
  random, 284
blocking factor. see block
Bonferroni’s type I error adjustment, 258
box plot, 76
case–control study, 2, 130, 199, 358, 439, 494
  matched, 518
  pair matched, 464
  unmatched, 516, 520
censoring, 442
censoring indicator, 443
central limit theorem, 115, 125, 146, 153, 182, 198, 236
chance, 103, 116
change rate, 10
chi‐square distribution, 125
Introductory Biostatistics, Second Edition. Chap T. Le and Lynn E. Eberly.
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc.
Companion website: www.wiley.com/go/Le/Biostatistics
chi‐square test, 212, 458, 470, 471
  difference in proportions, 203
  generalized Wilcoxon, 449
  likelihood ratio, 366, 368, 370, 394, 399, 405, 417, 458, 480, 482
  log rank, 449
  Mantel–Haenszel, 207
  McNemar’s, 200, 467
  Pearson’s, 212, 387
  score, 366, 448, 480
  Wald, 480
  Yates’ corrected, 215
clinical trial, 358, 494
  phase I, 497
  phase II, 499
  phases I‐IV, 495
clustered study, 409
coefficient of correlation, 300
coefficient of determination, 308, 310
coefficient of multiple determination, 321
coefficient of variation, 76
cohort‐escalation study, 497
cohort study, 14, 130, 385, 439, 494
common odds ratio, 26
comparisonwise error, 258
complete case analysis, 410
compound event, 126
concordance, 22, 84, 112, 219
  category‐specific, 112
  overall, 112
conditional independence, 206
conditional logistic regression, 472
confidence interval, 142, 146, 192
  for a correlation coefficient, 161
  for a difference of means, 152
  for a difference of proportions, 157
  effect of sample size on, 148
  for a hazard ratio, 453, 457
  for a mean, 148
  for an odds ratio, 157, 356, 363, 426, 467, 475, 479
  for a paired mean difference, 152
  for a proportion, 154
  for a regression coefficient, 415
  relation to p value, 191
  for a relative risk, 394
confidence level, 148
confounder, 3, 15, 25, 131, 151, 165, 199, 206, 238, 464
contingency table, 22, 197, 211
contingency table, ordered, 219
continuity correction, 210, 215
continuous data, 318
correlation, 78, 299
  autoregressive (AR), 425
  compound symmetry, 413, 425
  exchangeable (see correlation, compound symmetry)
  induced, 412
  inter-correlation, 300
  intra‐class (ICC), 413
  intra-correlation, 300
  Kendall’s tau (τ), 84
  non‐parametric, 83
  Pearson’s r, 80, 83
  Spearman’s rho (ρ), 83
  unstructured, 425
  working, 425, 428
correlation coefficient, 81, 307
covariate. see predictor variable
covariate, time dependent, 461
Cox model. see proportional hazards model
cross‐classified table. see contingency table
crossing survival curves, 449
crude rate, 13
cure model, 449
cut point, 183, 184, 187, 188
death rate
  adjusted, 13, 16
  crude, 13
  follow‐up, 14
death set, 452, 462
decision making rule. see cut point
degrees of freedom, 75, 125, 212, 236, 242, 254, 275, 310, 321, 387
density, 114
density curve, 115, 117, 124
dependent variable. see response variable
derived variable analysis, 410
deterministic relationship, 79
deviation, 73
diagnostic procedure, 5
diagnostics, 287, 302, 419
dichotomous characteristic. see variable, binary
dichotomous data. see variable, binary
difference of means, 152
difference of proportions, 202
direct method, 16
discordance, 21, 23, 84, 219, 466, 518
discrete data. see variable, discrete
disease registry, 5
dispersion, 73–78, 360, 402, 403. see also variance
distribution
  sampling, 147, 157, 160
  skewed, 63, 71, 73, 76
  symmetric, 63, 76
  unimodal, 61
DLT. see dose‐limiting toxicity
dose‐limiting toxicity, 497
dose‐response, 314, 317
dummy variable, 302, 318, 332, 351, 363, 393, 397, 399, 412, 443, 456, 463, 472, 477, 478
effect
  interaction, 274, 277
  main, 274, 277, 400
  modification, 4, 206–207, 274, 276–278, 281, 312, 319, 364, 368, 399, 457, 459, 479
  simple, 274, 277
estimate, 141
  interval, 146
  point, 145
estimator, unbiased, 143
event time, 443
exact statistic, 249, 250
exclusion criteria, 496
expected deaths, 16, 29, 447
expected frequencies, 212
expected value, 303, 320
experimental study. see randomized study
experimental unit, 280
experiment wise error, 258
explanatory variable. see predictor variable
exponential growth (decay), 315, 316
exponentiation, 158, 161
exposure, 2
factorial, 273, 274
factors. see factorial
false negative, 6, 186
false positive, 6, 107, 186
F distribution, 125
Fisher’s exact statistic, 217
Fisher’s transformation, 503
fixed effect, 280
force of mortality. see hazard function
frequency
  cumulative, 64
  cumulative relative, 64
  relative, 104
frequency distribution, 56, 114
frequency polygon, 60
F statistic, 255, 276–278, 284, 310, 322, 414, 425, 428
full model, 276
Gaussian distribution. see Normal distribution
GEE. see Generalized Estimating Equations
Generalized Estimating Equations, 425, 428
  model‐based standard error, 425
  robust (empirical) standard error, 425, 428
generalized odds, 22
general linear F test, 276
gold standard, 112
goodness of fit, 360, 402, 416, 462
goodness of fit statistic. see chi‐square test, Pearson’s
hazard function, 441
hazard ratio, 441, 442, 453, 514
hazard ratio, constant, 442
histogram, 60, 114
hypothesis, 181
  alternative, 181, 198
  composite, 193
  global null, 277, 322, 366, 394, 458, 479
  null, 181, 197, 235
  omnibus (see hypothesis, global null)
  simple, 193
hypothesis test, 181
incidence, 13
inclusion criteria, 496
independence null hypothesis, 212
independent events, 108, 126
independent trials, 126
independent variable. see predictor variable
indicator variable. see dummy variable
infant mortality rate, 129
interaction. see effect modification
intercept, 301, 302
inter‐correlation, 300, 313
inter‐quartile range, 89
interval density, 60
interval midpoint, 70
intra‐correlation, 300
IQR. see inter‐quartile range
Kaplan–Meier curve, 444
kappa, 113
  category‐specific, 114
  overall, 114
  problem with, 114
k samples, binary, 215
least squares estimation, 303, 320
likelihood function, 164, 354, 363, 391, 393, 469, 473, 476, 478
likelihood ratio test. see chi‐square test, likelihood ratio
linear association, 81
linearity, 302, 318, 364, 393, 416, 454, 457, 479
linear mixed model, 411
  conditional mean, 424
  marginal mean, 424
  population‐average intercept, 411
  random intercept, 411
  random slope, 415
  subject‐specific intercept, 411
line graph, 9
log hazard, 442
logistic regression, 352, 424
logistic regression, conditional, 472
lognormal distribution, 125
log odds, 357, 424
log rank test, 448
log rank test, stratified, 460
longitudinal study, 409
Mantel–Haenszel odds ratio, 26, 206
margin of error, 499, 501
matching, 131, 199, 472
  advantages and disadvantages, 464
  efficiency, 468
  multiple‐to‐one, 468
  one‐to‐one, 466
maximum likelihood estimation, 164, 355, 391, 393, 414
maximum tolerated dose, 495
McNemar’s chi‐square test, 200, 476
mean, 67, 69
  geometric, 71
  square, 254, 275
measurement scale, effect of, 454
median, 65, 72, 76
median effective dose, 314
median test. see Wilcoxon rank sum test
midrange, 89
misclassification, 5
missing data, 364, 394, 410
mode, 73
morbidity, 13
mortality, 13
MTD. see maximum tolerated dose
multi‐level model, 421
multiple comparisons adjustment, 258, 283
multiple testing, 369, 399
multiplication rule, 108, 126, 212
negative predictive value, 110
nested models, 417
normal curve, 114
normal distribution, 124, 290
  mean, 116
  variance, 116
observational study, 281
observed size. see p value
odds, 19, 157
odds, generalized, 21, 219
odds ratio, 18, 108, 131, 158, 355, 363, 426, 466, 516, 518, 520
  as approximation to relative risk, 19, 131, 359
  Mantel–Haenszel, 26, 207, 469
  matched pairs, 132, 165, 469, 475
omnibus hypothesis. see hypothesis, global null
one‐sample
  binary, 197
  continuous, 235
one‐sided test, 133, 188, 198, 202, 236, 242
one‐tailed. see one‐sided test
ordered contingency table, 21
outlier, 76
overdispersion, 359, 387, 402
paired‐sample, non‐parametric, 250
pair‐matched
  binary, 130, 199
  case‐control study, 130
  continuous, 237, 250
pairwise comparisons, 258
parallel lines assumption, 460
parameter, 116, 141, 143, 198, 236
partial likelihood function, 452, 456, 462
Pearson’s chi‐square test, 387
percentile, 64, 76
percentile score, 64
person‐years method, 14, 385
pie chart, 8
placebo, 496
Poisson distribution, 128, 384
  mean, 129, 384
  offset (see Poisson distribution, size)
  relation to binomial, 384, 391
  size, 389, 391, 427, 431
  variance, 129, 384
Poisson regression, 383, 427
polytomous data, 318
pooled variance, 242
population, 116, 145, 182
  average coefficient, 424
  target, 104, 305
positive predictive value, 110
power, 193, 509
predicted value, 287, 303, 320
prediction, 297, 299, 303
predictor variable, 274, 297, 351, 383, 450
prevalence, 5, 103, 111
primary endpoint. see primary outcome
primary outcome, 99
probability, 103, 104, 117
  conditional, 108
  joint, 106
  marginal, 106, 212
  unconditional, 109
  univariate, 107
probability density function, 124, 129, 164
product‐limit estimation, 444
proportion, 1, 77, 103, 104, 153, 198
proportional hazard assumption, 459, 462
proportional hazards, 514
proportional hazards model, 442, 451, 456
proportional hazards model, for matched pairs data, 475
prospective study, 2, 130, 358, 439, 494
p value, 189, 194, 199
p value, relation to confidence interval, 191
QIC. see quasi‐likelihood information criterion
quasi‐likelihood information criterion, 425–426, 428
random effect, 280, 410, 414, 421
randomization, 496
randomized complete block design, 419
randomized study, 281, 283
random sampling, 493
random selection, 103
range, 56, 73
rate, 10
ratio, 18
receiver operating characteristic (ROC) curve, 373, 374
reduced model, 276
reference group, 21, 28, 364, 399, 442
regression, 297, 299
  coefficient, 302, 318, 356, 363, 391, 393, 410, 425, 428, 453, 458
  logistic, 353, 363
  multiple, 318, 351, 362, 393, 410, 423, 456, 478
  Poisson, 389, 393
  polynomial, 319, 365, 396
  simple, 351, 389
  simple linear, 298, 301
  stepwise, 331, 332, 334, 351, 365, 369, 404, 459, 483
rejection region, 187–189, 198, 236, 242
relative frequency, 58, 114
relative hazard. see hazard ratio
relative risk, 18, 29, 359, 391, 466, 516, 518
repeated measures, 409
replication, 281, 283, 418
reproducibility, 112, 145
residual, 287, 416
residual, studentized, 287
response variable, 273, 297, 352, 383, 409, 473
retrospective study, 2, 130, 358, 439, 494
risk, 391
risk factor, 2, 298
risk function. see hazard function
risk ratio, 18
risk set, 452, 462
R‐square, 310, 321
sample, 104, 116, 145, 182
  paired, 151
  pair matched, 151, 165
  small, 148, 149
  two independent, 152
sample mean, 115
sample proportion, 115
sample size, 499, 501, 502, 505–507, 509, 512, 514, 516, 518, 521
sampling
  for a block design, 281
  random, 105
  repeated, 104, 116, 143, 147, 182, 187, 198, 236
  without replacement, 143
sampling distribution, 182, 198, 236
sampling frame, 105
sandwich estimator. see Generalized Estimating Equations, robust standard error
scaled deviance, 360, 402
scaled Pearson chi‐square, 360, 402
scatter diagram. see scatter plot
scatter plot, 55, 79, 302
score equation, 425, 428
score test. see chi‐square test, score
screening test, 5, 106, 186, 314, 372
seasonality, 333
sensitivity, 5, 107, 110, 186, 373
separation power, 373, 374
separator variable, 372, 374
sequential probability ratio test, 507
significance
  level, 188, 198
  practical, 190
  statistical, 180, 188, 190
  test, 179
significant difference, minimum clinical, 505, 510
Simon two‐stage phase II study, 504
size of test. see type I error
skewness, 290
slope, 301, 302
small sample test, 217
spaghetti plot, 411
specificity, 5, 110, 186, 373
specific rate, 13
staggered entry, 440
standard deviation, 74, 78, 146
standard error, 146, 298
  of a difference of means, 152
  of a difference of proportions, 157
  of a mean, 148
  of a proportion, 153
standardize, 122, 147, 198, 219, 236, 247, 250, 308, 366, 396, 448
standardized mortality ratio, 28
standardized rate, 13, 15
standard normal distribution, 116, 124, 198
standard normal score. see z statistic
standard population, 16
statistic, 1, 116, 143, 198, 236, 298
statistical association, 22, 78, 79, 106, 108, 113, 297, 299
  negative, 79, 81
  positive, 79, 81
statistical inference, 141, 145
statistical relationship. see statistical association
stem‐and‐leaf plot, 68
stratification, 459
stratification, for matched pairs, 475
subject specific coefficient, 424
sum of squares
  between (SSB), 254, 275
  within (SSW), 253, 275
  error, 275, 310, 321
  model, 275
  regression, 310, 321
  total (SST), 253, 275, 309, 321
survey study, 493, 499
survival curve, 441
survival data, 440
survival function, 441
survival rate. see survival function
survival time, 440
target population, 493
t distribution, 125, 149, 236, 242
  mean, 125
  variance, 125
test, 141
test for independence, 212, 307
test statistic, 182, 187
treatment factor, 280
t statistic, 308, 310, 323, 324, 326, 352, 414
  one‐sample, 236
  paired sample, 238, 476
  two‐sample, 242, 255
t test. see t statistic
Tukey’s type I error adjustment, 259
two‐sample
  binary, 202
  non‐parametric, 246
two‐sided test, 133, 188, 198, 202, 236, 242, 307
two‐tailed. see two‐sided test
two‐way or 2x2 table. see contingency table
type 1 analysis, 401
type 3 analysis, 400
type I error, 180, 182, 185, 188, 252, 258, 369, 399, 404, 459, 509
type II error, 180, 182, 185, 187, 509
unit of observation, 389
variable, 55, 124
  Bernoulli, 124
  binary, 2, 77, 126, 318, 352
  binomial, 352
  categorical, 197
  continuous, 55, 124, 297
  dichotomous (see variable, binary)
  discrete, 55, 77, 125
  point binomial (see variable, Bernoulli)
  polytomous, 2, 298, 311, 318, 363, 393, 456
variance, 73, 77
Wilcoxon generalized test, 448
Wilcoxon rank sum test, 246, 260
Wilcoxon signed rank test, 250, 476
Yates’ corrected chi‐square test, 215
z score. see z statistic
z statistic, 116, 122, 128, 129, 198, 200, 202, 207, 219, 247, 250, 367, 396, 425, 428, 448, 449, 458, 467, 480
z test. see z statistic