Regression With Multiple Categorical Independent Variables
Curvilinear Regression
Lecture 11
November 12, 2008
ERSH 8320
Lecture #11 - 11/12/2008
Slide 1 of 51
Today’s Lecture
- Multiple categorical independent variables (Chapter 12).
- Curvilinear regression (Chapter 13).
Outline: Overview; Multiple Categorical IVs; Variable Coding; Statistical Tests; ANOVA Example; Curvilinear Regression; Wrapping Up
Slide 2 of 51
Multiple Categorical Independent Variables
- As with continuous variables, regression analysis can be performed with multiple categorical independent variables.
- Multiple categorical variables can be used in:
  - Experimental research: factorial designs.
  - Observational research: multiple categorical variables observed.
- Using multiple variables adds the benefit of studying interaction effects between categorical variables.
Slide 3 of 51
Factorial Designs
- In ANOVA, (categorical) independent variables are often called factors.
- The category levels of a factor are called partitions.
- A factorial design is a design where all possible combinations of partitions are studied.
- Imagine you have two categorical IVs in your study: A (with two levels) and B (with three levels). A factorial design would consist of collecting observations in the following 2 × 3 grid:

      B1     B2     B3
A1    A1B1   A1B2   A1B3
A2    A2B1   A2B2   A2B3

Slide 4 of 51
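The crossing of partitions described above can be sketched in a few lines of Python. This is only an illustration: the level labels A1, A2, B1, B2, B3 come from the slide, while the variable names are assumptions made for the example.

```python
from itertools import product

# Level labels taken from the 2 x 3 example; the variable names are hypothetical.
levels_a = ["A1", "A2"]
levels_b = ["B1", "B2", "B3"]

# A factorial design crosses every level of A with every level of B.
cells = [a + b for a, b in product(levels_a, levels_b)]
print(cells)
```

Each of the six resulting cells is one group in which observations would be collected.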
Factorial Design Advantages
- Learning about interactions between categorical independent variables.
  - An interaction is the joint effect of two or more independent variables on the dependent variable.
- Factorial designs offer greater control and more sensitive statistical tests by controlling for multiple significant categorical independent variables.
- Factorial designs are efficient (from an experimental standpoint) because multiple treatment effects can be determined from a single set of observations.
- Because treatments are crossed, generalizations can be broader.
Slide 5 of 51
Types of Categorical Variables
- Pedhazur distinguishes between two types of categorical variables:
  - Manipulated.
  - Classificatory.
- Factorial designs can consist of manipulated variables (only) or of a combination of manipulated and classificatory variables.
Slide 6 of 51
Example Data Set
Neter (1996, p. 705).

“A consumer organization studied the effect of age of automobile owner on size of cash offer for a used car by utilizing 12 persons in each of three age groups (young, middle, elderly) who acted as the owner of a used car. A medium price, six-year-old car was selected for the experiment, and the ‘owners’ solicited cash offers for this car from 36 dealers selected at random from dealers in the region. Randomization was used in assigning the dealers to the ‘owners’.”

The offers (in hundreds of dollars) can be found on the class website.
Slide 7 of 51
Data Set Specifics
- This example illustrates an experimental research design.
- The experimenter was interested in determining the effect of gender and age on dealer offers for used cars.
- The experimenter could control what subjects were presented to each dealer, and dealers were randomly assigned to subjects.
- The experimenter could not randomly assign subjects to gender and/or age groups, however.
- Because the dealer offer was the unit of interest, the experimenter could manipulate and randomly assign each dealer to an experimental group.
Slide 8 of 51
Variable Coding
- Variable coding for multiple categorical variables proceeds just as for a single categorical variable.
- For each variable separately, a new set of columns is created using either a dummy, effect, or orthogonal coding scheme.
- Again, for each variable separately, this means that the number of new columns created is the number of category levels minus one.
  - For gender: one column.
  - For age: two columns.
Slide 9 of 51
Y   Age  Gender    I   A1   A2   G1
21  Y    M         1    1    0    1
23  Y    M         1    1    0    1
19  Y    M         1    1    0    1
22  Y    M         1    1    0    1
22  Y    M         1    1    0    1
23  Y    M         1    1    0    1
21  Y    F         1    1    0   -1
22  Y    F         1    1    0   -1
20  Y    F         1    1    0   -1
21  Y    F         1    1    0   -1
19  Y    F         1    1    0   -1
25  Y    F         1    1    0   -1
30  M    M         1    0    1    1
29  M    M         1    0    1    1
26  M    M         1    0    1    1
28  M    M         1    0    1    1
27  M    M         1    0    1    1
27  M    M         1    0    1    1
26  M    F         1    0    1   -1
29  M    F         1    0    1   -1
27  M    F         1    0    1   -1
28  M    F         1    0    1   -1
27  M    F         1    0    1   -1
29  M    F         1    0    1   -1
25  E    M         1   -1   -1    1
22  E    M         1   -1   -1    1
23  E    M         1   -1   -1    1
21  E    M         1   -1   -1    1
22  E    M         1   -1   -1    1
21  E    M         1   -1   -1    1
23  E    F         1   -1   -1   -1
19  E    F         1   -1   -1   -1
20  E    F         1   -1   -1   -1
21  E    F         1   -1   -1   -1
20  E    F         1   -1   -1   -1
20  E    F         1   -1   -1   -1
Slide 9-1
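The effect-coded columns in the table above can be generated with a small lookup, sketched here in Python. The dictionary and function names are assumptions made for illustration. The codes themselves (last level of each factor coded -1) are read off the table.

```python
# Effect codes per factor; the last level of each factor is coded -1.
# Names below are hypothetical; the code values mirror the table.
AGE_CODES = {"Y": (1, 0), "M": (0, 1), "E": (-1, -1)}   # columns A1, A2
GENDER_CODES = {"M": (1,), "F": (-1,)}                  # column G1

def effect_code_row(age, gender):
    """Return the design row (I, A1, A2, G1) for one observation."""
    return (1,) + AGE_CODES[age] + GENDER_CODES[gender]

print(effect_code_row("Y", "M"))
print(effect_code_row("E", "F"))
```

Age has three levels and so needs two columns. Gender has two levels and needs one.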
Interaction Coding
- To code variable interactions, create a set of new columns where the contents come from the multiplication of all combinations of columns for each categorical variable.
- For our 3 × 2 example there are:
  - Two columns for age group.
  - One column for gender.
  - 2 (age vectors) × 1 (gender vector) = 2 interaction vectors.
- For all examples (today) I will show effect coded columns, although most of what I state about the results (with the exception of the b coefficients) still applies for dummy coding as well.
Slide 10 of 51
Y   Age  Gender    I   A1   A2   G1  A1G1  A2G1
21  Y    M         1    1    0    1     1     0
23  Y    M         1    1    0    1     1     0
19  Y    M         1    1    0    1     1     0
22  Y    M         1    1    0    1     1     0
22  Y    M         1    1    0    1     1     0
23  Y    M         1    1    0    1     1     0
21  Y    F         1    1    0   -1    -1     0
22  Y    F         1    1    0   -1    -1     0
20  Y    F         1    1    0   -1    -1     0
21  Y    F         1    1    0   -1    -1     0
19  Y    F         1    1    0   -1    -1     0
25  Y    F         1    1    0   -1    -1     0
30  M    M         1    0    1    1     0     1
29  M    M         1    0    1    1     0     1
26  M    M         1    0    1    1     0     1
28  M    M         1    0    1    1     0     1
27  M    M         1    0    1    1     0     1
27  M    M         1    0    1    1     0     1
26  M    F         1    0    1   -1     0    -1
29  M    F         1    0    1   -1     0    -1
27  M    F         1    0    1   -1     0    -1
28  M    F         1    0    1   -1     0    -1
27  M    F         1    0    1   -1     0    -1
29  M    F         1    0    1   -1     0    -1
25  E    M         1   -1   -1    1    -1    -1
22  E    M         1   -1   -1    1    -1    -1
23  E    M         1   -1   -1    1    -1    -1
21  E    M         1   -1   -1    1    -1    -1
22  E    M         1   -1   -1    1    -1    -1
21  E    M         1   -1   -1    1    -1    -1
23  E    F         1   -1   -1   -1     1     1
19  E    F         1   -1   -1   -1     1     1
20  E    F         1   -1   -1   -1     1     1
21  E    F         1   -1   -1   -1     1     1
20  E    F         1   -1   -1   -1     1     1
20  E    F         1   -1   -1   -1     1     1
Slide 10-1
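The interaction columns A1G1 and A2G1 in the table above are elementwise products of the main-effect columns. A minimal Python sketch follows. The function name is hypothetical, and the inputs are the effect codes for one observation.

```python
def interaction_columns(age_cols, gender_cols):
    """Multiply every age column by every gender column (hypothetical helper)."""
    return tuple(a * g for a in age_cols for g in gender_cols)

# Main-effect codes (A1, A2) and (G1,) for three example cells.
young_female = interaction_columns((1, 0), (-1,))
middle_male = interaction_columns((0, 1), (1,))
elderly_female = interaction_columns((-1, -1), (-1,))
print(young_female, middle_male, elderly_female)
```

With two age columns and one gender column, the product always yields 2 × 1 = 2 interaction columns per row.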
The Fixed Effects Linear Model
- Recall the fixed effects linear model for a single independent variable:
  - Effect coding is built to estimate the fixed linear effects model.

  $Y_{ij} = \mu + \beta_i + \epsilon_{ij}$

  - $Y_{ij}$ is the value of the dependent variable of observation j in group/treatment/category i.
  - $\mu$ is the population (grand) mean.
  - $\beta_i$ is the effect of group/treatment/category i.
  - $\epsilon_{ij}$ is the error associated with observation j in group/treatment/category i.
  - $\sum_{i=1}^{g} \beta_i = 0$ is the identifiability constraint.
Slide 11 of 51
The Fixed Effects Linear Model

  $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$

- $Y_{ijk}$ is the value of the dependent variable of observation k in group/treatment/category ij.
- $\mu$ is the population (grand) mean.
- $\alpha_i$ is the effect of group/treatment/category i for the first categorical variable.
- $\beta_j$ is the effect of group/treatment/category j for the second categorical variable.
- $(\alpha\beta)_{ij}$ is the interaction effect for group ij.
- $\epsilon_{ijk}$ is the error associated with observation k in group/treatment/category ij.
- For a given factor, the sum of all effects must be equal to zero for model identifiability.
Slide 12 of 51
Example Estimation Via Regression
Slide 13 of 51
Example Estimation Via Regression

Effect              Estimate
Grand Mean*           23.556
Young*                -2.056
Middle*                4.194
Elderly               -2.138
Male*                  0.389
Female                -0.389
Young × Male*         -0.222
Young × Female         0.222
Middle × Male*        -0.306
Middle × Female        0.306
Elderly × Male         0.528
Elderly × Female      -0.528

* indicates this effect came directly from the GLM estimates
Slide 14 of 51
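The starred estimates above can be reproduced, at least approximately, by ordinary least squares on the effect-coded design matrix. The sketch below uses NumPy's `numpy.linalg.lstsq` rather than the SPSS procedure used in the lecture. The offers are copied from the coded data table, and all variable names are assumptions made for the example.

```python
import numpy as np

# Offers (hundreds of dollars) by (age, gender) cell, copied from the data table.
offers = {
    ("Y", "M"): [21, 23, 19, 22, 22, 23],
    ("Y", "F"): [21, 22, 20, 21, 19, 25],
    ("M", "M"): [30, 29, 26, 28, 27, 27],
    ("M", "F"): [26, 29, 27, 28, 27, 29],
    ("E", "M"): [25, 22, 23, 21, 22, 21],
    ("E", "F"): [23, 19, 20, 21, 20, 20],
}
age_codes = {"Y": (1, 0), "M": (0, 1), "E": (-1, -1)}  # effect codes A1, A2
gender_code = {"M": 1, "F": -1}                        # effect code G1

# Build one design row (I, A1, A2, G1, A1G1, A2G1) per observation.
rows, y = [], []
for (age, gender), values in offers.items():
    a1, a2 = age_codes[age]
    g1 = gender_code[gender]
    for value in values:
        rows.append([1, a1, a2, g1, a1 * g1, a2 * g1])
        y.append(value)

X = np.array(rows, dtype=float)
y = np.array(y, dtype=float)

# Least-squares solution of y = Xb.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["Grand Mean", "Young", "Middle", "Male", "Young x Male", "Middle x Male"]
for name, est in zip(names, b):
    print(f"{name:14s} {est:7.3f}")
```

The six coefficients correspond to the six starred rows of the table. The unstarred effects (Elderly, Female, and the remaining interaction terms) are not estimated directly and follow from the sum-to-zero constraints.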
Finding Interaction Coefficients
For interaction coefficients, the identifiability constraints are:

  $\sum_{i=1}^{g_i} (\alpha\beta)_{i\cdot} = 0$

  $\sum_{j=1}^{g_j} (\alpha\beta)_{\cdot j} = 0$

  $\sum_{i=1}^{g_i} \sum_{j=1}^{g_j} (\alpha\beta)_{ij} = 0$

In the table of interaction effects, every row total and every column total must therefore equal zero:

          Young    Middle   Elderly   Total
Male      -0.222   -0.306   ?         0
Female    ?        ?        ?         0
Total     0        0        0         0

Slide 15 of 51
Finding Interaction Coefficients

          Young    Middle   Elderly   Total
Male      -0.222   -0.306   ?         0
Female    ?        ?        ?         0
Total     0        0        0         0

          Young    Middle   Elderly   Total
Male      -0.222   -0.306    0.528    0
Female     0.222    0.306   -0.528    0
Total      0        0        0        0

Slide 16 of 51
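Filling in the question marks is simple arithmetic on the zero-sum constraints. A small Python sketch, with hypothetical variable names, starting from the two interaction estimates given in the table:

```python
# Interaction estimates taken from the regression output (Male row of the table).
male = {"Young": -0.222, "Middle": -0.306}

# Row constraint: the Male row sums to zero, which gives Elderly x Male.
male["Elderly"] = -(male["Young"] + male["Middle"])

# Column constraint: each age column sums to zero, which gives the Female row.
female = {age: -value for age, value in male.items()}

for age in ("Young", "Middle", "Elderly"):
    print(f"{age:8s} male={male[age]:6.3f} female={female[age]:6.3f}")
```

Two estimated coefficients are enough to determine all six interaction effects in this 3 × 2 design.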
Constructing Group Mean Estimates

Effect              Estimate
Grand Mean*           23.556
Young*                -2.056
Middle*                4.194
Elderly               -2.138
Male*                  0.389
Female                -0.389
Young × Male*         -0.222
Young × Female         0.222
Middle × Male*        -0.306
Middle × Female        0.306
Elderly × Male         0.528
Elderly × Female      -0.528

Slide 17 of 51
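Under the two-factor fixed effects model, a group mean estimate is the grand mean plus the age effect, the gender effect, and the interaction effect for that cell. The Python sketch below applies that sum to the estimates in the table. The dictionary names are assumptions, and the results carry the rounding of the three-decimal estimates.

```python
# Estimates copied from the table (three-decimal rounding).
mu = 23.556
age = {"Young": -2.056, "Middle": 4.194, "Elderly": -2.138}
gender = {"Male": 0.389, "Female": -0.389}
interaction = {
    ("Young", "Male"): -0.222, ("Young", "Female"): 0.222,
    ("Middle", "Male"): -0.306, ("Middle", "Female"): 0.306,
    ("Elderly", "Male"): 0.528, ("Elderly", "Female"): -0.528,
}

# Cell mean = grand mean + age effect + gender effect + interaction effect.
cell_means = {
    (a, g): mu + age[a] + gender[g] + interaction[(a, g)]
    for a in age
    for g in gender
}

for (a, g), mean in cell_means.items():
    print(f"{a:8s} {g:7s} {mean:7.3f}")
```

For example, the Young × Male estimate is 23.556 - 2.056 + 0.389 - 0.222 = 21.667, which agrees with the average of the six Young/Male offers in the data table (130 / 6).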
Statistical Tests
- The preceding section discussed how model parameter estimates were obtained.
- Because we didn’t talk about distributions, this was purely an exercise in mathematics.
- However, statistical tests are an important part of a regression analysis.
- How can we tell if there is a significant effect of:
  - Age?
  - Gender?
  - Age × Gender?
Slide 18 of 51
Statistical Tests
- Using Analyze...Regression...Linear can only tell us so much directly.
- In particular, it can give us the hypothesis tests for:
  - The overall regression: at least one coefficient is significantly different from zero.
  - Age: if one of the age parameters is significantly different from zero, a “main effect,” or significant effect of age, would be present.
  - Gender: if the gender parameter is significantly different from zero, a “main effect,” or significant effect of gender, would be present.
  - Age × Gender: if one of the Age × Gender parameters is significantly different from zero, an “interaction,” or significant interaction of age and gender, would be present.
Slide 19 of 51
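One common way to test a whole effect, rather than one coefficient at a time, is to compare the full regression model against a reduced model that drops that effect's columns. The Python sketch below does this with NumPy for the used-car data. It is an illustration of the general nested-model F statistic, not the SPSS output from the lecture, and the helper and variable names are assumptions.

```python
import numpy as np

# Offers by (age, gender) cell, copied from the coded data table.
offers = {
    ("Y", "M"): [21, 23, 19, 22, 22, 23],
    ("Y", "F"): [21, 22, 20, 21, 19, 25],
    ("M", "M"): [30, 29, 26, 28, 27, 27],
    ("M", "F"): [26, 29, 27, 28, 27, 29],
    ("E", "M"): [25, 22, 23, 21, 22, 21],
    ("E", "F"): [23, 19, 20, 21, 20, 20],
}
age_codes = {"Y": (1, 0), "M": (0, 1), "E": (-1, -1)}
gender_code = {"M": 1, "F": -1}

rows, y = [], []
for (age, gender), values in offers.items():
    a1, a2 = age_codes[age]
    g1 = gender_code[gender]
    for value in values:
        rows.append([1, a1, a2, g1, a1 * g1, a2 * g1])
        y.append(value)
X = np.array(rows, dtype=float)
y = np.array(y, dtype=float)

def sse(design):
    """Residual sum of squares from a least-squares fit of y on design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

# Column indices of the design matrix that belong to each effect.
effect_columns = {"Age": [1, 2], "Gender": [3], "Age x Gender": [4, 5]}

sse_full = sse(X)
df_error = len(y) - X.shape[1]

f_stats = {}
for effect, drop in effect_columns.items():
    keep = [j for j in range(X.shape[1]) if j not in drop]
    extra_ss = sse(X[:, keep]) - sse_full  # SS explained by the dropped columns
    f_stats[effect] = (extra_ss / len(drop)) / (sse_full / df_error)

for effect, f_value in f_stats.items():
    print(f"{effect:13s} F = {f_value:6.2f}")
```

Each F statistic is compared with an F distribution whose numerator degrees of freedom equal the number of dropped columns and whose denominator degrees of freedom equal the error degrees of freedom (30 here). For these data the age statistic is far larger than the other two.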
Example Estimation Via Regression
Slide 20 of 51
The GLM Package: An Easier Way
- Instead of having to decipher significant main effects using Analyze...Regression...Linear, the Analyze...General Linear Model...Univariate package provides this information directly.
Slide 21 of 51
The GLM Package: An Easier Way
- In addition, the GLM package provides a way to produce graphs of the variables.
Slide 22 of 51
The GLM Package: An Easier Way
- In addition, the GLM package provides a way to produce post hoc analyses.
Slide 23 of 51
The GLM Package: An Easier Way
- In addition, the GLM package provides a way to produce contrasts.
Slide 24 of 51