Newsom, USP 654 Data Analysis II, Fall 2015 1
Curvilinear Regression Example
Canadian NPHS Check-up Recency Regressed on Weight
SPSS
Plot
8
7
Recency of Physical Check-up 6
5
4
3
2
1
0
10 20 30 40 50
Body Mass Index
Syntax
output close *.
get file='c:\jason\spsswin\da2\hbs_2.sav'.
* Random sample of approx 1200 older adults from the Canadian NPHS.
*make sure n the mean is based on is the same used in the regression model.
count nmiss=checkup weight (missing).
select if nmiss=0.
*mean center the weight variable.
compute id=$casenum.
aggregate /mweight=MEAN(weight).
compute cweight=weight - mweight.
*compute quadratic effect variable.
compute cweight2=cweight*cweight.
regression vars=checkup cweight cweight2
/descriptives=mean stdev n sig corr
/statistics=anova r coeff ses cha
/dependent=checkup
/method=enter cweight cweight2.
Newsom, USP 654 Data Analysis II, Fall 2015 2
Output
Regression N�
1191�
Descriptive Statistics�
Mean� Std. Deviation� 1191�
1191�
2.28� 2.027�
cweight� .00000� 4.114211�
cweight2� 16.91252� 29.174028�
Correlations
CHECKUP CWEIGHT CWEIGHT2
Physical
Check-up
Pearson Correlation CHECKUP 1.000 .000 .027
Sig. (1-tailed) Physical Check-up
N CWEIGHT .000 1.000 .361
CWEIGHT2 .027 .361 1.000
CHECKUP
Physical Check-up . .496 .179
CWEIGHT
CWEIGHT2 .496 . .000
CHECKUP .179 .000 .
Physical Check-up
CWEIGHT 1191 1191 1191
CWEIGHT2
1191 1191 1191
1191 1191 1191
Model Summary
Change Statistics
Model R R Square Adjusted Std. Error of R Square F Change df1 df2 Sig. F Change
1 R Square the Estimate Change .481 2 1188 .618
.028a .001
-.001 2.028 .001
a. Predictors: (Constant), CWEIGHT2, CWEIGHT
ANOVAb
Model Regression Sum of df Mean Square F Sig.
1 Residual Squares 2 1.979 .481 .618a
Total 4.113
3.959 1188
4886.375 1190
4890.334
a. Predictors: (Constant), CWEIGHT2, CWEIGHT
b. Dependent Variable: CHECKUP Physical Check-up
Coefficientsa
Unstandardized Standardized
Coefficients Coefficients
Model B Std. Error Beta Std. Error t Sig.
1 (Constant) 32.436 .000
2.245 .069 .731
-.344 .327
CWEIGHT -.005 .015 -.011 .031 .981
CWEIGHT2 .002 .002 .031 .031
a. Dependent Variable: CHECKUP Physical Check-up
Newsom, USP 654 Data Analysis II, Fall 2015 3
R (some output omitted for brevity)
> #clear active frame from previous analyses
> rm(mydata)
>
>
> library(lessR)
> mydata = Read("c:/jason/spsswin/da2/hbs_2.sav",quiet=TRUE)
> #Random sample of approx 1200 older adults from the Canadian NPHS.
>
> #need to convert data types in order to compute a correlation in R--lessR function
> mydata <-Transform(CHECKUP = as.numeric(CHECKUP))
>
> library(car)
>
> #test simple regression first and examine residual plot
> residualPlots(lm(CHECKUP ~ WEIGHT, data = mydata), fitted=FALSE)
Initial Plot of the Residuals from Linear Regression
Pearson residuals
1234
-1 0
15 20 25 30 35 40
WEIGHT
> #listwise deletion to match n from regression when centering
> mydata <-Subset(WEIGHT!='NA' & CHECKUP!='NA')
> #mean center WEIGHT
> mydata$cweight <- scale(mydata$WEIGHT, center = TRUE, scale = FALSE)
> ss.brief(mydata) #check that mean of cweight is zero
> #compute quadratic effect variable
> mydata$weight2=mydata$cweight^2
>
> #test linear with quadratic effect
> Regression(CHECKUP ~ cweight + weight2, res.sort="off")
Newsom, USP 654 Data Analysis II, Fall 2015 4
Response Variable: CHECKUP, Physical Check-up
Predictor 1: cweight,
Predictor 2: weight2,
Number of cases (rows) of data: 1191
Number of cases retained for analysis: 1191
BASIC ANALYSIS
Estimated Model
(Intercept) Estimate Std Err t-value p-value Lower 95% Upper 95%
cweight 2.24 0.07 32.436 0.000 2.11 2.38
weight2 0.02 -0.344 0.731 0.02
-0.01 0.00 0.981 0.327 -0.04 0.01
0.00 -0.00
Model Fit
Standard deviation of residuals: 2.03 for 1188 degrees of freedom
R-squared: 0.001 Adjusted R-squared: -0.001 PRESS R-squared: -0.005
Null hypothesis that all population slope coefficients are 0:
F-statistic: 0.481 df: 2 and 1188 p-value: 0.618
Analysis of Variance
cweight df Sum Sq Mean Sq F-value p-value
weight2 1 0.00 0.00 0.00 0.991
Model 1 3.96 3.96 0.96 0.327
2 3.96 1.98 0.48 0.618
Residual Plot for Model with Quadratic Effect (flat fit line, but, ooh, I don't like those outliers much!)
Residuals 10
-2 -1 0 1 2 3 4 5
2.3 2.4 2.5 2.6 2.7 2.8
Fitted Values
Largest Cook's Distance, 0.14, is highlighte