Problem 1 (9 Points) Use the data in BWGHT.RAW for this problem.
(i) How many women are in the sample, and how many report smoking during pregnancy?
(ii) Among women who smoked during pregnancy, what is the average number of
cigarettes smoked per day? (1)
(iii) Find the average of fatheduc in the sample. Why are only 1,192 observations
used to compute this average? (2)
(iv) Report the average family income and its standard deviation in dollars. (4)
Problem 2 (9 Points) Use the data in WAGE2.RAW to estimate a simple
regression explaining monthly salary (wage) in terms of IQ score (IQ).
(i) Estimate a simple regression model where a one-point increase in IQ changes
wage by a constant dollar amount. Use this model to find the predicted
increase in wage for an increase in IQ of 20 points. Does IQ explain most of
the variation in wage? (5)
(ii) Now, estimate a model where each one-point increase in IQ has the same
percentage effect on wage. If IQ increases by 20 points, what is the approximate
percentage increase in predicted wage? (4)
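The two functional forms in Problem 2 can be compared with a small sketch. It assumes a hand-written helper `simple_ols` (hypothetical, not part of any package) and a handful of made-up IQ and wage values that are NOT taken from WAGE2.RAW; only the mechanics carry over to the real data.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1 - ssr / sst

# Made-up illustration values, NOT taken from WAGE2.RAW.
iq = [85, 95, 100, 105, 120]
wage = [700, 820, 900, 1010, 1200]

# Level-level model: wage = b0 + b1 * IQ
_, b_level, r2_level = simple_ols(iq, wage)
# Log-level model: log(wage) = b0 + b1 * IQ
_, b_log, _ = simple_ols(iq, [math.log(w) for w in wage])

delta_iq = 20
dollar_change = b_level * delta_iq          # constant dollar effect
approx_pct_change = 100 * b_log * delta_iq  # approximate percentage effect
```

In the level-level model the slope is read in dollars per IQ point, while in the log-level model 100 times the slope is an approximate percentage change per IQ point.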
Problem 3 (3 Points) Which of the following can cause OLS estimators to be
(ii) Omitting an important variable.
(iii) A sample correlation coefficient of .95 between two independent variables both
included in the model.
Problem 4 (3 Points) Which of the following can cause the usual OLS t
statistics to be invalid (that is, not to have t-distributions under H0)?
(ii) A sample correlation coefficient of .95 between two independent variables that
are in the model.
(iii) Omitting an important explanatory variable.
Problem 5 (18 Points) The file CEOSAL2.RAW contains data on 177 chief
executive officers and can be used to examine the effects of firm performance on
(i) Estimate a model relating annual salary to firm sales and market value. Make
the model of the constant elasticity variety for both independent variables.
Report the results in standard form. (8)
(ii) Add profits to the model from part (i). Why can this variable not be included
in logarithmic form? Would you say that these firm performance variables
explain most of the variation in CEO salaries? (3)
(iii) Add the variable ceoten to the model in part (ii). What is the estimated
percentage return for another year of CEO tenure, holding other factors fixed?
(iv) Find the sample correlation coefficient between the variables log(mktval) and
profits. Are these variables highly correlated? What does this say about the
OLS estimators? (5)
Problem 6 (7 Points) Consider an equation to explain salaries of CEOs in
terms of annual firm sales, return on equity (roe, in percent form), and return on
the firm’s stock (ros, in percent form):
log(salary) = β0 + β1log(sales) + β2roe + β3ros + u.
(i) In terms of the model parameters, state the null hypothesis that, after controlling
for sales and roe, ros has no effect on CEO salary. State the alternative
that better stock market performance increases a CEO’s salary. (2)
(ii) Using the data in CEOSAL1.RAW, the following equation was obtained by OLS:
log(salary) = 4.32 + 0.280 log(sales) + .0174 roe + 0.00024 ros
              (.32)  (.035)             (.0041)     (.00054)
n = 209, R² = .283.
By what percentage is salary predicted to increase if ros increases by 80
(iii) Test the null hypothesis that ros has no effect on salary against the alternative
that ros has a positive effect. Carry out the test at the 5% significance level.
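The arithmetic behind parts (ii) and (iii) of Problem 6 can be sketched from the reported coefficient and standard error alone. The sketch assumes the standard normal quantile is an acceptable stand-in for the t critical value at this many degrees of freedom; the variable names are illustrative only.

```python
from statistics import NormalDist

# Coefficient and standard error of ros, copied from the reported equation.
b_ros = 0.00024
se_ros = 0.00054
n, k = 209, 3  # observations and number of slope coefficients

# Log-level model: percentage change in salary is about 100 * b_ros * (change in ros).
delta_ros = 80
pct_change = 100 * b_ros * delta_ros

# One-sided test of H0: beta3 = 0 against H1: beta3 > 0.
t_stat = b_ros / se_ros
df = n - k - 1

# Assumption: with df this large, the normal quantile approximates
# the one-sided 5% critical value of the t distribution.
crit_5pct = NormalDist().inv_cdf(0.95)
reject_h0 = t_stat > crit_5pct
```

Comparing `t_stat` with `crit_5pct` gives the test decision; an exact t critical value from a table would differ only slightly from the normal approximation.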
Problem 7 (12 Points) The following model can be used to study whether
campaign expenditures affect election outcomes:
voteA = β0 + β1log(expendA) + β2log(expendB)
+ β3prtystrA + u,
where voteA is the percentage of the vote received by Candidate A, expendA and
expendB are campaign expenditures by Candidates A and B, and prtystrA is a
measure of party strength for Candidate A (the percentage of the most recent
presidential vote that went to A’s party).
(i) In terms of the parameters, state the null hypothesis that a 1% increase in A’s
expenditures is offset by a 1% increase in B’s expenditures. (2)
(ii) Estimate a model using the data in VOTE1.RAW that directly gives the t
statistic for testing the hypothesis in part (i). What do you conclude? (Use
a two-sided alternative.) (10)
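One common way to obtain a t statistic for a hypothesis about a sum of coefficients is to reparametrize the model so that the sum appears as a single coefficient. The following is a sketch of that idea for Problem 7, where the symbol θ is introduced here for illustration and does not appear in the problem statement.

```latex
\begin{align*}
\theta &= \beta_1 + \beta_2, \qquad H_0 : \theta = 0, \\
voteA &= \beta_0 + \theta \log(expendA)
        + \beta_2 \bigl[\log(expendB) - \log(expendA)\bigr]
        + \beta_3\, prtystrA + u.
\end{align*}
```

Substituting β1 = θ − β2 into the original model gives the second line, so the t statistic on log(expendA) in this regression tests θ = 0 directly.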
Problem 8 (20 Points) Consider a model where the return to education depends
on the amount of work experience (and vice versa):
log(wage) = β0 + β1educ + β2exper + β3educ · exper + u.
(i) State the null hypothesis that the return to education does not depend on the
level of exper. (1)
(ii) Use the data in WAGE2.RAW to test the hypothesis in (i) against your stated
alternative. Carry out the test at a 5% significance level. (3)
(iii) Let θ1 denote the return to education (in decimal form) when exper = 10:
θ1 = β1 + 10β3. Obtain θ̂1 and a 95% confidence interval for θ1. (8)
(iv) Now estimate the model
log(wage) = β0 + β1educ + β2exper + β3tenure + β4married
+ β5black + β6south + β7urban + u.
Holding other factors fixed, what is the approximate difference in monthly
salary between blacks and nonblacks? (2)
(v) Extend the model from part (iv) to allow wages to differ across four groups of
people: married and black, married and nonblack, single and black, and single
and nonblack. What is the estimated wage differential between married blacks
and married nonblacks? (6)
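For part (iii) of Problem 8, a standard error for θ1 can be read off a regression in which θ1 is itself a coefficient. A sketch of the substitution, using only the definition θ1 = β1 + 10β3 given in the problem:

```latex
\begin{align*}
\beta_1 &= \theta_1 - 10\beta_3, \\
\log(wage) &= \beta_0 + \theta_1\, educ + \beta_2\, exper
             + \beta_3\, educ \cdot (exper - 10) + u.
\end{align*}
```

Regressing log(wage) on educ, exper, and educ · (exper − 10) then reports the estimate of θ1 and its standard error as the coefficient on educ, from which the 95% confidence interval follows in the usual way.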
Problem 9 (3 Points) Which of the following are consequences of heteroskedasticity?
(i) The OLS estimators, β̂j, are inconsistent.
(ii) The usual F statistic no longer has an F distribution.
(iii) The OLS estimators are no longer BLUE.
Problem 10 (3 Points) Consider a linear model to explain monthly beer consumption:
beer = β0 + β1 inc + β2 price + β3 educ + β4 female + u
E(u | inc, price, educ, female) = 0
Var(u | inc, price, educ, female) = σ² · price
Write the transformed equation that has a homoskedastic error term.
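When the error variance is proportional to a known function h(x) of the regressors, dividing the equation through by the square root of h(x) yields a homoskedastic error. A sketch of that step for the variance function stated above, taking h = price:

```latex
\begin{align*}
\frac{beer}{\sqrt{price}}
  &= \beta_0 \frac{1}{\sqrt{price}}
   + \beta_1 \frac{inc}{\sqrt{price}}
   + \beta_2 \sqrt{price}
   + \beta_3 \frac{educ}{\sqrt{price}}
   + \beta_4 \frac{female}{\sqrt{price}}
   + \frac{u}{\sqrt{price}}, \\
\operatorname{Var}\!\left(\frac{u}{\sqrt{price}} \,\middle|\, inc, price, educ, female\right)
  &= \frac{\sigma^2\, price}{price} = \sigma^2 .
\end{align*}
```

Note that the transformed equation has no constant term: the original intercept β0 becomes the coefficient on 1/√price.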
Problem 11 (12 Points) Use the data in VOTE1.RAW for this problem.
Compute the Breusch-Pagan test for heteroskedasticity in a model with voteA as
the dependent variable and prtystrA, democA, log(expendA), and log(expendB) as
independent variables. Use the F statistic version. Estimate a regression model
that directly gives the F statistic. What are the null and the alternative hypotheses
in this F test? Is there evidence of heteroskedasticity at the 10% significance level?
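The F version of the Breusch-Pagan test is the overall F statistic from regressing the squared OLS residuals on the original regressors, so it can be computed from that auxiliary regression's R-squared. A minimal sketch follows; the function name `breusch_pagan_f` and the example inputs are hypothetical and are not results from VOTE1.RAW.

```python
def breusch_pagan_f(r2_aux, n, k):
    """Overall F statistic of the auxiliary regression of squared residuals.

    r2_aux: R-squared from regressing squared OLS residuals on the k regressors
    n:      number of observations
    k:      number of regressors (excluding the intercept)
    """
    return (r2_aux / k) / ((1.0 - r2_aux) / (n - k - 1))

# Hypothetical inputs for illustration only (not from VOTE1.RAW).
f_stat = breusch_pagan_f(r2_aux=0.05, n=100, k=4)
```

The statistic is compared with the critical value of the F distribution with k and n − k − 1 degrees of freedom at the chosen significance level.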
Problem 12 (7 Points)
Use the data in HPRICE1.RAW for this problem.
Consider the model
log(price) = β0 + β1 log(lotsize) + β2 log(sqrft) + β3 bdrms + u.
Obtain the heteroskedasticity-robust standard errors for this equation. Discuss any
important differences with the usual standard errors. What do you conclude about
heteroskedasticity in this model?
Problem 13 (10 Points) Use the data in CEOSAL1.RAW for this problem.
Consider the following model to explain salaries of CEOs in terms of annual firm
sales, return on equity (roe) and a dummy variable, rospos, which is equal to one
if ros > 0 and equal to zero if ros ≤ 0 (ros: return on the firm’s stock, in percent form):
log(salary) = β0 + β1log(sales) + β2roe + β3rospos + u.
Generate the dummy variable rospos and apply the RESET to this model. State
the unrestricted test equation, the null, and the alternative hypothesis of functional
form misspecification. Is there evidence of functional form misspecification?
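In its usual form, RESET augments the original equation with powers of the OLS fitted values and tests their joint significance. A sketch for this model, assuming the common choice of squared and cubed fitted values (the problem does not specify which powers to use) and writing ŷ for the fitted values of log(salary); the symbols δ1 and δ2 are introduced here for illustration:

```latex
\begin{align*}
\log(salary) &= \beta_0 + \beta_1 \log(sales) + \beta_2\, roe + \beta_3\, rospos
               + \delta_1 \hat{y}^{2} + \delta_2 \hat{y}^{3} + error, \\
H_0 &: \delta_1 = 0,\ \delta_2 = 0, \qquad
H_1 : \delta_1 \neq 0 \ \text{or}\ \delta_2 \neq 0 .
\end{align*}
```

Under this setup the test statistic is the F statistic for the joint exclusion of the two added terms.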
Problem 14 (4 Points) Decide if you agree or disagree with each of the
following statements:
(i) Like cross-sectional observations, we can assume that most time series
observations are independently distributed.
(ii) The OLS estimator in a time series regression is unbiased under the first three
(iii) A trending variable cannot be used as the dependent variable in multiple
(iv) Seasonality is not an issue when using annual time series observations.