Homework Assignment #1 (due Jan. 14)
Written problems:
1. Consider the simple linear regression model without an intercept, y = β1 x + u, with
the assumption E(u|x)=0. Also assume that E(x)=0.
a. Show that E(y)=0 and, using this together with E(x)=0, show that the covariance
between x and y is given by E(xy) and that the variance of x is given by E(x^2).
b. Using the results in part a and the assumptions above, show that E(xu)=0 and hence
that
β1 = E(xy) / E(x^2)
c. What is the least squares estimate of β1 (call it β̃1) in this model without an
intercept? How does it relate to the population value β1? Do they have to be the
same? Why or why not?
2. Wooldridge 2.4 (i.e. Chapter 2 Question 4)
3. Wooldridge 2.6
Computer problems (show any relevant Stata output):
1. Wooldridge C2.4 (i.e. Chapter 2 Question C4), parts (i) and (ii)
For the same dataset (WAGE2.DTA), also answer the following questions:
(i) Plot wage versus iq with the fitted regression line shown.
(scatter y x || lfit y x gives the scatter of y vs x with a fitted line.)
(ii) Verify that the regression slope estimate is equal to (a) the ratio of the
sample covariance between wage and iq to the sample variance of iq, and (b) the
sample correlation between wage and iq times the ratio of the standard
deviations of the two variables.
(corr y x, covariance gives the covariance matrix for x and y, whereas the
corr command without the “covariance” option gives the correlation matrix.)
(iii) Form the fitted values for the dependent variable. (After the regression
command, do predict wagehat. This command will create a new variable
wagehat.) How does the sample average of wagehat compare to the sample
average of wage? What is the correlation between wagehat and iq? (Can you
figure these out without using Stata?)
(iv) Form the OLS residuals for this regression. (Do predict uhat, resid, which
will create a new variable uhat with the estimated residuals.) Verify that (a) the
sample average of uhat is equal to zero and (b) the correlation between uhat and
iq is equal to zero.
(v) Now do the reverse regression by reversing the roles of wage and iq (so that iq is
now the dependent variable and wage the independent variable).
a. What is the estimated slope for this regression?
b. Explain why it is not surprising to see the same sign for the slope as in the
original regression.
c. For which regression model do you think the zero conditional-mean
assumption on the error is more believable? You now have two models:
wage = β0 + β1 iq + u, with the assumption E(u|iq)=0
iq = γ0 + γ1 wage + v, with the assumption E(v|wage)=0
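The commands hinted at in parts (i)-(v) can be strung together roughly as follows. This is only a sketch, not a required solution. It assumes that WAGE2.DTA is in Stata's current working directory and that the variables are named wage and iq exactly as written above; Stata is case-sensitive, so adjust the names if your copy of the data uses different capitalization. The scalar names rho, sd_wage and sd_iq are illustrative choices.

```stata
* Sketch for the additional WAGE2 questions (i)-(v).
* Assumption: WAGE2.DTA sits in the current working directory and the
* variables are named wage and iq (adjust if your copy differs).
use WAGE2.DTA, clear

* (i) Scatter of wage against iq with the fitted line overlaid.
regress wage iq
twoway (scatter wage iq) (lfit wage iq)

* (ii) Slope from the covariance matrix. After "correlate wage iq",
* wage is variable 1 and iq is variable 2 in the stored results.
quietly correlate wage iq, covariance
display "cov(wage,iq)/var(iq) = " r(cov_12) / r(Var_2)

* (ii) Slope from the correlation and the two standard deviations.
quietly correlate wage iq
scalar rho = r(rho)
quietly summarize wage
scalar sd_wage = r(sd)
quietly summarize iq
scalar sd_iq = r(sd)
display "corr * sd(wage)/sd(iq) = " rho * sd_wage / sd_iq

* (iii)-(iv) Fitted values and residuals from the wage-on-iq regression.
quietly regress wage iq
predict wagehat
predict uhat, resid
summarize wage wagehat uhat
correlate wagehat uhat iq

* (v) Reverse regression: iq is now the dependent variable.
regress iq wage
```

Compare the two displayed values with the slope reported by regress, and try to predict the summarize and correlate results before running them.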
2. Wooldridge C2.6 (note that log(expend) is in the data as the lexpend variable)
For the same dataset (MEAP93.DTA), also answer the following questions:
(vi) Plot math10 versus lexpend with the fitted regression line shown.
(vii) To visualize the non-linearity described by this model, create the fitted values
from the regression and then do a scatter plot of the fitted values versus expend
(not lexpend). Explain how the effect of expenditures changes at higher values of
expenditures.
(viii) Suppose that we wanted to measure math10 as a fraction (a number between 0
and 1) rather than on a 0-to-100 scale. Specifically, we could do the following in
Stata to re-scale the math10 variable:
. replace math10 = math10 / 100
(This replaces the original math10 values with the values divided by 100.) If you
re-ran the regression (now regressing the re-scaled math10 upon lexpend), how
would the slope and intercept estimates change (compared to the original results)?
Be specific, and try these on your own before checking your answers in Stata.
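For parts (vi)-(viii), one possible sequence of commands is sketched below. It assumes that MEAP93.DTA is in Stata's current working directory and that the variables are named math10, lexpend and expend as in the text. The name math10hat is an illustrative choice, and wrapping the re-scaling in preserve/restore is an addition here (not part of the instructions above) so that the original math10 values come back afterwards.

```stata
* Sketch for the additional MEAP93 questions (vi)-(viii).
* Assumption: MEAP93.DTA sits in the current working directory.
use MEAP93.DTA, clear

* (vi) Scatter of math10 against lexpend with the fitted line overlaid.
regress math10 lexpend
twoway (scatter math10 lexpend) (lfit math10 lexpend)

* (vii) Fitted values plotted against expend in levels (not logs).
predict math10hat
scatter math10hat expend

* (viii) Re-run with math10 rescaled; preserve/restore undoes the change.
preserve
replace math10 = math10 / 100
regress math10 lexpend
restore
```

Write down your predicted slope and intercept for part (viii) before running the last block, then compare them with the output.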