Hi Stata Users,
I have a question on Wooldridge's Procedure 18.1, which concerns IV estimation when the endogenous regressor is a binary variable.
Suppose I want to estimate the following equation:
Y = a0 + a1 * X1 + a2* X2 + error
where X1 is an indicator variable and is also endogenous. Assume that we have exactly 1 instrument Z for X1.
X2 is an exogenous variable. The parameter of interest is a1.
Now, for more efficient estimation, the following two-step method is often suggested:
Step 1: Estimate a probit of the binary endogenous regressor on all exogenous variables and the instrument to obtain fitted probabilities.
Probit: P(X1 = 1 | Z, X2) = Phi(c0 + c1*Z + c2*X2)
This gives the fitted probabilities X1_hat1.
Step 2: Use the fitted probabilities X1_hat1 from Step 1 as the instrument for X1, together with the exogenous regressors, to obtain a more efficient estimate of the coefficient on the binary endogenous regressor. So in this step I use the standard 2SLS procedure:
First stage: X1 = b0 + b1*X1_hat1 + b2*X2 + error
This gives us the predicted value X1_hat2.
Second Stage: Y = a0 + a1 * X1_hat2 + a2* X2 + error
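
In Stata, I believe the two steps can be sketched as below. This is only an illustration under my own assumptions: the data are simulated, the variable names (y, x1, x2, z, u, x1_hat1) and the coefficient values are placeholders I made up, and the robust VCE is my choice, not part of the procedure.

```stata
* Simulated data, purely for illustration (all names and values are hypothetical)
clear
set seed 12345
set obs 2000
generate z  = rnormal()
generate x2 = rnormal()
generate u  = rnormal()                              // shared error: makes x1 endogenous
generate x1 = (0.8*z + 0.5*x2 + u + rnormal() > 0)   // binary endogenous regressor
generate y  = 1 + 2*x1 + 0.5*x2 + u + rnormal()

* Step 1: probit of x1 on the instrument and the exogenous regressor
probit x1 z x2
test z                              // significance of z in the probit
predict double x1_hat1, pr          // fitted probabilities

* Step 2: 2SLS with x1_hat1 as the instrument for x1; x2 instruments itself
ivregress 2sls y x2 (x1 = x1_hat1), first vce(robust)
estat firststage                    // first-stage statistics for x1_hat1
```

Here `test z` after the probit and `estat firststage` after `ivregress` report the two quantities I am asking about, so they can be compared side by side.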
My question relates to the validity of the instrument Z. To argue for the validity of this instrument, should I consider Z's statistical significance in the probit model of Step 1?
Or, since Step 1 is simply an extra step to obtain fitted probabilities to be used as instruments, should we look only at the statistical significance of the fitted values (X1_hat1) in the first stage of the 2SLS estimation in Step 2, i.e. test whether b1 = 0?
Wooldridge states that we can ignore the properties of the Step 1 estimation and focus only on Step 2. However, he does not explain why. I would really appreciate any comments or suggestions you may have on this issue.
Thanks,
VB