Re: st: Attempt to summarize how to avoid a forbidden regression with -IVREG2-, and some questions to ask
From: Kelvin Tan <[email protected]>
To: [email protected]
Subject: Re: st: Attempt to summarize how to avoid a forbidden regression with -IVREG2-, and some questions to ask
Date: Sat, 6 Feb 2010 08:17:15 +1000
Hi All,
Apologies for posting this message again; I am not sure whether my
earlier message was properly posted.
Thanks to Austin for his advice. I would like to ask another question
regarding the weak identification test (Kleibergen-Paap rk Wald F
statistic).
I tried Method 1 as follows,
*----Begin Code---------------
* Placeholders, not real variable names: zlist = all excluded instruments,
* xlist = predictors of y1 (the included instruments), i.year = time dummies.
* y1, the dependent variable of the main equation, is left out of this regression.
xi: regress y2 zlist xlist i.year
predict y2hat, xb
gen y2hatsquared=y2hat^2
* ivreg2 needs an actual variable for the square; y2sq is the y2^2 of the output below
gen y2sq=y2^2
xi: ivreg2 y1 (y2 y2sq = zlist y2hatsquared) xlist i.year, ///
    cluster(id) gmm2s endog(y2 y2sq)
*------End Code ------------------------
Is the Kleibergen-Paap rk Wald F statistic a valid test for weak
instruments when there are two endogenous regressors in the equation
(y2 and y2^2)?
Can anyone advise me what to do next if the Kleibergen-Paap rk Wald F
statistic (6.298, see Results 1) is smaller than the Stock-Yogo
critical value of 11.04 for 5% maximal IV relative bias? Should I
repeat this analysis with the LIML or CUE estimator (see Results 2),
since they are more robust to weak instruments?
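For concreteness, re-running Method 1 with LIML or CUE might look like the sketch below. It is not a tested specification: zlist (the excluded instruments) and y2sq (a generated variable holding y2^2) are placeholder names, xlist is the list of predictors for y1 as in Method 1, and -liml-, -cue- and -ffirst- are options of the user-written -ivreg2- command.

*----Begin Code---------------
* Sketch only: replace zlist and xlist with the actual variable lists
capture gen y2sq = y2^2          // skipped if y2sq already exists
* LIML point estimates with cluster-robust standard errors
xi: ivreg2 y1 (y2 y2sq = zlist y2hatsquared) xlist, cluster(id) liml
* Continuously updated GMM estimator
xi: ivreg2 y1 (y2 y2sq = zlist y2hatsquared) xlist, cluster(id) cue
* Two-step GMM again, adding first-stage statistics for y2 and y2sq
xi: ivreg2 y1 (y2 y2sq = zlist y2hatsquared) xlist, cluster(id) gmm2s ffirst
*------End Code ------------------------

As the NB line in the output states, the Stock-Yogo critical values are tabulated for the Cragg-Donald F statistic under i.i.d. errors, so comparing a cluster-robust Kleibergen-Paap rk statistic with them is at best a rough guide.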
Results 1:
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             13.633
                                                   Chi-sq(3) P-val =    0.0035
------------------------------------------------------------------------------
Weak identification test (Kleibergen-Paap rk Wald F statistic):          6.298
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    11.04
                                         10% maximal IV relative bias     7.56
                                         20% maximal IV relative bias     5.57
                                         30% maximal IV relative bias     4.73
                                         10% maximal IV size             16.87
                                         15% maximal IV size              9.93
                                         20% maximal IV size              7.54
                                         25% maximal IV size              6.28
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.968
                                                   Chi-sq(2) P-val =    0.6163
-endog- option:
Endogeneity test of endogenous regressors:                               9.223
                                                   Chi-sq(2) P-val =    0.0099
Regressors tested:    y2 y2^2
------------------------------------------------------------------------------
Results 2:
based on LIML or CUE:
------------------------------------------------------------------------------
Weak identification test (Kleibergen-Paap rk Wald F statistic):          6.298
Stock-Yogo weak ID test critical values: 10% maximal LIML size            4.72
                                         15% maximal LIML size            3.39
                                         20% maximal LIML size            2.99
                                         25% maximal LIML size            2.79
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
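Whether an F statistic of this size matters in practice can be probed with a small Monte Carlo, along the lines of the simulation suggested in the quoted reply below. The following is only a sketch under made-up assumptions: the program name ivsim, the data-generating process, the instrument strength pi and the true coefficients (1 on y2, -0.1 on y2sq) are all hypothetical and would need to be replaced with values resembling the real data.

*----Begin Code---------------
* Sketch only: hypothetical DGP; requires the user-written ivreg2
capture program drop ivsim
program define ivsim, rclass
    syntax [, obs(integer 500) pi(real 0.1)]
    drop _all
    set obs `obs'
    * three excluded instruments
    gen z1 = rnormal()
    gen z2 = rnormal()
    gen z3 = rnormal()
    * structural error u and first-stage error v are correlated,
    * which is what makes y2 endogenous
    gen u = rnormal()
    gen v = 0.5*u + rnormal()
    * pi controls instrument strength (small pi = weak instruments)
    gen y2 = `pi'*(z1 + z2 + z3) + v
    gen y2sq = y2^2
    gen y1 = 1 + y2 - 0.1*y2sq + u
    * Method 1: squared fitted value as an additional instrument
    quietly regress y2 z1 z2 z3
    predict double y2hat, xb
    gen double y2hatsq = y2hat^2
    * 2SLS estimate of the coefficient on y2
    quietly ivreg2 y1 (y2 y2sq = z1 z2 z3 y2hatsq)
    return scalar b_2sls = _b[y2]
    * LIML estimate of the same coefficient
    quietly ivreg2 y1 (y2 y2sq = z1 z2 z3 y2hatsq), liml
    return scalar b_liml = _b[y2]
end

simulate b_2sls=r(b_2sls) b_liml=r(b_liml), reps(200) seed(12345): ///
    ivsim, obs(500) pi(0.1)
* compare the centre and spread of each estimator with the true value of 1
summarize b_2sls b_liml, detail
*------End Code ------------------------

Repeating the run for several values of pi shows how the two estimators behave as the instruments get weaker. Medians are more informative than means here, because LIML can produce extreme draws when instruments are weak.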
I look forward to hearing from you all.
Regards,
Kelvin
On Wed, Jan 27, 2010 at 11:20 PM, Austin Nichols
<[email protected]> wrote:
> Kelvin Tan <[email protected]>:
> There are two main issues: weak instruments and small violations of
> the exclusion restriction. If your instruments are weak, including
> more nearly irrelevant instruments can result in even worse inference,
> but method 1 introduces less of a problem in some sense--on the other
> hand method 2 estimates the first-stage coefs on all those additional
> excluded instruments, and you can do more overid tests; you can use
> liml to get inference more robust to the many weak instruments
> problem. The exclusion restriction has to be satisfied very
> strongly--that is, turn squared in your example should have no
> correlation with the true error term. Even a weak violation
> (correlation close to zero but not exactly zero) can produce very bad
> outcomes for inference. Running a simulation with data like yours
> (not the auto data) will clarify the importance of these tradeoffs for
> your particular case.
>
> On Wed, Jan 27, 2010 at 12:50 AM, Kelvin Tan
> <[email protected]> wrote:
>> Hi All,
>>
>> Having read the following two posts:
>> http://www.stata.com/statalist/archive/2003-11/msg00795.html
>> http://www.stata.com/statalist/archive/2005-05/msg00158.html
>>
>>
>> I would like to attempt to summarize the methods that Wooldridge
>> (2002) suggests for avoiding the forbidden regression; please feel
>> free to correct me. At the end, I would also like to ask a couple of
>> questions about these methods, and I hope to get some feedback from
>> Stata users.
>>
>> Wooldridge (2002), Econometric Analysis of Cross Section and Panel Data,
>> section 9.5, esp. pp. 236-7.
>>
>>
>> sysuse auto.dta, clear
>> gen weight2=weight^2
>>
>> We are trying to estimate the following two equations:
>> weight = constant + price + turn + length + gear_ratio +mpg
>> price = constant + weight + weight^2 + turn + displacement
>>
>> First method ----- Create an instrumental variable – weighthatsquared
>> – and use this as an additional instrument in ivreg2
>>
>> * ------------------Begin code for First Method ------------------
>> regress weight length gear_ratio mpg turn displacement
>> predict weighthat, xb
>> gen weighthatsquared=weighthat^2
>> ivreg2 price (weight weight2=weighthatsquared length gear_ratio mpg) ///
>>     turn displacement , endog(weight weight2) gmm2s robust
>> *-------------------- End code for First Method ------------------
>>
>> Second method -- Create additional excluded instruments (cross-products
>> and squares of the excluded instruments) and use all these
>> instruments in ivreg2
>>
>> *------------------- Begin code for Second Method -----------------
>> gen length2=length^2
>> gen gear_ratio2=gear_ratio^2
>> gen mpg2=mpg^2
>> gen lengthmpg=length*mpg
>> gen mpggear_ratio= mpg*gear_ratio
>> gen lengthgear_ratio=length*gear_ratio
>> ivreg2 price (weight weight2= length gear_ratio mpg length2 ///
>>     gear_ratio2 mpg2 lengthmpg mpggear_ratio lengthgear_ratio) ///
>>     turn displacement , endog(weight weight2) gmm2s robust
>> *-------------------- End code for Second Method -----------------
>>
>>
>> Question 1:
>> Can we use the following instruments for the second method: turn^2,
>> displacement^2 , cross product of (turn, displacement) with (length,
>> gear_ratio, mpg)? If yes, how many of them and what sort of
>> combinations should we use? Product of any two instruments, or three
>> instruments?
>>
>> Question 2:
>> Which is a preferred method (method 1 VS 2)? Any differences between
>> these two methods?
>>
>> Question 3:
>> What if we have year dummies in the price equation? Is the following
>> estimation method right?
>>
>> price = constant + weight + weight^2 + turn + displacement + year dummies
>> xi: ivreg2 price (weight weight2=weighthatsquared length gear_ratio mpg) ///
>>     turn displacement i.year, endog(weight weight2) gmm2s robust
>>
>>
>>
>>
>> Regards,
>> Kelvin Tan
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/