Re: st: RE: ivreg2 and xtoverid error
From: John Antonakis <[email protected]>
To: [email protected]
Subject: Re: st: RE: ivreg2 and xtoverid error
Date: Sun, 04 Apr 2010 12:27:44 +0200

Hi Kit and Mark:
On another note, I was thinking more about the power issue and had an
idea. One way to get around the problem of too many dummy variables is
to use Mundlak's device to model the fixed effects:

Mundlak, Y. (1978). On the pooling of time series and cross section data.
Econometrica, 46(1), 69-85.

That is, taking the cluster means of the endogenous regressors and using
them as instruments instead of the fixed effects gives almost exactly
the same result. Here are the estimates (for a larger sample, and I
added two more instruments to overidentify the equation, so here I have
plenty of power); the statistics look fine:

. xi: ivreg2 y (x1-x13 = mean_x1-mean_x13 z1 z2), cluster(lead_n) endog(x1-x13)

IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on lead_number

Number of clusters (lead_number) = 345                Number of obs =     2616
                                                      F( 13,   344) =    80.76
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1870.655581                Centered R2   =   0.6287
Total (uncentered) SS   =      25133.5                Uncentered R2 =   0.9724
Residual SS             =  694.5273064                Root MSE      =    .5153
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .4514196   .0571024     7.91   0.000     .3395009    .5633382
          x2 |   -.094049    .058022    -1.62   0.105    -.2077699     .019672
          x3 |  -.0285798   .0428113    -0.67   0.504    -.1124884    .0553289
          x4 |   .0784366   .0525558     1.49   0.136    -.0245709     .181444
          x5 |   .0810408   .0509977     1.59   0.112    -.0189129    .1809944
          x6 |    .148615   .0593743     2.50   0.012     .0322436    .2649865
          x7 |  -.1724113   .0287107    -6.01   0.000    -.2286832   -.1161394
          x8 |    .080845   .0375199     2.15   0.031     .0073073    .1543827
          x9 |  -.1967606   .0653847    -3.01   0.003    -.3249123   -.0686088
         x10 |   .1361337   .0557109     2.44   0.015     .0269424     .245325
         x11 |   .1213476   .0483033     2.51   0.012     .0266748    .2160204
         x12 |   .1441845   .0439615     3.28   0.001     .0580214    .2303475
         x13 |   .0320405   .0391324     0.82   0.413    -.0446575    .1087385
       _cons |   .5197885   .1609105     3.23   0.001     .2044097    .8351672
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             90.529
                                                   Chi-sq(3) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               26.380
                         (Kleibergen-Paap rk Wald F statistic):       3037.437
Stock-Yogo weak ID test critical values:                       <not available>
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         3.070
                                                   Chi-sq(2) P-val =    0.2154
-endog- option:
Endogeneity test of endogenous regressors:                              27.751
                                                  Chi-sq(13) P-val =    0.0098

The results are essentially the same with my old estimation procedure
(the very slight differences most likely come from the two extra
instruments, z1 and z2, in the model above rather than from rounding):

. xi: ivreg2 y (x1-x13 = i.lead_n) if e(sample), noid cluster(lead_n)
i.lead_number     _Ilead_numb_1-484   (naturally coded; _Ilead_numb_1 omitted)
Warning - collinearities detected
Vars dropped: [snipped]

IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on lead_number

Number of clusters (lead_number) = 345                Number of obs =     2616
                                                      F( 13,   344) =    80.99
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1870.655581                Centered R2   =   0.6289
Total (uncentered) SS   =      25133.5                Uncentered R2 =   0.9724
Residual SS             =  694.2693929                Root MSE      =    .5152
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .4503576   .0572067     7.87   0.000     .3382346    .5624806
          x2 |   -.096804   .0580199    -1.67   0.095    -.2105209     .016913
          x3 |  -.0242708   .0428203    -0.57   0.571     -.108197    .0596553
          x4 |   .0851285   .0524755     1.62   0.105    -.0177215    .1879785
          x5 |    .076775   .0506549     1.52   0.130    -.0225068    .1760568
          x6 |   .1495938   .0591696     2.53   0.011     .0336235     .265564
          x7 |  -.1706418   .0287424    -5.94   0.000    -.2269759   -.1143076
          x8 |   .0777304   .0375514     2.07   0.038     .0041309    .1513299
          x9 |  -.1975138   .0649414    -3.04   0.002    -.3247966    -.070231
         x10 |   .1376252   .0556762     2.47   0.013     .0285017    .2467486
         x11 |   .1159876   .0473009     2.45   0.014     .0232795    .2086957
         x12 |   .1450098   .0438989     3.30   0.001     .0589695      .23105
         x13 |   .0325769   .0387352     0.84   0.400    -.0433428    .1084965
       _cons |   .5153171   .1612287     3.20   0.001     .1993147    .8313194
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):       330.961
                                                 Chi-sq(331) P-val =    0.4903

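A minimal sketch, on simulated data, of why the two procedures agree. With cluster dummies as the only instruments, the first-stage fitted values are simply the cluster means, so instrumenting with the cluster means directly should reproduce the same 2SLS point estimates. Everything here is an assumption made for illustration: the variable names (id, a, u, x, y, mean_x), the parameter values, and the use of official -ivregress- in place of -ivreg2- so that nothing needs installing. Only point estimates are compared; standard errors would still need the cluster option.

```stata
* Simulated clustered data; all names and values are hypothetical.
clear all
set seed 2010
set obs 60                                  // 60 clusters
generate int id = _n
generate double a = rnormal()               // cluster-level component of x
expand 8                                    // 8 observations per cluster
generate double u = rnormal()               // structural error
generate double x = a + 0.5*u + rnormal()   // x is endogenous through u
generate double y = 1 + 0.5*x + u

* Mundlak-style instrument: cluster mean of the endogenous regressor
egen double mean_x = mean(x), by(id)

* (a) cluster dummies as instruments
ivregress 2sls y (x = i.id)
scalar b_dummies = _b[x]

* (b) cluster mean as the single instrument
ivregress 2sls y (x = mean_x)
scalar b_means = _b[x]

display "dummies: " b_dummies "   cluster means: " b_means
```

Once extra instruments are added to one specification only, the exact equality no longer holds, which is one way small differences between two such sets of estimates can arise.
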
Best,
J.
____________________________________________________
Prof. John Antonakis, Associate Dean
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
Faculty page:
http://www.hec.unil.ch/people/jantonakis
Personal page:
http://www.hec.unil.ch/jantonakis
____________________________________________________
On 03.04.2010 22:46, John Antonakis wrote:
> Thanks, Kit.
>
> One small bit of evidence that the fixed effects don't correlate with
> the error might come from the -xtoverid- test for random vs. fixed
> effects. The classic interpretation of the test is that, if it is
> significant, it suggests that the endogenous regressors correlate with
> y when the fixed effects are not included. Equally, however, a
> significant test also means that the fixed effects correlate with y
> when controlling for the endogenous regressors (that is, the fixed
> effects correlate with the residual variance of y when controlling for
> the endogenous regressors). This test is akin to a mediation test as
> follows, where x is the endogenous regressor and z is exogenous:
>
> 1. regress y on x (obtain significant coefficient)
> 2. regress y on z (obtain significant coefficient)
> 3. regress y on x and z (obtain significant coefficient only for x)
>
> If in step 3 the coefficient of z becomes non-significant (when it was
> significant before), then we have evidence of mediation; that is, the
> correlation of x with y is stronger than that of z with y while
> controlling for the relation of x and z. The -xtoverid- test does an
> analogous thing: if it is non-significant, it suggests that the
> endogenous regressors account for all the variance in y and that the
> instruments don't correlate with y when controlling for the
> regressors; thus, as exogenous instruments, they should not correlate
> with the residual. From -xtoverid- I got a Sargan-Hansen statistic of
> 12.979, Chi-sq(13), P-value = 0.4494.
>
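The three mediation steps listed above can be sketched on simulated data. This is only an illustration under stated assumptions: the names and coefficients are invented, z affects y only through x by construction, and x is kept exogenous so that plain -regress- is appropriate (which is not the situation in the real data).

```stata
* Simulated data; all names and values are hypothetical.
clear all
set seed 42
set obs 5000
generate double z = rnormal()        // exogenous variable
generate double x = z + rnormal()    // x depends on z
generate double y = x + rnormal()    // z affects y only through x

regress y x                          // step 1: x alone
regress y z                          // step 2: z alone
scalar b_z_alone = _b[z]
regress y x z                        // step 3: x and z together
scalar b_z_cond = _b[z]

display "z alone: " b_z_alone "   z given x: " b_z_cond
```

In this setup the coefficient on z is large in step 2 and close to zero in step 3, which is the pattern the mediation argument relies on.
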
> Also, I estimated the following fixed-effects model, a direct analog
> of the above mediation model:
>
> reg y x1-x13 i.lead_num, cluster(lead_num)
> est store fe
> reg y x1-x13, cluster(lead_num)
> hausman fe, force
>
> This test is non-significant too (though I should not be using the
> Hausman test with a robust estimator). Thus, controlling for the
> endogenous variables, the fixed effects do not correlate with y. I hope
> that what I have said makes sense.
>
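Because the classical Hausman test is not valid with a cluster-robust estimator, one commonly used alternative is the auxiliary-regression (Mundlak) form of the test: add the cluster means of the regressors to a random-effects model and test them jointly with a cluster-robust Wald test. The sketch below uses only official commands on simulated data. The variable names, the single regressor, and the data-generating values are all assumptions made for illustration, and it is not claimed to reproduce what -xtoverid- computes internally.

```stata
* Simulated balanced panel; every name and value here is hypothetical.
clear all
set seed 1978
set obs 200                          // 200 clusters
generate int id = _n
generate double a = rnormal()        // cluster effect
expand 5                             // 5 observations per cluster
bysort id: generate int t = _n
generate double x = a + rnormal()    // x is correlated with the cluster effect
generate double y = 1 + 0.5*x + a + rnormal()
xtset id t

* Fixed-effects (within) estimate, for reference
xtreg y x, fe
scalar b_fe = _b[x]

* Mundlak device: add the cluster mean of x to a random-effects model
egen double mean_x = mean(x), by(id)
xtreg y x mean_x, re vce(cluster id)
scalar b_mundlak = _b[x]

* Cluster-robust Wald test of random vs. fixed effects
test mean_x
scalar p_mundlak = r(p)

display "FE: " b_fe "   Mundlak RE: " b_mundlak "   p-value: " p_mundlak
```

In a balanced panel the coefficient on x from the Mundlak regression matches the within estimate, and here the test rejects because x was built to correlate with the cluster effect.
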
> Also, concerning the power issue: on the one hand, with more
> instruments the model has more ways to go wrong, so, ceteris paribus,
> power to detect misspecification goes up with more degrees of freedom,
> correct? On the other hand, with weak instruments the power of the test
> is reduced. I guess a simulation would be needed to settle this.
>
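A rough sketch of the kind of simulation mentioned above. It compares the rejection rate of the overidentification test when one instrument is invalid and the instrument set is small (3) or large (50). Everything in it is an assumption chosen for illustration: the program name overidsim, the sample size, the coefficients, and the use of the homoskedastic Sargan test from official -ivregress- and -estat overid- rather than the cluster-robust Hansen J. It cannot settle the question for any particular dataset.

```stata
* Hypothetical Monte Carlo; names and parameter values are illustrative only.
clear all
set seed 20100404

program define overidsim, rclass
    args k                                   // number of instruments
    drop _all
    set obs 500
    generate double u = rnormal()            // structural error
    forvalues j = 1/`k' {
        generate double z`j' = rnormal()     // candidate instruments
    }
    generate double x = 0.5*z1 + 0.5*z2 + 0.5*u + rnormal()
    generate double y = x + 0.2*z1 + u       // z1 is invalid: it enters y directly
    ivregress 2sls y (x = z1-z`k')
    estat overid
    return scalar p = r(p_sargan)            // p-value of the Sargan test
end

foreach k in 3 50 {
    simulate p = r(p), reps(200) nodots: overidsim `k'
    quietly count if p < 0.05
    scalar power`k' = r(N) / _N              // share of rejections at 5%
    display "instruments = `k'   rejection rate = " power`k'
}
```

With the same single violation, the rejection rate in this design falls as irrelevant instruments are added, which is the dilution of power that the many-instruments concern refers to.
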
> Anyway, you are right that it is possible that my instruments are weak
> and thus introduce bias. I have taken note of this limitation. I
> actually have direct measures of the leader's ability, personality, and
> other things, though I am saving them for another publication. I will
> check, though, to see how they compare with the fixed-effects
> instruments.
>
> Best regards,
> John.
>
> On 03.04.2010 17:14, Kit Baum wrote:
>> John said
>>
>> I get exactly the same estimates and standard errors with -ivreg-
>> and -ivregress-, with the cluster robust variance estimator. When using
>> -ivreg2- with the -noid- option it works and I get the same estimates;
>> more importantly, I also get the Hansen J-test, which is what interests
>> me most (the -ivregress- estimator does not report an overid for
>> cluster-robust vce's):
>>
>> Hansen J statistic (overidentification test of all instruments):
>> 402.476, Chi-sq(404) P-val = 0.5121
>>
>> The one thing to worry about here is that which arises with
>> Sargan-Hansen tests after xtabond or user-written xtabond2: the overid
>> test may not have much power when confronted with hundreds of
>> instruments.
>>
>> You also mention the test provided by 'estat endogenous', which could
>> be done in ivreg2 via the endog() option. This Durbin-Wu-Hausman test
>> is merely telling you that you shouldn't use OLS on this model. But
>> you're probably convinced of that in any event. Rejecting OLS as
>> inconsistent does not imply that IV is consistent; that depends on the
>> overid test of the excluded instruments (which you pass, but as
>> mentioned may have low power to detect a problem) and the proper
>> specification of the model. You might want to use ivreg2's orthog()
>> option to consider just the non-dummy instruments as a group, and check
>> to see that that Hansen "GMM distance" test also supports the notion
>> that those excluded instruments are suitably orthogonal to the error.
>>
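A hedged sketch of the orthog() suggestion on simulated data. It assumes the user-written -ivreg2- (and its dependency -ranktest-) has been installed from SSC; the variable names, the four instruments, and the choice of z3 and z4 as the subset under test are invented for illustration. The difference-in-J ("C") statistic for the listed instruments is read from e(cstat), e(cstatdf) and e(cstatp).

```stata
* Requires the user-written ivreg2: ssc install ivreg2 (and ssc install ranktest)
* Simulated data; all names and values are hypothetical.
clear all
set seed 314
set obs 100                                  // 100 clusters
generate int id = _n
expand 10                                    // 10 observations per cluster
generate double u = rnormal()                // structural error
forvalues j = 1/4 {
    generate double z`j' = rnormal()         // excluded instruments
}
generate double x = z1 + z2 + z3 + z4 + 0.5*u + rnormal()
generate double y = 1 + 0.5*x + u

* Test the orthogonality of z3 and z4 as a group, given z1 and z2
ivreg2 y (x = z1 z2 z3 z4), cluster(id) orthog(z3 z4)

display "C statistic = " e(cstat) "   df = " e(cstatdf) "   p = " e(cstatp)
```

The remaining instruments (here z1 and z2) must still identify the equation on their own for the test to be computed.
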
>> Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
>> An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
>> An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>