From: "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument statistic
Date: Mon, 25 Jun 2012 15:52:10 +0100

James,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> Fitzgerald, James
> Sent: 25 June 2012 14:53
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: Interpreting Kleibergen Paap weak instrument
> statistic
>
> Mark,
>
> Thank you very much for your reply.
>
> I have a few follow-up questions that you might be able to help me
> with. First though I thought it might be helpful if I gave a quick
> synopsis of my research question.
>
> I am investigating the determinants of capital structure in UK Plcs,
> and my main hypothesis is that the theories espoused in the extant
> literature are only applicable to certain types of firms.
> As such, I divide my sample into sub-samples based on certain firm
> characteristics, i.e. size, tangibility of assets etc., and compare
> regressor coefficients across the sub-samples.

I'm not sure I understand. Do you estimate separately for the
different subsamples, or do you interact your coefficients with
indicator variables and estimate one big regression?

> However, I was initially worried that such a categorisation procedure
> might introduce endogeneity issues that might vary across sub-samples,
> and thus I would not be able to reliably compare coefficients across
> sub-samples. Hence I decided to employ instrumental variables (lagged
> independent variables) to overcome such issues. Within each
> sub-sample I test the orthogonality assumption of my included
> regressors (on an individual basis) using the orthog option in
> xtivreg2. Any variables I find to be potentially endogenous (C-stat
> p-value <0.100) are then instrumented where instruments are available.
> I am currently unaware of any method to correctly test the i.i.d.
> assumption using xtivreg2, and so I have decided to drop the
> assumption, and hence my question with regards the KP stat.
>
> With regards to your earlier reply, the following are some follow up
> questions I still have.
>
> 1. Is there an option in ivreg2 to test the i.i.d.
> assumption, and if not, how would I go about testing same?

This amounts to testing for heteroskedasticity or autocorrelation.
-ivhettest- and -ivactest- will report such tests for IV models. But
you are using a fixed effects model, which complicates things a bit.
How long is your T dimension? I see from the estimation below that you
are using a kernel-robust VCE, which implies T is biggish. If so, you
could apply the fixed effects transformation to your data by hand
(e.g., using Ben Jann's -center- command) and then use these programs.
But this is a bit tricky.

The simplest way to test the i.i.d. assumption is to do an eyeball
version of a White-type test. Estimate the model using kernel-robust
VCEs, and then again without this option, i.e., using the classical
VCE. Do the SEs look very different? If so, it's likely that the
i.i.d. assumption would fail if you tested formally using a White-type
test, since the same principle is involved - the test stat is based on
a vector of contrasts between the robust and classical VCEs.

> 2. With regards to the Anderson-Rubin statistic and the Stock-Wright
> LM S statistic, both of which are reported by xtivreg2, am I correct
> in my interpretation that given that they both test the joint
> hypotheses of weak instruments and orthogonality, the statistics are
> only interpretable from a weak instruments perspective as long as the
> Hansen J test of all excluded instruments indicates orthogonality
> conditions are valid?

Sort of ... it's a little more complicated than that. I recommend
reading the Finlay-Magnusson paper on this.

> 3. Included below is the first stage regression results from one of
> the tests I run.

Maybe I am misreading the output, but it looks like only the summary
stats for the first stage are reported.
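
As an illustrative aside on the by-hand fixed effects transformation mentioned in the answer to question 1: the transformation amounts to subtracting each firm's own time mean from every variable. The sketch below shows the idea on a made-up two-firm panel. The function name `within_transform` and the toy numbers are hypothetical and are not taken from the thread or from any Stata command.

```python
from collections import defaultdict


def within_transform(groups, values):
    """Subtract each group's mean from its own observations.

    `groups` and `values` are parallel sequences: groups[i] is the panel
    unit (e.g. firm) that observation values[i] belongs to.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [v - means[g] for g, v in zip(groups, values)]


# Hypothetical toy panel: two firms observed for three years each.
firm = ["A", "A", "A", "B", "B", "B"]
liq = [1.0, 2.0, 3.0, 10.0, 10.0, 13.0]

liq_dm = within_transform(firm, liq)
print(liq_dm)
```

The same demeaning would have to be applied to the dependent variable, every regressor and every instrument before passing them to a command that does not handle the panel structure itself. One reason this is "a bit tricky" is that standard errors computed on demeaned data do not automatically account for the degrees of freedom used up by the firm means.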
> As you can see the Cragg Donald and
> Kleibergen Paap stats both suggest that the instruments are not weak.
> However, the AR and SW stats suggest that the instruments, given that
> the Hansen J-test does not reject the null, are potentially weak.

No, that's a misinterpretation of the AR and SW tests. See below.

> From the output these stats
> appear to me to be testing the explanatory power of the instrument
> rather than whether or not it is weak

Neither. These are not tests of the strength or explanatory power of
the IV. They are just what the output says: tests of the significance
of the endogenous regressor.

Your endogenous regressor is liq. In the main output, the coeff on liq
is -.0085538, with a z-stat of -1.73 and a p-value of 0.084. That is,
the Wald test stat for the null that the coeff on liq=0 has a p-value
of 0.084. The A-R test stat (F version) for the same hypothesis, i.e.,
B1=0, augmented by the additional hypothesis that the IVs are
exogenous, has a p-value of 0.0607. Very similar.

The A-R-type approach can be extended to generate
weak-instrument-robust confidence intervals. That's what Finlay &
Magnusson's -rivtest- will do for you.

HTH,
Mark

> i.e.
>
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main
> equation
> Ho: B1=0 and orthogonality conditions are valid
>
> The coefficient significance level of the instrumented variable (liq)
> is relatively low (p-value = 0.084), but the instrument does not
> appear to be weak (based on CD and KP stats). However, I would
> conclude that it potentially is weak based on the AR and SW stats.
> Is my interpretation incorrect, and if so could you indicate how these
> stats ought to be interpreted?
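
As an illustrative aside, the p-values being compared in the reply can be recomputed from the statistics printed in the output quoted in this message. The sketch below uses only the Python standard library and closed-form chi-squared tails for 3 and 4 degrees of freedom. The helper names `chi2_sf` and `normal_two_sided_p` are hypothetical. The inputs are the rounded figures from the posted output, so small rounding differences are to be expected.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate.

    Only the closed forms for df = 3 and df = 4 are implemented,
    which is all this sketch needs.
    """
    if df == 4:
        return math.exp(-x / 2) * (1 + x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("only df=3 and df=4 are implemented in this sketch")


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Figures copied from the posted output.
coef, se = -0.0085538, 0.0049465      # coefficient and robust SE on liq
wald_p = normal_two_sided_p(coef / se)  # Wald test of coeff on liq = 0
ar_p = chi2_sf(9.14, 4)                 # Anderson-Rubin Wald, Chi-sq(4)
sw_p = chi2_sf(9.22, 4)                 # Stock-Wright LM S, Chi-sq(4)
j_p = chi2_sf(5.596, 3)                 # Hansen J, Chi-sq(3)

for name, p in [("Wald", wald_p), ("A-R", ar_p), ("S-W", sw_p), ("J", j_p)]:
    print(f"{name:5s} p-value = {p:.4f}")
```

The recomputed values agree with the posted p-values to within the rounding of the printed statistics. The Wald and A-R p-values for the coefficient on liq are close to each other, which is the sense in which these are tests of the significance of the endogenous regressor rather than of instrument strength.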
>
> I greatly appreciate any help you can offer
>
> Best regards
>
> James
>
> Summary results for first-stage regressions
>
>                                  (Underid)            (Weak id)
> Variable   F(4, 2541)   P-val   AP Chi-sq(4)   P-val   AP F(4, 2541)
> liq             20.20  0.0000          81.78  0.0000           20.20
>
> NB: first-stage test statistics heteroskedasticity and
> autocorrelation-robust
>
> Stock-Yogo weak ID test critical values for single endogenous
> regressor:
>      5% maximal IV relative bias    16.85
>     10% maximal IV relative bias    10.27
>     20% maximal IV relative bias     6.71
>     30% maximal IV relative bias     5.34
>     10% maximal IV size             24.58
>     15% maximal IV size             13.96
>     20% maximal IV size             10.26
>     25% maximal IV size              8.31
> Source: Stock-Yogo (2005). Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d.
> errors.
>
> Underidentification test
> Ho: matrix of reduced form coefficients has rank=K1-1
> (underidentified)
> Ha: matrix has rank=K1 (identified)
> Kleibergen-Paap rk LM statistic    Chi-sq(4)=58.30    P-val=0.0000
>
> Weak identification test
> Ho: equation is weakly identified
> Cragg-Donald Wald F statistic           78.65
> Kleibergen-Paap Wald rk F statistic     20.20
>
> Stock-Yogo weak ID test critical values for K1=1 and L1=4:
>      5% maximal IV relative bias    16.85
>     10% maximal IV relative bias    10.27
>     20% maximal IV relative bias     6.71
>     30% maximal IV relative bias     5.34
>     10% maximal IV size             24.58
>     15% maximal IV size             13.96
>     20% maximal IV size             10.26
>     25% maximal IV size              8.31
> Source: Stock-Yogo (2005). Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d.
> errors.
>
> Weak-instrument-robust inference
> Tests of joint significance of endogenous regressors B1 in main
> equation
> Ho: B1=0 and orthogonality conditions are valid
> Anderson-Rubin Wald test       F(4,2541)=   2.26    P-val=0.0607
> Anderson-Rubin Wald test       Chi-sq(4)=   9.14    P-val=0.0577
> Stock-Wright LM S statistic    Chi-sq(4)=   9.22    P-val=0.0557
>
> NB: Underidentification, weak identification and
> weak-identification-robust test statistics heteroskedasticity and
> autocorrelation-robust
>
> Number of observations             N  = 3021
> Number of regressors               K  =   28
> Number of endogenous regressors    K1 =    1
> Number of instruments              L  =   31
> Number of excluded instruments     L1 =    4
>
> 2-Step GMM estimation
>
> Estimates efficient for arbitrary heteroskedasticity and
> autocorrelation
> Statistics robust to heteroskedasticity and autocorrelation
>   kernel=Bartlett; bandwidth=2
>   time variable (t):  year
>   group variable (i): firm
>
> Number of obs         = 3021
> F( 28, 2544)          = 3.02
> Prob > F              = 0.0000
> Total (centered) SS   = 21.06783592
> Centered R2           = 0.0261
> Total (uncentered) SS = 21.06783592
> Uncentered R2         = 0.0261
> Residual SS           = 20.51803233
> Root MSE              = .08932
>
>                          Robust
> ltdbv           Coef.  Std. Err.       z     P>z  [95% Conf.  Interval]
> liq         -.0085538   .0049465   -1.73   0.084   -.0182487   .0011411
> lnsale       .0053743   .0052578    1.02   0.307   -.0049307   .0156794
> tang         .1170177   .0610377    1.92   0.055   -.0026139   .2366493
> itang        .0557467   .0239463    2.33   0.020    .0088127   .1026806
> itangdum     .0123551   .0065003    1.90   0.057   -.0003853   .0250955
> tax         -.0193497     .00924   -2.09   0.036   -.0374598  -.0012396
> prof         .0025405   .0027681    0.92   0.359   -.0028849   .0079659
> mtb         -.0019451   .0019992   -0.97   0.331   -.0058635   .0019733
> capexsa      .0108254   .0087886    1.23   0.218      -.0064   .0280507
> ndts        -.0022495   .0032416   -0.69   0.488    -.008603    .004104
> yr90        -.0860865   .1693451   -0.51   0.611   -.4179968   .2458238
> yr91        -.0057954   .0156291   -0.37   0.711    -.036428   .0248371
> yr92         .0060493   .0148008    0.41   0.683   -.0229596   .0350583
> yr93        -.0066494   .0154936   -0.43   0.668   -.0370163   .0237174
> yr94        -.0038801   .0137634   -0.28   0.778   -.0308559   .0230956
> yr95        -.0021814   .0139629   -0.16   0.876   -.0295482   .0251854
> yr96          .007044   .0137418    0.51   0.608   -.0198895   .0339775
> yr97         .0119441   .0134385    0.89   0.374   -.0143949   .0382831
> yr98         .0069794    .013185    0.53   0.597   -.0188627   .0328216
> yr99         .0132963   .0125952    1.06   0.291   -.0113898   .0379825
> yr00         .0080221   .0119826    0.67   0.503   -.0154633   .0315074
> yr01        -.0000815   .0107388   -0.01   0.994   -.0211291   .0209661
> yr02         .0001449   .0106504    0.01   0.989   -.0207295   .0210193
> yr03         .0106314   .0115621    0.92   0.358   -.0120299   .0332926
> yr04         .0097052   .0102908    0.94   0.346   -.0104643   .0298748
> yr05         .0156916   .0108831    1.44   0.149   -.0056388   .0370221
> yr06         .0093837   .0108831    0.86   0.389   -.0119467   .0307142
> yr07          .005672   .0086985    0.65   0.514   -.0113768   .0227207
>
> Underidentification test (Kleibergen-Paap rk LM statistic):   58.301
>                                         Chi-sq(4) P-val =     0.0000
>
> Weak identification test (Cragg-Donald Wald F statistic):     78.647
>                          (Kleibergen-Paap rk Wald F statistic): 20.198
> Stock-Yogo weak ID test critical values:
>      5% maximal IV relative bias    16.85
>     10% maximal IV relative bias    10.27
>     20% maximal IV relative bias     6.71
>     30% maximal IV relative bias     5.34
>     10% maximal IV size             24.58
>     15% maximal IV size             13.96
>     20% maximal IV size             10.26
>     25% maximal IV size              8.31
> Source: Stock-Yogo (2005). Reproduced by permission.
> NB: Critical values are for Cragg-Donald F statistic and i.i.d.
> errors.
>
> Hansen J statistic (overidentification test of all
> instruments):                                                  5.596
>                                         Chi-sq(3) P-val =     0.1330
> Instrumented:         liq
> Included instruments: lnsale tang itang itangdum tax prof mtb capexsa
>                       ndts yr90 yr91 yr92 yr93 yr94 yr95 yr96 yr97
>                       yr98 yr99 yr00 yr01 yr02 yr03 yr04 yr05 yr06
>                       yr07
> Excluded instruments: tang1 itang1 mtb1 liq1
> Dropped collinear:    yr08
>
> .
>
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu
> [owner-statalist@hsphsun2.harvard.edu] on behalf of Schaffer, Mark E
> [M.E.Schaffer@hw.ac.uk]
> Sent: 25 June 2012 12:33
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: Interpreting Kleibergen Paap weak instrument
> statistic
>
> James,
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> > Fitzgerald, James
> > Sent: 21 June 2012 14:02
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: Interpreting Kleibergen Paap weak instrument statistic
> >
> > Hi Statalist users
> >
> > I am using xtivreg2 to estimate a GMM-IV model (I specify the
> > following options; fe robust bw(2) gmm2s). I am not assuming i.i.d
> > errors, and thus when testing for weak instruments I am using the
> > Kleibergen Paap rk wald F statistic rather than the Cragg Donald
> > wald F statistic.
> >
> > xtivreg2 produces Stock-Yogo critical values for the Cragg Donald
> > statistic assuming i.i.d errors, so I'm not sure how to interpret
> > the KP rk wald F stat.
> >
> > The help file for ivreg2 (Baum, Schaffer and Stillman, 2010) does
> > however mention the following:
> >
> > When the i.i.d. assumption is dropped and ivreg2 is invoked with the
> > robust, bw or cluster options, the Cragg-Donald-based weak
> > instruments test is no longer valid.
> > ivreg2 instead reports a correspondingly-robust Kleibergen-Paap Wald
> > rk F statistic. The degrees of freedom adjustment for the rk
> > statistic is (N-L)/L1, as with the Cragg-Donald F statistic, except
> > in the cluster-robust case, when the adjustment is N/(N-1) *
> > (N_clust-1)/N_clust, following the standard Stata small-sample
> > adjustment for cluster-robust. In the case of two-way clustering,
> > N_clust is the minimum of N_clust1 and N_clust2. The critical
> > values reported by ivreg2 for the Kleibergen-Paap statistic are the
> > Stock-Yogo critical values for the Cragg-Donald i.i.d. case.
> > The critical values reported with 2-step GMM are the Stock-Yogo IV
> > critical values, and the critical values reported with CUE are the
> > LIML critical values.
> >
> > My understanding of the end of the paragraph is that the KP stat can
> > still be compared to the Stock-Yogo values produced by STATA in
> > determining whether or not instruments are weak.
> >
> > If someone could confirm or reject this I would be eternally
> > grateful!!
>
> I wrote that paragraph, so the ambiguity is partly my fault. But the
> problem is that there are no concrete results in the literature for
> testing for weak IVs when the i.i.d. assumption fails. The only thing
> one can do (that I'm aware of, anyway) is to point to stats that have
> an asymptotic justification in a test of underidentification, which is
> what the output of -ivreg2- does. That is, the K-P stat can be used
> to test for underidentification without the i.i.d. assumption, and
> under i.i.d.
> it has the same distribution under the null as the Cragg-Donald stat.
> This justification is different from that underlying the Stock-Yogo
> critical values, so this is pretty hand-wavey.
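
As an illustrative aside, the informal comparison discussed above (holding the Kleibergen-Paap rk Wald F statistic against the Stock-Yogo critical values) can be written out explicitly. The sketch below uses the statistics and critical values copied from the output posted in this thread. The function name `thresholds_exceeded` is hypothetical. The critical values are derived for the Cragg-Donald statistic under i.i.d. errors, so applying them to the K-P statistic is only the hand-wavey heuristic described above, not a formal test.

```python
# Stock-Yogo critical values for K1=1 and L1=4, copied from the posted
# output. Keys are the tolerated maximal relative bias / maximal size.
MAX_REL_BIAS = {0.05: 16.85, 0.10: 10.27, 0.20: 6.71, 0.30: 5.34}
MAX_SIZE = {0.10: 24.58, 0.15: 13.96, 0.20: 10.26, 0.25: 8.31}


def thresholds_exceeded(stat, table):
    """Return the tolerance levels whose critical value `stat` exceeds."""
    return sorted(level for level, crit in table.items() if stat > crit)


cd_f = 78.647  # Cragg-Donald Wald F (valid benchmark only under i.i.d.)
kp_f = 20.198  # Kleibergen-Paap rk Wald F (robust counterpart)

kp_bias = thresholds_exceeded(kp_f, MAX_REL_BIAS)
kp_size = thresholds_exceeded(kp_f, MAX_SIZE)
cd_size = thresholds_exceeded(cd_f, MAX_SIZE)

print("K-P F exceeds relative-bias critical values at:", kp_bias)
print("K-P F exceeds size critical values at:", kp_size)
print("C-D F exceeds size critical values at:", cd_size)
```

On these numbers the K-P statistic clears every relative-bias critical value but falls short of the 10% maximal size value, while the Cragg-Donald statistic clears all of them. How much weight to put on the K-P comparison is exactly the open question raised in the reply.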
>
> The alternative is weak-instrument-robust estimation, a la
> Anderson-Rubin, Moreira, Kleibergen, etc. The Finlay-Magnusson
> -rivtest- command, available via ssc ideas in the usual way, supports
> this. Also see their accompanying SJ paper (vol. 9 no. 3).
> The command doesn't directly support panel data estimation, which is
> what you have, but you could just demean your variables by hand.
>
> HTH,
> Mark
>
> > Best wishes
> >
> > James Fitzgerald
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/

--
Heriot-Watt University is the Sunday Times Scottish University of the
Year 2011-2012

Heriot-Watt University is a Scottish charity registered under charity
number SC000278.