Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: First stage of panel IV
From
"Schaffer, Mark E" <[email protected]>
To
<[email protected]>
Subject
RE: st: RE: First stage of panel IV
Date
Sat, 16 Jun 2012 12:24:28 +0100
Hi Filippos. Sorry for the delay in replying. Some responses below:
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Filippos Petroulakis
> Sent: 13 June 2012 00:15
> To: [email protected]
> Subject: RE: st: RE: First stage of panel IV
>
> Hi Mark,
>
> 1) I have version 01.0.13 for xtivreg2, 03.0.08 for ivreg2,
> and 01.3.01 for ranktest
xtivreg2 is up-to-date but ivreg2 and ranktest are not.
I tried to replicate your problem (different first-stage results
reported by xtivreg2 and official xtivreg when there are singletons) but
couldn't - using the up-to-date versions, the first-stage and final
outputs of xtivreg2 and xtivreg match.
If updating doesn't solve your problem, can you contact me off-list and
we can try to work out what's going on?
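In case it helps, here is a minimal sketch of one way to check what is installed and bring the three packages up to date. It assumes they were installed from SSC and that you are allowed to overwrite them; -which- simply prints the location and version line of each ado-file.

```stata
* Show where each package lives and which version is installed
which xtivreg2
which ivreg2
which ranktest

* Re-install the current versions from SSC, overwriting the old files
* (assumes the packages originally came from SSC)
ssc install ranktest, replace
ssc install ivreg2, replace
ssc install xtivreg2, replace
```

After updating, it may be worth typing -discard- (or restarting Stata) so that old copies of the programs are not still held in memory.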
> 2) I get the exact same results. In fact I run both and
> create a variable equal to e(sample) for each and they are identical
>
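For anyone who wants to repeat the e(sample) check described in 2), a rough sketch follows. The dataset and variable names are stand-ins (Stata's nlswork example panel), not the data from this thread, and the specification is chosen only so that both commands run.

```stata
* Stand-in data: nlswork is a Stata example panel, NOT the poster's data
webuse nlswork, clear
xtset idcode year

* Official command, first differences; flag its estimation sample
xtivreg ln_wage age not_smsa (tenure = union south), fd
generate byte in_xtivreg = e(sample)

* User-written command, same specification; flag its estimation sample
xtivreg2 ln_wage age not_smsa (tenure = union south), fd
generate byte in_xtivreg2 = e(sample)

* Off-diagonal cells are observations used by one command but not the other
tabulate in_xtivreg in_xtivreg2
```

If the off-diagonal cells are empty, the two commands used the same observations, so any remaining difference is in what is reported rather than in the sample.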
> 3) You write
>
> >It sounds like that's because the instrumenting of either w or q is
> >weak. But that's what the Angrist-Pischke F-stats are for. If the AP
> >F-stat for the regressor of interest is respectable, then you're OK.
>
> But I'm not instrumenting for them. The problem arises when I
> merely put them in the regression so they are included in the
> first stage of x on z and the other covariates (so that they
> become included instruments).
Apologies - in your previous posting you said that in
y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
"w_it and q_it are correlated with x_it and with e_it"
so I thought you were instrumenting for them. But now I understand.
So ... if I understand correctly, what is happening is that b_1, the
coefficient on the endogenous regressor x_it, becomes weakly identified
when you include w and q as exogenous regressors.
I think what is happening is that the component of x that your excluded
instrument z is correlated with is also correlated with w and q.
If you think about it in a mechanical way, the weak ID diagnostic is the
first-stage F stat for the significance of z in the regression
x_it = b_k demographic_covariates + z_it + v_it
The F for a test of the significance of z in the above regression is
big, but when you add w and q,
x_it = b_k demographic_covariates + z_it + w_it + q_it + v_it
the F for the test of z becomes small. So, loosely speaking, a lot of
the ability of z to explain x disappears when it has to compete with w
and q. Presumably the SEs on w and q are on the small side.
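The mechanics can be sketched with made-up data. Everything below is an assumption for illustration only: the data are simulated and cross-sectional (no panel, no differencing), and the variable c is a hypothetical common component that drives x, z, w and q. The point is just that z explains x well on its own, but much less once w and q, which measure the same component more precisely, are in the regression.

```stata
* Illustration only: simulated data, not the dataset discussed in this thread
clear
set seed 20120616
set obs 2000

* c is a hypothetical common component shared by x, z, w and q
generate c = rnormal()
* z: excluded instrument, a noisy measure of c
generate z = c + rnormal()
* w and q: controls, more precise measures of the same component
generate w = c + 0.3*rnormal()
generate q = c + 0.3*rnormal()
* x: the endogenous regressor
generate x = c + rnormal()

* First stage without w and q: F test on z
regress x z
test z
scalar F_short = r(F)

* First stage with w and q included: F test on z
regress x z w q
test z
scalar F_long = r(F)

display "F on z without w and q: " F_short
display "F on z with w and q:    " F_long
```

In this setup the second F should be far smaller than the first, because most of what z knows about x is already captured by w and q.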
I don't know what your application is, but it's possible that this could
be a case of what Angrist and Pischke ("Mostly Harmless Econometrics",
Princeton U.P. 2009, pp. 64-68) call the "bad control" problem. If so,
the solution is to omit w and q altogether.
HTH,
Mark
> Thanks again,
>
> Filippos
>
> >>> "Schaffer, Mark E" <[email protected]> 06/12/12 4:36 AM >>>
> Filippos,
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of Filippos
> > Petroulakis
> > Sent: 12 June 2012 04:11
> > To: [email protected]
> > Subject: Re: st: RE: First stage of panel IV
> >
> > Hi Mark and thanks for your response. I am using Stata 11 and the
> > up-to-date version of xtivreg2.
>
> Can you check/report to us your versions of xtivreg2, ivreg2
> and ranktest?
>
> > I'll start another list, as it's getting too crowded.
> >
> > 1) I was probably mistaken about this. Just to make sure, is running
> > OLS on a panel with all variables differenced identical to running
> > first differences?
>
> Should be the case.
>
> > 2) About xtivreg versus xtivreg2, I am certain; to make sure, I ran
> > xtivreg2, then copied the code and just removed the 2. For
> > xtivreg2 the first stage output is
> >
> >
> > . xtivreg2 y (x = z) ///
> >     l.bzo ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ///
> >     ln_fraction_male ln_pop_hisp_all ///
> >     ln_pop_nh_black ln_pop_nh_white, fd small first robust
> >
> > FIRST DIFFERENCES ESTIMATION
> > ----------------------------
> > Number of groups =      1026                    Obs per group: min =         1
> >                                                                avg =       1.9
> >                                                                max =         2
> >
> > First-stage regressions
> > -----------------------
> >
> > First-stage regression of D.ln_stim_forfd:
> >
> > OLS estimation
> > --------------
> >
> > Estimates efficient for homoskedasticity only
> > Statistics robust to heteroskedasticity
> >
> >                                                       Number of obs =     1928
> >                                                       F( 11,  1916) =    53.82
> >                                                       Prob > F      =   0.0000
> > Total (centered) SS     =  1631.878852                Centered R2   =   0.2035
> > Total (uncentered) SS   =  59641.31821                Uncentered R2 =   0.9782
> > Residual SS             =  1299.720342                Root MSE      =    .8236
> >
> > ------------------------------------------------------------------------------
> >           D. |               Robust
> >            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> > -------------+----------------------------------------------------------------
> >       ln_bzo |
> >          LD. |  -.1256429   .0767888    -1.64   0.102    -.2762413    .0249555
> >              |
> >  ln_tunder15 |
> >          D1. |   5.698369   3.323788     1.71   0.087    -.8202531    12.21699
> >              |
> >   ln_t15to24 |
> >          D1. |   6.831107   2.217364     3.08   0.002     2.482406    11.17981
> >              |
> >   ln_t25to44 |
> >          D1. |   32.90151   3.672888     8.96   0.000     25.69823    40.10479
> >              |
> >   ln_t45to64 |
> >          D1. |   8.227723   3.417776     2.41   0.016     1.524771    14.93068
> >              |
> >   ln_t65plus |
> >          D1. |    4.88607   2.705773     1.81   0.071    -.4205002    10.19264
> >              |
> > ln_fractio~e |
> >          D1. |   -3.86913    9.62245    -0.40   0.688    -22.74071    15.00245
> >              |
> > ln_pop_his~l |
> >          D1. |  -6.352615   1.729689    -3.67   0.000    -9.744887   -2.960343
> >              |
> > ln_pop_nh_~k |
> >          D1. |   7.426661   3.028961     2.45   0.014     1.486254    13.36707
> >              |
> > ln_pop_nh_~e |
> >          D1. |  -5.481275   3.728017    -1.47   0.142    -12.79267    1.830123
> >              |
> >            z |
> >          D1. |   3.404714   .2067276    16.47   0.000     2.999279    3.810149
> >              |
> >        _cons |   5.682521   .0508073   111.84   0.000     5.582877    5.782164
> > ------------------------------------------------------------------------------
> >
> >
> > With xtivreg it is
> >
> >
> > . xtivreg y (x = z) ///
> >     l.ln_crime ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ///
> >     ln_fraction_male ln_pop_hisp_all ///
> >     ln_pop_nh_black ln_pop_nh_white, fd small first
> >
> > First-stage first-differenced regression
> >
> >       Source |       SS       df       MS              Number of obs =     901
> > -------------+------------------------------           F( 11,   889) =   17.42
> >        Model |  58.4561519    11  5.31419563           Prob > F      =  0.0000
> >     Residual |  271.136523   889  .304990465           R-squared     =  0.1774
> > -------------+------------------------------           Adj R-squared =  0.1672
> >        Total |  329.592675   900  .366214084           Root MSE      =  .55226
> >
> > ------------------------------------------------------------------------------
> >           D. |
> >            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> > -------------+----------------------------------------------------------------
> >       ln_bzo |
> >          LD. |  -.0553261   .0661387    -0.84   0.403    -.1851322    .0744801
> >              |
> >  ln_tunder15 |
> >          D1. |     14.637   2.988974     4.90   0.000     8.770729    20.50326
> >              |
> >   ln_t15to24 |
> >          D1. |   12.06724   2.274707     5.30   0.000     7.602818    16.53166
> >              |
> >   ln_t25to44 |
> >          D1. |   28.67612   3.411086     8.41   0.000      21.9814    35.37084
> >              |
> >   ln_t45to64 |
> >          D1. |   9.750316   3.753647     2.60   0.010     2.383274    17.11736
> >              |
> >   ln_t65plus |
> >          D1. |   9.117776   2.449422     3.72   0.000     4.310454     13.9251
> >              |
> > ln_fractio~e |
> >          D1. |   20.81802   9.297503     2.24   0.025     2.570409    39.06564
> >              |
> > ln_pop_his~l |
> >          D1. |  -3.244084   1.379805    -2.35   0.019     -5.95214   -.5360282
> >              |
> > ln_pop_nh_~k |
> >          D1. |  -2.669608   2.386585    -1.12   0.264    -7.353605     2.01439
> >              |
> > ln_pop_nh_~e |
> >          D1. |  -18.12162   4.160963    -4.36   0.000    -26.28808   -9.955169
> >              |
> >            z |
> >          D1. |  -1.102999   .2832308    -3.89   0.000    -1.658877   -.5471196
> >              |
> >        _cons |   6.306457   .0505186   124.83   0.000     6.207308    6.405607
> > ------------------------------------------------------------------------------
> >
> >
> >
> > The sample size is less than half in xtivreg. What is particularly odd
> > is the fact that the 2nd stage coefficients are very close and the
> > reported observations and groups are now 1926 and 1025 for xtivreg, so
> > just one group less than xtivreg2. Is it perhaps some reporting issue?
> > I really don't understand this. Just so I'm clear, the large
> > difference, especially in the coefficient of the instrument, is in the
> > first stage, while the second stages are very similar.
>
> This is curious. I am just about to travel but I will look into it.
>
> One thing that comes to mind is that xtivreg2 with FDs may be
> reporting N=the entire sample including the singletons (group
> size=1) that drop out.
>
> Perhaps try running -xtivreg,fd- and then -xtivreg2,fd if
> e(sample)- so that they use the same sample. Do you get the
> same results?
>
> > 3)
> >
> > >This is confusing. Do you mean that w_it and q_it are correlated with
> > >x_it? That's not a problem. The key requirement is that in
> > >
> > >y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
> > >
> > >w_it and q_it should be uncorrelated with e_it.
> >
> > Yes, w_it and q_it are correlated with x_it and with e_it.
>
> To repeat, correlation with x_it is not a problem, but
> correlation with e_it is.
>
> > >If they are correlated,
>
> with e_it (sorry)
>
> > >then you have two options: (1) Add w and q to
> > >your list of endogenous variables. But as you say, you will need
> > >instruments for them. And if you aren't interested in a causal
> > >interpretation, then maybe you shouldn't bother. (2) Instead of using
> > >w and q as regressors and instrumenting them, insert the
> > >instruments as (exogenous) regressors.
> >
> > I am not interested in instrumenting, just conditioning, but my
> > problem is that once I add them the previously very high F-stat in the
> > first stage goes down to the point of indicating weak instruments.
>
> It sounds like that's because the instrumenting of either w
> or q is weak. But that's what the Angrist-Pischke F-stats
> are for. If the AP F-stat for the regressor of interest is
> respectable, then you're OK.
>
> > That is basically my concern.
> > Concerning your second suggestion, do you mean that I should just drop w
> > and q from the model and replace them with instruments?
>
> Yes, exactly. You'll be estimating a semi-reduced form.
>
> --Mark
>
> > Thanks again for your help, it is very much appreciated.
> >
> > Best,
> >
> > Filippos
> >
> > >>> "Schaffer, Mark E" <[email protected]> 06/11/12 7:02 AM >>>
> > Filippos,
> >
> > You need to tell us more - what versions of software you are using,
> > what the actual output is (or the relevant pieces of the output), etc.
> >
> > More comments below.
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf
> Of Filippos
> > > Petroulakis
> > > Sent: Sunday, June 10, 2012 11:59 PM
> > > To: [email protected]
> > > Subject: st: First stage of panel IV
> > >
> > > Hi all,
> > >
> > > I am running a panel first differences (fixed effects) model.
> > > My regression is of the sort
> > >
> > > y_it=b_0+b_1 x_it + b_k demographic_covariates + e_it
> > >
> > > x_it is endogenous so I have an instrument z_it.
> > >
> > > I essentially have 3 issues and I list them in descending order of
> > > importance.
> > >
> > > 1) xtivreg2 is running the first stage of x_it on z_it and the
> > > exogenous demographic covariates as OLS instead of fixed effects or
> > > first differences.
> >
> > I doubt it very much (having programmed -xtivreg2-, I think I'm well
> > placed to say this!). -xtivreg2- follows the standard procedure of
> > transforming the full set of variables used in the same way, i.e., the
> > within or between transformation is applied to all variables.
> >
> > > I honestly do not know whether
> > > this is due to theory but it seems to be very odd, especially given
> > > that fixed effects is definitely the correct specification for the
> > > model as a whole, and so I would think it has to be the case for the
> > > first stage as well. I can do the 2 stages manually and correct the
> > > standard errors using the process outlined here
> > > (http://www.stata.com/support/faqs/stat/ivreg.html) and I presume the
> > > fact that I have a panel doesn't change much, but my issue is
> > > basically whether this is the correct thing to do.
> > >
> > > 2) xtivreg and xtivreg2 give me pretty different results, which is due
> > > to the fact that xtivreg drops about half of the observations in the
> > > first stage. I checked and the variable that is causing the dropping
> > > (for whatever reason) is the dependent variable. I am thus positive
> > > that xtivreg is the wrong one but am still worried. Does anyone know
> > > why this happens?
> >
> > Again, I doubt the problem is the one you suspect. My guess is that
> > you are using different estimators, e.g., fixed effects with
> > -xtivreg2- and random effects with -xtivreg-. But you need to show us
> > the output.
> >
> > > 3) Finally, at some point I will need to include a further two
> > > variables, call them w_it and q_it, which are surely endogenous. I
> > > don't care about instrumenting for them as I am not interested in a
> > > causal interpretation, but the problem is that they are also
> > > endogenous to x_it. So my first stage will be a regression of x_it on
> > > variables that are endogenous to itself and to y_it. Is that an issue
> > > I should be concerned about?
> >
> > This is confusing. Do you mean that w_it and q_it are correlated with
> > x_it? That's not a problem. The key requirement is that in
> >
> > y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
> >
> > w_it and q_it should be uncorrelated with e_it. If they are
> > correlated, then you have two options: (1) Add w and q to your list of
> > endogenous variables. But as you say, you will need instruments for
> > them. And if you aren't interested in a causal interpretation, then
> > maybe you shouldn't bother. (2) Instead of using w and q as
> > regressors and instrumenting them, insert the instruments as
> > (exogenous) regressors.
> >
> > HTH,
> > Mark
> >
> > > Thank you very much in advance - answers to any or all of those issues
> > > will be immensely appreciated.
> > >
> > > Best,
> > >
> > > Filippos Petroulakis
> > >
> > > *
> > > * For searches and help try:
> > > * http://www.stata.com/help.cgi?search
> > > * http://www.stata.com/support/statalist/faq
> > > * http://www.ats.ucla.edu/stat/stata/
> > >
> >
> >
> > --
> > Heriot-Watt University is the Sunday Times Scottish University of the
> > Year 2011-2012
> >
> > Heriot-Watt University is a Scottish charity registered under charity
> > number SC000278.