Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: First stage of panel IV
From: "Schaffer, Mark E" <[email protected]>
To: <[email protected]>
Subject: RE: st: RE: First stage of panel IV
Date: Tue, 12 Jun 2012 09:35:19 +0100
Filippos,
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Filippos Petroulakis
> Sent: 12 June 2012 04:11
> To: [email protected]
> Subject: Re: st: RE: First stage of panel IV
>
> Hi Mark and thanks for your response. I am using Stata 11 and
> the up to date version of xtivreg2.
Can you check/report to us your versions of xtivreg2, ivreg2 and
ranktest?
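A minimal way to collect that information, assuming all three packages were installed from SSC. -which- prints the path and version header of each ado-file, which can then be pasted into a reply.

```stata
* Print the location and version header of each user-written command
which xtivreg2
which ivreg2
which ranktest

* Optional: list which of these packages have newer versions available.
* Without the -update- option this only reports; it changes nothing.
adoupdate xtivreg2 ivreg2 ranktest
```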
> I'll start another list
> as it's getting too crowded
>
> 1) I was probably mistaken about this. Just to make sure, is
> running OLS on a panel with all variables differenced
> identical to running first differences?
Should be the case.
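A sketch of the manual check for the first stage. The names id, year, x, z, and c1-c3 are placeholders (panel identifier, time variable, endogenous regressor, instrument, and exogenous covariates), not the variables in the output below.

```stata
* Declare the panel so that the D. operator differences within panel
xtset id year

* First stage of the FD model by hand: OLS on the differenced variables.
* The constant is kept, matching the _cons row in the FD output.
regress D.x D.z D.c1 D.c2 D.c3, vce(robust)

* Number of observations actually used after differencing
display e(N)
```

The coefficients from this regression should match the first stage reported by a first-differences estimator on the same observations, so e(N) is the number to compare if they do not.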
> 2) About xtivreg versus xtivreg2, I am certain. To make
> sure, I ran xtivreg2, then copied the code and just
> removed the 2. For xtivreg2 the first stage output is
>
>
> .
> . xtivreg2 y ( x =z) ///
> l.bzo ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64
> ln_t65plus ln_fraction_male ln_pop_hisp_all
> ln_pop_nh_black ln_pop_nh_white, fd small first robust
>
> FIRST DIFFERENCES ESTIMATION
> ----------------------------
> Number of groups =      1026                    Obs per group: min =         1
>                                                                avg =       1.9
>                                                                max =         2
>
> First-stage regressions
> -----------------------
>
> First-stage regression of D.ln_stim_forfd:
>
> OLS estimation
> --------------
>
> Estimates efficient for homoskedasticity only
> Statistics robust to heteroskedasticity
>
>                                                       Number of obs =     1928
>                                                       F( 11,  1916) =    53.82
>                                                       Prob > F      =   0.0000
> Total (centered) SS     =  1631.878852                Centered R2   =   0.2035
> Total (uncentered) SS   =  59641.31821                Uncentered R2 =   0.9782
> Residual SS             =  1299.720342                Root MSE      =    .8236
>
> ------------------------------------------------------------------------------
>           D. |               Robust
>            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       ln_bzo |
>          LD. |  -.1256429   .0767888    -1.64   0.102    -.2762413    .0249555
>              |
>  ln_tunder15 |
>          D1. |   5.698369   3.323788     1.71   0.087    -.8202531    12.21699
>              |
>   ln_t15to24 |
>          D1. |   6.831107   2.217364     3.08   0.002     2.482406    11.17981
>              |
>   ln_t25to44 |
>          D1. |   32.90151   3.672888     8.96   0.000     25.69823    40.10479
>              |
>   ln_t45to64 |
>          D1. |   8.227723   3.417776     2.41   0.016     1.524771    14.93068
>              |
>   ln_t65plus |
>          D1. |    4.88607   2.705773     1.81   0.071    -.4205002    10.19264
>              |
> ln_fractio~e |
>          D1. |   -3.86913    9.62245    -0.40   0.688    -22.74071    15.00245
>              |
> ln_pop_his~l |
>          D1. |  -6.352615   1.729689    -3.67   0.000    -9.744887   -2.960343
>              |
> ln_pop_nh_~k |
>          D1. |   7.426661   3.028961     2.45   0.014     1.486254    13.36707
>              |
> ln_pop_nh_~e |
>          D1. |  -5.481275   3.728017    -1.47   0.142    -12.79267    1.830123
>              |
>            z |
>          D1. |   3.404714   .2067276    16.47   0.000     2.999279    3.810149
>              |
>        _cons |   5.682521   .0508073   111.84   0.000     5.582877    5.782164
> ------------------------------------------------------------------------------
>
>
> With xtivreg it is
>
>
> . xtivreg y ( x =z) ///
> l.ln_crime ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64
> ln_t65plus ln_fraction_male ln_pop_hisp_all
> ln_pop_nh_black ln_pop_nh_white, fd small first
>
> First-stage first-differenced regression
>
>       Source |       SS       df       MS              Number of obs =     901
> -------------+------------------------------           F( 11,   889) =   17.42
>        Model |  58.4561519    11  5.31419563           Prob > F      =  0.0000
>     Residual |  271.136523   889  .304990465           R-squared     =  0.1774
> -------------+------------------------------           Adj R-squared =  0.1672
>        Total |  329.592675   900  .366214084           Root MSE      =  .55226
>
> ------------------------------------------------------------------------------
>           D. |
>            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       ln_bzo |
>          LD. |  -.0553261   .0661387    -0.84   0.403    -.1851322    .0744801
>              |
>  ln_tunder15 |
>          D1. |     14.637   2.988974     4.90   0.000     8.770729    20.50326
>              |
>   ln_t15to24 |
>          D1. |   12.06724   2.274707     5.30   0.000     7.602818    16.53166
>              |
>   ln_t25to44 |
>          D1. |   28.67612   3.411086     8.41   0.000      21.9814    35.37084
>              |
>   ln_t45to64 |
>          D1. |   9.750316   3.753647     2.60   0.010     2.383274    17.11736
>              |
>   ln_t65plus |
>          D1. |   9.117776   2.449422     3.72   0.000     4.310454     13.9251
>              |
> ln_fractio~e |
>          D1. |   20.81802   9.297503     2.24   0.025     2.570409    39.06564
>              |
> ln_pop_his~l |
>          D1. |  -3.244084   1.379805    -2.35   0.019     -5.95214   -.5360282
>              |
> ln_pop_nh_~k |
>          D1. |  -2.669608   2.386585    -1.12   0.264    -7.353605     2.01439
>              |
> ln_pop_nh_~e |
>          D1. |  -18.12162   4.160963    -4.36   0.000    -26.28808   -9.955169
>              |
>            z |
>          D1. |  -1.102999   .2832308    -3.89   0.000    -1.658877   -.5471196
>              |
>        _cons |   6.306457   .0505186   124.83   0.000     6.207308    6.405607
> ------------------------------------------------------------------------------
>
>
>
> The sample size is less than half in xtivreg. What is
> particularly odd is the fact that the 2nd stage coefficients
> are very close and the reported observations and groups are
> now 1926 and 1025 for xtivreg, so just one group less than
> xtivreg2. Is it perhaps some reporting issue? I really don't
> understand this. Just so I'm clear, the large difference,
> especially in the coefficient of the instrument, is in the
> first stage, while the second stages are very similar.
This is curious. I am just about to travel but I will look into it.
One thing that comes to mind is that xtivreg2 with FDs may be reporting
N=the entire sample including the singletons (group size=1) that drop
out.
Perhaps try running -xtivreg, fd- and then -xtivreg2 ... if e(sample), fd-
so that they use the same sample. Do you get the same results?
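A sketch of that comparison, assuming the data are already -xtset-. The names y, x, z, c1-c3, and id are placeholders for the outcome, the endogenous regressor, the instrument, the exogenous covariates, and the panel identifier.

```stata
* Estimate with the official command first
xtivreg y c1 c2 c3 (x = z), fd small first

* Store the estimation sample before it is overwritten by the next command
generate byte insample = e(sample)

* Re-estimate with xtivreg2 on the same observations
xtivreg2 y c1 c2 c3 (x = z) if insample, fd small first robust

* Observations per panel within the marked sample; panels with a single
* observation are the singletons that cannot contribute a difference
bysort id: egen n_used = total(insample)
tabulate n_used if insample
```

Whether the lagged observations needed to form each difference survive the -if- restriction depends on how each command builds its sample, so the counts are best read as a diagnostic rather than as a definitive reconciliation of the two sample sizes.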
> 3)
>
> >This is confusing. Do you mean that w_it and q_it are correlated with
> >x_it? That's not a problem. The key requirement is that in
> >
> >y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
> >
> >w_it and q_it should be uncorrelated with e_it.
>
> Yes, w_it and q_it are correlated with x_it and with e_it.
To repeat, correlation with x_it is not a problem, but correlation with
e_it is.
> >If they are correlated,
with e_it (sorry)
> >then you have two options: (1) Add w and q to your list of endogenous
> >variables. But as you say, you will need instruments for them. And if
> >you aren't interested in a causal interpretation, then maybe you
> >shouldn't bother. (2) Instead of using w and q as regressors and
> >instrumenting them, insert the instruments as (exogenous) regressors.
>
> I am not interested in instrumenting, just conditioning, but
> my problem is that once I add them the previously very high
> F-stat in the first stage goes down to the point of
> indicating weak instruments.
It sounds like that's because the instrumenting of either w or q is
weak. But that's what the Angrist-Pischke F-stats are for. If the AP
F-stat for the regressor of interest is respectable, then you're OK.
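A sketch of where those statistics come from, with placeholder names: x is the regressor of interest, w and q are the additional endogenous regressors, and z, zw, zq are hypothetical instruments. With several endogenous regressors, the -first- option prints a first-stage F-type statistic for each one separately; the exact label (Angrist-Pischke or otherwise) depends on the installed ivreg2 version.

```stata
* All three regressors treated as endogenous; c1-c3 are exogenous covariates
xtivreg2 y c1 c2 c3 (x w q = z zw zq), fd small robust first
```

The statistic to look at is the one reported for x, since weak identification of w or q need not carry over to it.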
> That is basically my concern.
> Concerning your second suggestion, do you mean that I should just
> drop w and q from the model and replace them with instruments?
Yes, exactly. You'll be estimating a semi-reduced form.
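In command form this could look like the following, where zw and zq are hypothetical names for the instruments that would otherwise have been used for w and q, and c1-c3 stand for the demographic covariates.

```stata
* w and q are dropped; their instruments enter as exogenous regressors,
* and only x is instrumented (by z)
xtivreg2 y zw zq c1 c2 c3 (x = z), fd small robust first
```

The coefficients on zw and zq then have no structural interpretation; they only condition the estimate of the coefficient on x.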
--Mark
> Thanks again for your help, it is very much appreciated.
>
> Best,
>
> Filippos
>
> >>> "Schaffer, Mark E" <[email protected]> 06/11/12 7:02 AM >>>
> Filippos,
>
> You need to tell us more - what versions of software you are
> using, what the actual output is (or the relevant pieces of
> the output), etc.
>
> More comments below.
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of Filippos
> > Petroulakis
> > Sent: Sunday, June 10, 2012 11:59 PM
> > To: [email protected]
> > Subject: st: First stage of panel IV
> >
> > Hi all,
> >
> > I am running a panel first differences (fixed effects) model.
> > My regression is of the sort
> >
> > y_it=b_0+b_1 x_it + b_k demographic_covariates + e_it
> >
> > x_it is endogenous so I have an instrument z_it.
> >
> > I essentially have 3 issues and I list them in descending order of
> > importance.
> >
> > 1) xtivreg2 is running the first stage of x_it on z_it and the
> > exogenous demographic covariates as OLS instead of fixed effects or
> > first differences.
>
> I doubt it very much (having programmed -xtivreg2-, I think
> I'm well placed to say this!). -xtivreg2- follows the
> standard procedure of transforming the full set of variables
> used in the same way, i.e., the within or between
> transformation is applied to all variables.
>
> > I honestly do not know whether this is due to theory but it seems to
> > be very odd, especially given that fixed effects is definitely the
> > correct specification for the model as a whole, and so I would think
> > it has to be the case for the first stage as well. I can do the 2
> > stages manually and correct the errors using the process outlined
> > here (http://www.stata.com/support/faqs/stat/ivreg.html) and I
> > presume the fact that I have a panel doesn't change much, but my
> > issue is basically whether this is the correct thing to do.
> >
> > 2) xtivreg and xtivreg2 give me pretty different results, which is
> > due to the fact that xtivreg drops about half of the observations in
> > the first stage. I checked and the variable that is causing the
> > dropping (for whatever reason) is the dependent variable. I am thus
> > positive that xtivreg is the wrong one but am still worried. Does
> > anyone know why this happens?
>
> Again, I doubt the problem is the one you suspect. My guess is that
> you are using different estimators, e.g., fixed effects with
> -xtivreg2- and random effects with -xtivreg-. But you need to show us
> the output.
>
> > 3) Finally, at some point I will need to include a further two
> > variables, call them w_it and q_it, which are surely endogenous. I
> > don't care about instrumenting for them as I am not interested in a
> > causal interpretation, but the problem is that they are also
> > endogenous to x_it. So my first stage will be a regression of x_it
> > on variables that are endogenous to itself and to y_it. Is that an
> > issue I should be concerned about?
>
> This is confusing. Do you mean that w_it and q_it are
> correlated with x_it? That's not a problem. The key
> requirement is that in
>
> y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
>
> w_it and q_it should be uncorrelated with e_it. If they are
> correlated, then you have two options: (1) Add w and q to
> your list of endogenous variables. But as you say, you will
> need instruments for them. And if you aren't interested in a
> causal interpretation, then maybe you shouldn't bother. (2)
> Instead of using w and q as regressors and instrumenting
> them, insert the instruments as (exogenous) regressors.
>
> HTH,
> Mark
>
> > Thank you very much in advance - answers to any or all of those
> > issues will be immensely appreciated.
> >
> > Best,
> >
> > Filippos Petroulakis
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >
>
>
> --
> Heriot-Watt University is the Sunday Times Scottish
> University of the Year 2011-2012
>
> Heriot-Watt University is a Scottish charity registered under
> charity number SC000278.