Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: First stage of panel IV
From: "Filippos Petroulakis" <[email protected]>
To: <[email protected]>
Subject: Re: st: RE: First stage of panel IV
Date: Mon, 11 Jun 2012 23:10:38 -0400
Hi Mark, and thanks for your response. I am using Stata 11 and the up-to-date version of xtivreg2. I'll start a new numbered list, as the thread is getting crowded.
1) I was probably mistaken about this. Just to make sure, is running OLS on a panel with all variables differenced identical to running first differences?
2) About xtivreg versus xtivreg2: I am certain the commands match. To make sure, I ran xtivreg2, then copied the command and just removed the 2. For xtivreg2 the first-stage output is
. xtivreg2 y (x = z) ///
    l.bzo ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ln_fraction_male ln_pop_hisp_all ///
    ln_pop_nh_black ln_pop_nh_white, fd small first robust
FIRST DIFFERENCES ESTIMATION
----------------------------
Number of groups =      1026                    Obs per group: min =         1
                                                               avg =       1.9
                                                               max =         2

First-stage regressions
-----------------------

First-stage regression of D.ln_stim_forfd:

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =     1928
                                                      F( 11,  1916) =    53.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1631.878852                Centered R2   =   0.2035
Total (uncentered) SS   =  59641.31821                Uncentered R2 =   0.9782
Residual SS             =  1299.720342                Root MSE      =    .8236

------------------------------------------------------------------------------
          D. |               Robust
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ln_bzo |
         LD. |  -.1256429   .0767888    -1.64   0.102    -.2762413    .0249555
             |
 ln_tunder15 |
         D1. |   5.698369   3.323788     1.71   0.087    -.8202531    12.21699
             |
  ln_t15to24 |
         D1. |   6.831107   2.217364     3.08   0.002     2.482406    11.17981
             |
  ln_t25to44 |
         D1. |   32.90151   3.672888     8.96   0.000     25.69823    40.10479
             |
  ln_t45to64 |
         D1. |   8.227723   3.417776     2.41   0.016     1.524771    14.93068
             |
  ln_t65plus |
         D1. |    4.88607   2.705773     1.81   0.071    -.4205002    10.19264
             |
ln_fractio~e |
         D1. |   -3.86913    9.62245    -0.40   0.688    -22.74071    15.00245
             |
ln_pop_his~l |
         D1. |  -6.352615   1.729689    -3.67   0.000    -9.744887   -2.960343
             |
ln_pop_nh_~k |
         D1. |   7.426661   3.028961     2.45   0.014     1.486254    13.36707
             |
ln_pop_nh_~e |
         D1. |  -5.481275   3.728017    -1.47   0.142    -12.79267    1.830123
             |
           z |
         D1. |   3.404714   .2067276    16.47   0.000     2.999279    3.810149
             |
       _cons |   5.682521   .0508073   111.84   0.000     5.582877    5.782164
------------------------------------------------------------------------------
With xtivreg it is
. xtivreg y (x = z) ///
    l.ln_crime ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ln_fraction_male ln_pop_hisp_all ///
    ln_pop_nh_black ln_pop_nh_white, fd small first
First-stage first-differenced regression

      Source |       SS       df       MS              Number of obs =     901
-------------+------------------------------           F( 11,   889) =   17.42
       Model |  58.4561519    11  5.31419563           Prob > F      =  0.0000
    Residual |  271.136523   889  .304990465           R-squared     =  0.1774
-------------+------------------------------           Adj R-squared =  0.1672
       Total |  329.592675   900  .366214084           Root MSE      =  .55226

------------------------------------------------------------------------------
          D. |
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ln_bzo |
         LD. |  -.0553261   .0661387    -0.84   0.403    -.1851322    .0744801
             |
 ln_tunder15 |
         D1. |     14.637   2.988974     4.90   0.000     8.770729    20.50326
             |
  ln_t15to24 |
         D1. |   12.06724   2.274707     5.30   0.000     7.602818    16.53166
             |
  ln_t25to44 |
         D1. |   28.67612   3.411086     8.41   0.000      21.9814    35.37084
             |
  ln_t45to64 |
         D1. |   9.750316   3.753647     2.60   0.010     2.383274    17.11736
             |
  ln_t65plus |
         D1. |   9.117776   2.449422     3.72   0.000     4.310454     13.9251
             |
ln_fractio~e |
         D1. |   20.81802   9.297503     2.24   0.025     2.570409    39.06564
             |
ln_pop_his~l |
         D1. |  -3.244084   1.379805    -2.35   0.019     -5.95214   -.5360282
             |
ln_pop_nh_~k |
         D1. |  -2.669608   2.386585    -1.12   0.264    -7.353605     2.01439
             |
ln_pop_nh_~e |
         D1. |  -18.12162   4.160963    -4.36   0.000    -26.28808   -9.955169
             |
           z |
         D1. |  -1.102999   .2832308    -3.89   0.000    -1.658877   -.5471196
             |
       _cons |   6.306457   .0505186   124.83   0.000     6.207308    6.405607
------------------------------------------------------------------------------
The first-stage sample size in xtivreg is less than half that in xtivreg2 (901 versus 1928 observations). What is particularly odd is that the second-stage coefficients are very close, and the observations and groups reported there for xtivreg are 1926 and 1025, so just one group fewer than with xtivreg2. Is it perhaps a reporting issue? I really don't understand this. Just so I'm clear: the large difference, especially in the coefficient on the instrument, is in the first stage, while the second stages are very similar.
3)
>This is confusing. Do you mean that w_it and q_it are correlated with
>x_it? That's not a problem. The key requirement is that in
>
>y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
>
>w_it and q_it should be uncorrelated with e_it.
Yes, w_it and q_it are correlated with x_it and with e_it.
>If they are correlated, then you have two options: (1) Add w and q to your list of endogenous
>variables. But as you say, you will need instruments for them. And if
>you aren't interested in a causal interpretation, then maybe you
>shouldn't bother. (2) Instead of using w and q as regressors and
>instrumenting them, insert the instruments as (exogenous) regressors.
I am not interested in instrumenting them, just in conditioning on them. My problem is that once I add them, the previously very high first-stage F-statistic falls to the point of indicating weak instruments. That is basically my concern. Regarding your second suggestion, do you mean that I should just drop w and q from the model and replace them with their instruments?
Thanks again for your help, it is very much appreciated.
Best,
Filippos
>>> "Schaffer, Mark E" <[email protected]> 06/11/12 7:02 AM >>>
Filippos,
You need to tell us more - what versions of software you are using, what
the actual output is (or the relevant pieces of the output), etc.
More comments below.
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Filippos Petroulakis
> Sent: Sunday, June 10, 2012 11:59 PM
> To: [email protected]
> Subject: st: First stage of panel IV
>
> Hi all,
>
> I am running a panel first differences (fixed effects) model.
> My regression is of the sort
>
> y_it=b_0+b_1 x_it + b_k demographic_covariates + e_it
>
> x_it is endogenous so I have an instrument z_it.
>
> I essentially have 3 issues and I list them in descending
> order of importance.
>
> 1) xtivreg2 is running the first stage of x_it on z_it and
> the exogenous demographic covariates as OLS instead of fixed
> effects or first differences.
I doubt it very much (having programmed -xtivreg2-, I think I'm well
placed to say this!). -xtivreg2- follows the standard procedure of
transforming the full set of variables in the same way, i.e., the
within or between transformation is applied to all variables.
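One way to check this by hand is to run the first stage as a plain OLS regression on first-differenced variables and compare it with the first-stage table that -xtivreg2, fd first- prints. The following is only a sketch: it assumes the data have already been -xtset-, that the lagged control is ln_bzo as the output tables suggest, and that the estimation samples coincide.

```stata
* Sketch only; assumes xtset has already been run on the panel.
* Control names follow the first-stage output shown in the thread.
local demog ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ///
    ln_fraction_male ln_pop_hisp_all ln_pop_nh_black ln_pop_nh_white

* First stage by hand: every variable enters in first differences.
* The if-condition keeps only observations where D.y is also available,
* which is what the full IV estimation sample requires.
regress D.x D.z LD.ln_bzo D.(`demog') if !missing(D.y), vce(robust)
```

If the transformation is applied to all variables as described, the coefficients from this regression should be close to those in the reported first stage.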
> I honestly do not know whether
> this is due to theory but it seems to be very odd, especially
> given the fixed effects is definitely the correct
> specification for the model as a whole, and so I would think
> it has to be the case for the first stage as well. I can do
> the 2 stages manually and correct the errors using the
> process outlined here
> (http://www.stata.com/support/faqs/stat/ivreg.html) and I
> presume the fact that I have a panel doesn't change much, but
> my issue is basically whether this is the correct thing to do.
>
> 2) xtivreg and xtivreg2 give me pretty different results,
> which is due to the fact that xtivreg drops about half of the
> observations in the first stage. I checked and the variable
> that is causing the dropping (for whatever reason) is the
> dependent variable. I am thus positive that xtivreg is the
> wrong one but am still worried. Anyone knows why this happens?
Again, I doubt the problem is the one you suspect. My guess is that
you are using different estimators, e.g., fixed effects with
-xtivreg2- and random effects with -xtivreg-. But you need to show us
the output.
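Alongside the output, a count of usable observations can help show whether a sample difference comes from the data or from how a command reports its first stage. The sketch below assumes the data are -xtset- and uses the variable names from the thread; it relies only on -regress- and its e(sample), not on anything specific to -xtivreg- or -xtivreg2-.

```stata
* Sketch only; assumes xtset has already been run on the panel.
local demog ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ///
    ln_fraction_male ln_pop_hisp_all ln_pop_nh_black ln_pop_nh_white

* Observations with complete differenced data for the whole model
quietly regress D.y D.x D.z LD.ln_bzo D.(`demog')
count if e(sample)

* Observations with complete differenced data for the first stage alone
* (D.y is not required here)
quietly regress D.x D.z LD.ln_bzo D.(`demog')
count if e(sample)
```

These counts can then be set against the "Number of obs" each command reports in its first and second stages.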
> 3) Finally, at some point I will need to include a further
> two variables, call them w_it and q_it, which are surely
> endogenous. I don't care about instrumenting for them as I am
> not interested in a causal interpretation, but the problem is
> that they are also endogenous to x_it. So my first stage will
> be regression x_it on variables that are endogenous to itself
> and to y_it. Is that an issue I should be concerned about?
This is confusing. Do you mean that w_it and q_it are correlated with
x_it? That's not a problem. The key requirement is that in
y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
w_it and q_it should be uncorrelated with e_it. If they are correlated,
then you have two options: (1) Add w and q to your list of endogenous
variables. But as you say, you will need instruments for them. And if
you aren't interested in a causal interpretation, then maybe you
shouldn't bother. (2) Instead of using w and q as regressors and
instrumenting them, insert the instruments as (exogenous) regressors.
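As a rough illustration, the two options might be written as follows. This is a sketch, not a recommendation: zw and zq are hypothetical instruments for w and q that do not appear in the thread, the lagged control is assumed to be L.ln_bzo as the output tables suggest, and the data are assumed to be -xtset-.

```stata
* Sketch only. zw and zq are hypothetical instruments for w and q.
local controls L.ln_bzo ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ///
    ln_t65plus ln_fraction_male ln_pop_hisp_all ln_pop_nh_black ln_pop_nh_white

* Option (1): treat w and q as endogenous alongside x.
* This needs at least as many excluded instruments as endogenous regressors.
xtivreg2 y (x w q = z zw zq) `controls', fd small first robust

* Option (2): leave w and q out of the equation and enter their
* instruments directly as exogenous regressors.
xtivreg2 y (x = z) zw zq `controls', fd small first robust
```

In option (2) the only endogenous regressor is still x, so the first-stage F-statistic continues to refer to z alone.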
HTH,
Mark
> Thank you very much in advance - answers to any or all of
> those issues will be immensely appreciated.
>
> Best,
>
> Filippos Petroulakis
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/