Re: st: RE: First stage of panel IV

From	"Filippos Petroulakis" <[email protected]>
To	<[email protected]>
Subject	Re: st: RE: First stage of panel IV
Date	Mon, 11 Jun 2012 23:10:38 -0400
Hi Mark, and thanks for your response. I am using Stata 11 and the up-to-date version of xtivreg2. I'll start a new numbered list, as the old one is getting too crowded.

1) I was probably mistaken about this. Just to make sure: is running OLS on a panel with all variables differenced identical to running first differences?

2) About xtivreg versus xtivreg2: I am certain about this. To make sure, I ran xtivreg2, then copied the command and simply removed the 2. For xtivreg2 the first-stage output is:


. xtivreg2 y (x = z) ///
    l.bzo ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ln_fraction_male ln_pop_hisp_all ///
    ln_pop_nh_black ln_pop_nh_white, fd small first robust

FIRST DIFFERENCES ESTIMATION
----------------------------
Number of groups =      1026                    Obs per group: min =         1
                                                               avg =       1.9
                                                               max =         2

First-stage regressions
-----------------------

First-stage regression of D.ln_stim_forfd:

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =     1928
                                                      F( 11,  1916) =    53.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1631.878852                Centered R2   =   0.2035
Total (uncentered) SS   =  59641.31821                Uncentered R2 =   0.9782
Residual SS             =  1299.720342                Root MSE      =    .8236

------------------------------------------------------------------------------
D.           |               Robust
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ln_bzo |
         LD. |  -.1256429   .0767888    -1.64   0.102    -.2762413    .0249555
             |
 ln_tunder15 |
         D1. |   5.698369   3.323788     1.71   0.087    -.8202531    12.21699
             |
  ln_t15to24 |
         D1. |   6.831107   2.217364     3.08   0.002     2.482406    11.17981
             |
  ln_t25to44 |
         D1. |   32.90151   3.672888     8.96   0.000     25.69823    40.10479
             |
  ln_t45to64 |
         D1. |   8.227723   3.417776     2.41   0.016     1.524771    14.93068
             |
  ln_t65plus |
         D1. |    4.88607   2.705773     1.81   0.071    -.4205002    10.19264
             |
ln_fractio~e |
         D1. |   -3.86913    9.62245    -0.40   0.688    -22.74071    15.00245
             |
ln_pop_his~l |
         D1. |  -6.352615   1.729689    -3.67   0.000    -9.744887   -2.960343
             |
ln_pop_nh_~k |
         D1. |   7.426661   3.028961     2.45   0.014     1.486254    13.36707
             |
ln_pop_nh_~e |
         D1. |  -5.481275   3.728017    -1.47   0.142    -12.79267    1.830123
             |
           z |
         D1. |   3.404714   .2067276    16.47   0.000     2.999279    3.810149
             |
       _cons |   5.682521   .0508073   111.84   0.000     5.582877    5.782164
------------------------------------------------------------------------------


With xtivreg it is


. xtivreg y (x = z) ///
    l.ln_crime ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ln_fraction_male ln_pop_hisp_all ///
    ln_pop_nh_black ln_pop_nh_white, fd small first

First-stage first-differenced regression

      Source |       SS       df       MS              Number of obs =     901
-------------+------------------------------           F( 11,   889) =   17.42
       Model |  58.4561519    11  5.31419563           Prob > F      =  0.0000
    Residual |  271.136523   889  .304990465           R-squared     =  0.1774
-------------+------------------------------           Adj R-squared =  0.1672
       Total |  329.592675   900  .366214084           Root MSE      =  .55226

------------------------------------------------------------------------------
D.           |
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ln_bzo |
         LD. |  -.0553261   .0661387    -0.84   0.403    -.1851322    .0744801
             |
 ln_tunder15 |
         D1. |     14.637   2.988974     4.90   0.000     8.770729    20.50326
             |
  ln_t15to24 |
         D1. |   12.06724   2.274707     5.30   0.000     7.602818    16.53166
             |
  ln_t25to44 |
         D1. |   28.67612   3.411086     8.41   0.000      21.9814    35.37084
             |
  ln_t45to64 |
         D1. |   9.750316   3.753647     2.60   0.010     2.383274    17.11736
             |
  ln_t65plus |
         D1. |   9.117776   2.449422     3.72   0.000     4.310454     13.9251
             |
ln_fractio~e |
         D1. |   20.81802   9.297503     2.24   0.025     2.570409    39.06564
             |
ln_pop_his~l |
         D1. |  -3.244084   1.379805    -2.35   0.019     -5.95214   -.5360282
             |
ln_pop_nh_~k |
         D1. |  -2.669608   2.386585    -1.12   0.264    -7.353605     2.01439
             |
ln_pop_nh_~e |
         D1. |  -18.12162   4.160963    -4.36   0.000    -26.28808   -9.955169
             |
           z |
         D1. |  -1.102999   .2832308    -3.89   0.000    -1.658877   -.5471196
             |
       _cons |   6.306457   .0505186   124.83   0.000     6.207308    6.405607
------------------------------------------------------------------------------



The first-stage sample size in xtivreg is less than half that in xtivreg2 (901 versus 1928 observations). What is particularly odd is that the 2nd stage coefficients are very close, and the reported observations and groups are now 1926 and 1025 for xtivreg, so just one group fewer than xtivreg2. Is it perhaps some reporting issue? I really don't understand this. Just so I'm clear: the large difference, especially in the coefficient on the instrument, is in the first stage, while the second stages are very similar.
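
One way to narrow down where the observations go is to count, outside either command, how many observations are usable once every variable has been differenced. The lines below are only a sketch: they assume the data are already xtset, they use the variable names as they appear in the posted output (x, z, ln_bzo and the covariates), and fs_ok and main_ok are made-up names for the indicator variables.

```stata
* Sketch only: assumes the panel is already xtset.
* fs_ok and main_ok are hypothetical variable names.

* Observations with every first-stage variable non-missing after differencing.
* With the fd option, L.ln_bzo enters as LD.ln_bzo.
generate byte fs_ok = !missing(D.x, D.z, LD.ln_bzo, D.ln_tunder15, ///
    D.ln_t15to24, D.ln_t25to44, D.ln_t45to64, D.ln_t65plus, ///
    D.ln_fraction_male, D.ln_pop_hisp_all, D.ln_pop_nh_black, ///
    D.ln_pop_nh_white)

* The full model also needs the differenced dependent variable.
generate byte main_ok = fs_ok & !missing(D.y)

count if fs_ok      // compare with the first-stage "Number of obs"
count if main_ok    // compare with the observations reported for the main equation
```

If the two counts are close to 1928 and far from 901, the smaller first-stage sample is not explained by missing values in these variables alone.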

3)

>This is confusing.  Do you mean that w_it and q_it are correlated with
>x_it?  That's not a problem.  The key requirement is that in
>
>y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
>
>w_it and q_it should be uncorrelated with e_it. 

Yes, w_it and q_it are correlated with x_it and with e_it. 

>If they are correlated, then you have two options: (1) Add w and q to your list of endogenous
>variables.  But as you say, you will need instruments for them.  And if
>you aren't interested in a causal interpretation, then maybe you
>shouldn't bother.  (2)  Instead of using w and q as regressors and
>instrumenting them, insert the instruments as (exogenous) regressors.

I am not interested in instrumenting them, just in conditioning on them. My problem is that once I add them, the previously very high first-stage F-statistic falls to the point of indicating weak instruments. That is basically my concern. Concerning your second suggestion, do you mean that I should just drop w and q from the model and replace them with their instruments?

Thanks again for your help, it is very much appreciated.

Best,

Filippos

>>> "Schaffer, Mark E" <[email protected]> 06/11/12 7:02 AM >>>
Filippos,

You need to tell us more - what versions of software you are using, what
the actual output is (or the relevant pieces of the output), etc.

More comments below.

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> Filippos Petroulakis
> Sent: Sunday, June 10, 2012 11:59 PM
> To: [email protected]
> Subject: st: First stage of panel IV
> 
> Hi all,
> 
> I am running a panel first differences (fixed effects) model. 
> My regression is of the sort
> 
> y_it=b_0+b_1 x_it + b_k demographic_covariates + e_it
> 
> x_it is endogenous so I have an instrument z_it.
> 
> I essentially have 3 issues and I list them in descending 
> order of importance.
> 
> 1) xtivreg2 is running the first stage of x_it on z_it and 
> the exogenous demographic covariates as OLS instead of fixed 
> effects or first differences.

I doubt it very much (having programmed -xtivreg2-, I think I'm well
placed to say this!).  -xtivreg2- follows the standard procedure of
transforming the full set of variables used in the same way, i.e., the
within or between transformation is applied to all variables.
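
A way to check this on the data in question is to run the first stage by hand on explicitly differenced variables and compare the coefficients with the first-stage output of -xtivreg2, fd first-. The following is a sketch rather than a definitive recipe: it assumes the data are already xtset, it uses the variable names from the posted output, and it restricts the sample to observations where D.y is also available as an approximation to the main estimation sample.

```stata
* Sketch only: assumes the panel is already xtset.
* OLS of the differenced endogenous regressor on the differenced instrument
* and the differenced exogenous regressors (L.ln_bzo enters as LD.ln_bzo).
regress D.x D.z LD.ln_bzo ///
    D.(ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus) ///
    D.(ln_fraction_male ln_pop_hisp_all ln_pop_nh_black ln_pop_nh_white) ///
    if !missing(D.y), vce(robust)

* Number of observations used, for comparison with the first-stage header.
display "first-stage N = " e(N)
```

If the coefficients agree with the -xtivreg2- first stage, the first stage is being run on the first-differenced variables and not on the levels.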

> I honestly do not know whether 
> this is due to theory but it seems to be very odd, especially 
> given the fixed effects is definitely the correct 
> specification for the model as a whole, and so I would think 
> it has to be the case for the first stage as well. I can do 
> the 2 stages manually and correct the errors using the 
> process outlined here 
> (http://www.stata.com/support/faqs/stat/ivreg.html) and I 
> presume the fact that I have a panel doesn't change much, but 
> my issue is basically whether this is the correct thing to do.
> 
> 2) xtivreg and xtivreg2 give me pretty different results, 
> which is due to the fact that xtivreg drops about half of the 
> observations in the first stage. I checked and the variable 
> that is causing the dropping (for whatever reason) is the 
> dependent variable. I am thus positive that xtivreg is the 
> wrong one but am still worried. Does anyone know why this happens?

Again, I doubt the problem is the one you suspect.  My guess is that
you are using different estimators, e.g., fixed effects with
-xtivreg2- and random effects with -xtivreg-.  But you need to show us
the output.

> 3) Finally, at some point I will need to include a further 
> two variables, call them w_it and q_it, which are surely 
> endogenous. I don't care about instrumenting for them as I am 
> not interested in a causal interpretation, but the problem is 
> that they are also endogenous to x_it. So my first stage will 
> be a regression of x_it on variables that are endogenous to itself 
> and to y_it. Is that an issue I should be concerned about?

This is confusing.  Do you mean that w_it and q_it are correlated with
x_it?  That's not a problem.  The key requirement is that in

y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,

w_it and q_it should be uncorrelated with e_it.  If they are correlated,
then you have two options: (1) Add w and q to your list of endogenous
variables.  But as you say, you will need instruments for them.  And if
you aren't interested in a causal interpretation, then maybe you
shouldn't bother.  (2)  Instead of using w and q as regressors and
instrumenting them, insert the instruments as (exogenous) regressors.
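
In -xtivreg2- syntax the two options might look roughly as follows. This is a sketch under stated assumptions: zw and zq are hypothetical names for instruments for w and q (no such variables are named in this thread), the local macro controls stands in for the demographic covariates, and any other exogenous regressors from the original command would need to be added.

```stata
* Sketch only: assumes the panel is already xtset.
* zw and zq are HYPOTHETICAL instruments for w and q.
* Run these lines together from a do-file so the local macro is defined.
local controls ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus ///
    ln_fraction_male ln_pop_hisp_all ln_pop_nh_black ln_pop_nh_white

* Option (1): treat w and q as endogenous alongside x and instrument all three.
xtivreg2 y (x w q = z zw zq) `controls', fd small first robust

* Option (2): leave w and q out of the equation and enter their instruments
* directly as exogenous regressors.
xtivreg2 y (x = z) zw zq `controls', fd small first robust
```

Option (1) needs at least as many excluded instruments as endogenous regressors, which is why three instruments appear on the right-hand side of the parenthesis.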

HTH,
Mark

> Thank you very much in advance - answers to any or all of 
> those issues will be immensely appreciated.
> 
> Best,
> 
> Filippos Petroulakis 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



