Mark,
I rechecked my comments and found that you are right: GLS is still a
consistent (but inefficient and noisy) estimator under het./autocorrelation,
so your solution is valid. But let me compare the procedures in words: (1)
yours controls for the RE (with the wrong weights), then applies IV to
obtain the third-round coefficients and uses robust std. errors, and (2)
mine stops HT at step 2 and uses robust std. errors (keeping in mind that
the IV coefficients were obtained in a second round). After our discussion,
both procedures are consistent (though neither is efficient under
het./autocorrelation), but numerically they will give us different results
for the coefficients as well as the std. errors. Very interesting. I have 2
more comments about your procedure: (1) it needs (just as HT does) some
(extra) exogeneity in the time-varying variables (to do the last IV
procedure), and (2) it adds some extra noise to the variables by computing
the GLS factor with the wrong weights.
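To fix ideas, here is a minimal Stata sketch of the comparison. All variable
names are hypothetical (x1 exogenous and x2 endogenous time-varying, z1
exogenous and z2 endogenous time-invariant), and -xthtaylor2- is the patched
command Mark describes further down the thread:

  * (1) GLS transform (with the "wrong" weights) + IV, cluster-robust SEs
  xthtaylor2 y x1 x2 z1 z2, endog(x2 z2)

  * (2) stop HT at step 2: within estimates plus a between-type IV step
  *     with robust SEs (a by-hand sketch appears later in the thread)

Running both would show the numerical differences in coefficients and SEs
directly.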
Rodrigo.
PS: We can continue the discussion off the list if you want.
----- Original Message -----
From: "Schaffer, Mark E" <[email protected]>
To: <[email protected]>
Sent: Tuesday, May 02, 2006 12:35 PM
Subject: RE: st: RE: Hausman taylor
Rodrigo,
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Rodrigo A. Alfaro
> Sent: 02 May 2006 16:12
> To: [email protected]
> Subject: Re: st: RE: Hausman taylor
>
> Mark,
>
> This is a very interesting discussion. My point is that under
> autocorrelation and/or heteroskedasticity you cannot generate a
> consistent estimator of the variance of the error term;
> therefore the GLS transformation applied in the last step of the
> original HT is wrong. For this reason, I cannot see how the
> coefficients of the modified HT can be consistent, given that
> your suggestion still uses the wrong GLS
> transformation.
I agree, this is interesting. But I am pretty sure that the HT coefficients
are consistent in the presence of het. or AC. Here are two reasons:
1. The GLS estimator is a weighted average of the within and between
estimators (HT, p. 1381). A weighted average of two consistent estimators
will be consistent (except perhaps in special cases constructed by
specialists, i.e., not me).
2. In the standard random effects estimator, in the presence of het./AC,
you also cannot obtain a consistent estimator for the variance of the error
term - just as you say for HT. The GLS transform applied to get the random
effects estimator is therefore "wrong" - but only in the sense that it isn't
an *efficient* estimator. It's still consistent. That's why various
textbooks (e.g., Wooldridge 2002) point out that one can use the
cluster-robust covariance estimate to get consistent SEs for the random
effects estimator even in the presence of het./AC. The same argument should
[sic!] apply to HT, no?
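To make point 2 concrete, here is a minimal Stata sketch (variables y x1 x2
and panel identifier id are hypothetical, and it assumes your version of
-xtreg- accepts a cluster() or vce(cluster) option):

  xtreg y x1 x2, re              // RE-GLS: coefficients consistent,
                                 //   default SEs not robust to het./AC
  xtreg y x1 x2, re cluster(id)  // same point estimates, cluster-robust SEs

The cluster option changes only the covariance estimate, not the
coefficients - which is exactly the sense in which the GLS transform is
"wrong" but harmless.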
> Mind that the original GLS transformation uses the variance of the
> residual as a scalar, and now it is an unknown matrix.
>
> As I said earlier, the coefficients from the previous steps are
> consistent, but inefficient. Indeed, section 2.3 of the paper is called
> "Consistent but Inefficient Estimation". I think that Julia's problem
> can be solved by keeping the FE (time-varying variables) and IV
> (time-invariant variables) coefficients and generating a non-parametric
> std. error, as the Newey-West procedure does.
This is a good idea. Another way to put it would be to say that the last
step of HT generates efficient estimators of the coefficients only under
homoskedasticity. If this assumption fails, then HT is consistent but not
efficient (my point above). In that case, the GLS step of HT loses its
main attraction, so why bother with it - just stop at the previous stage,
with the within and between estimators. Julia can do this by hand.
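Here is a minimal sketch of the by-hand route (all names hypothetical: x1
exogenous and x2 endogenous time-varying, z1 exogenous and z2 endogenous
time-invariant, panel identifier id; as you note below, the second-step SEs
ignore the fact that the regressand is generated from the first step):

  areg y x1 x2, absorb(id) cluster(id)    // step 1: within (FE), robust SEs
  predict double d_i, d                   // estimated group effects:
                                          //   Z*gamma + u_i + noise
  egen double x1bar = mean(x1), by(id)    // means of exogenous time-varying X1
  ivreg d_i z1 (z2 = x1bar), cluster(id)  // step 2: IV for the time-invariant
                                          //   coefficients, robust SEs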
Cheers,
Mark
> The complication is that the std. error for the IV step will be
> downward biased if you don't apply a correction for the fact that the
> left-hand-side variable used in the second step is generated using the
> FE coefficients.
>
> In addition, other things could be causing Julia's results: (1) the HT
> procedure leaves it to the researcher's discretion which variables to
> treat as endogenous, so one can try different specifications based on
> the theoretical support of the model; (2) HT requires that the
> instruments be correlated with the endogenous Z2 subset, and additional
> instruments could improve the results if that correlation is low, but
> one would need to write the program (along the lines of Mark's
> suggestion) to allow more variables; and (3) some time-varying
> variables have only slow time variation. One could check whether this
> is your case using -xtsum- and looking at the within and between std.
> deviations. I don't know how to solve the last problem; if someone
> knows a reference I would appreciate it.
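> For example (hypothetical variable names, panel identifier already set):
>
>   . xtsum x1 x2 z1 z2
>
> A time-varying variable whose within std. deviation is very small
> relative to its between std. deviation is close to time-invariant in
> practice.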
>
> Rodrigo.
>
>
> ----- Original Message -----
> From: "Schaffer, Mark E" <[email protected]>
> To: <[email protected]>
> Sent: Tuesday, May 02, 2006 9:53 AM
> Subject: RE: st: RE: Hausman taylor
>
>
> Rodrigo,
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of
> > Rodrigo A. Alfaro
> > Sent: 02 May 2006 14:31
> > To: [email protected]
> > Subject: Re: st: RE: Hausman taylor
> >
> > Dear Mark and Julia,
> >
> > HT does not generate consistent estimators in the presence of
> > autocorrelation and/or heteroskedasticity. Section 2.3 of
> > the paper gives the consistency analysis. As you can see, the
> > consistent std. errors are based on the homoskedastic case.
>
> The key question is the consistency of the coefficient estimates. The
> SEs are, of course, inconsistent - I think that's why Julia is
> interested in dealing with the heteroskedasticity and AC problem. But
> if the coefficient estimates are consistent, then the procedure I
> outlined below ought to generate consistent SEs.
>
> It's just like standard random effects GLS. The coefficient
> estimates are
> consistent but inefficient in the presence of heteroskedasticity or
> autocorrelation, but the usual GLS SEs are not consistent. Using
> cluster-robust SEs then solves the problem of obtaining
> consistent standard
> errors for a standard random effects estimation.
>
> I *think* the same reasoning carries through with HT....
>
> --Mark
>
> >
> > In other words, you have to work with the fixed-effects estimators
> > and the IV between-effects estimators, steps 1 and 2. The goal is to
> > build a HAC variance for these estimators. Note that the IV estimates
> > were generated using the FE results, so the variance has to control
> > for that.
> >
> > Best, Rodrigo.
> >
> >
> > ----- Original Message -----
> > From: "Schaffer, Mark E" <[email protected]>
> > To: <[email protected]>
> > Sent: Monday, May 01, 2006 5:47 PM
> > Subject: RE: st: RE: Hausman taylor
> >
> >
> > Julia,
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of
> > Julia Spies
> > > Sent: 29 April 2006 10:20
> > > To: [email protected]
> > > Subject: RE: st: RE: Hausman taylor
> > >
> > > Sorry, what I meant was that the overid test stat is not
> > > significant and that running a Hausman test to compare HT with GLS
> > > is significant. I just mixed it up. Apologies!
> > >
> > > Julia
> > >
> > > > --- Original Message ---
> > > > From: "Schaffer, Mark E" <[email protected]>
> > > > To: <[email protected]>
> > > > Subject: RE: st: RE: Hausman taylor
> > > > Date: Sat, 29 Apr 2006 07:39:07 +0100
> > > >
> > > > Julia,
> > > >
> > > > > -----Original Message-----
> > > > > From: [email protected]
> > > > > [mailto:[email protected]] On
> Behalf Of Julia
> > > > > Spies
> > > > > Sent: 28 April 2006 23:51
> > > > > To: [email protected]
> > > > > Subject: Re: st: RE: Hausman taylor
> > > > >
> > > > > Dear Mark,
> > > > >
> > > > > with "improving the model" I mean that the over-identification
> > > > > test statistic comparing the FE model (I use areg with the
> > > > > cluster() option, since I identified autocorr. and
> > > > > heteroskedasticity) with the HT estimation is significant,
> > > > > which means - if I understand it correctly - that the
> > > > > correlation between the explanatory variables and the
> > > > > individual effects has been removed by the instrumentation.
> > > >
> > > > Apologies if I am misunderstanding what you have reported, but
> > > > it's the other way around. A large and significant overid stat is
> > > > evidence AGAINST your HT estimate. As usual with IV estimation,
> > > > under the null that the orthogonality conditions are satisfied
> > > > (the instruments are "valid"), the overid stat is distributed as
> > > > chi-sq. A big stat and a rejection of the null suggest that your
> > > > orthogonality conditions are not satisfied, i.e., the instruments
> > > > are not valid, i.e., your HT estimation is misspecified.
> > > >
> > > > --Mark
> > > >
> > > > > Of course, since I have the odd parameter estimates in the
> > > > > instrumented time-invariant variables (which cannot be
> > > > > estimated in the FE model), they don't enter the
> > > > > over-identification test.
> >
> > I'm not sure this is quite right. Hausman and Taylor (1981, p. 1389)
> > say that "_all_ of the exogeneity information about X and Z is
> > subject to test by this procedure" [emphasis in the original],
> > meaning the overid test they give in their equation (2.2). Even
> > though the time-invariant variables aren't used to calculate the test
> > statistic, all the orthogonality conditions are part of the null, or
> > so they say.
> >
> > > > > My question therefore was whether autocorr. and
> > > > > heteroskedasticity could produce these very high estimates or
> > > > > whether someone could think of any other source for the
> > > > > problem, and how I can correct for it in the HT estimation.
> >
> > I am not sure, but the HT estimation may generate consistent
> > parameter
> > estimates even in the presence of autocorrelation and
> > heteroskedasticity,
> > and the problem may be that the var-cov estimate is wrong.
> > This needs
> > checking, but if so, then you could address the problem by using
> > cluster-robust standard errors. This would give you SEs that
> > are robust to
> > arbitrary autocorrelation and heteroskedasticity.
> >
> > Unfortunately, -xthtaylor- doesn't support the -cluster- option. This
> > might be deliberate (i.e., the Stata programmers know that HT won't
> > generate consistent parameter estimates in the presence of AC or
> > het), or it might not. If not, then you could consider making a copy
> > of -xthtaylor- (call it, say, -xthtaylor2-) and editing it so that it
> > forces cluster-robust standard errors. The way to do this is to go to
> > the block that says
> >
> > /* Hausman-Taylor estimator */
> >
> > A few lines under that is a call to regress, using the
> > old-fashioned syntax
> > (IVs in parentheses) for an IV estimation. You would add a
> > cluster option
> > to that line. It's currently
> >
> > reg `yvar_g' `list_g' `g_cons' /*
> > */ (`xvar1_dm' `xvar2_dm' /*
> > */ `xvar1_m' `zvar1' `g_cons') `wtopt' /*
> > */ if `touse', nocons
> >
> > and you would change this to
> >
> > reg `yvar_g' `list_g' `g_cons' /*
> > */ (`xvar1_dm' `xvar2_dm' /*
> > */ `xvar1_m' `zvar1' `g_cons') `wtopt' /*
> > */ if `touse', nocons cluster(`ivar')
> > ^^^^^^^^^^^^^^^
> > ^add this bit^
> >
> > You would also need to change the line at the top of the file from
> >
> > program xthtaylor, eclass byable(recall) sort
> >
> > to
> >
> > program xthtaylor2, eclass byable(recall) sort
> >
> > Might work. Worth a thought, anyway.
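> > A hypothetical call, once the edited file is saved as xthtaylor2.ado
> > somewhere on the adopath (variable names made up, with x2 and z2 the
> > endogenous ones):
> >
> > xthtaylor2 y x1 x2 z1 z2, endog(x2 z2)
> >
> > The point estimates should match -xthtaylor-; only the SEs would
> > change.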
> >
> > HTH.
> >
> > Cheers,
> > Mark
> >
> > > > > Sorry for not making my point clear in the first e-mail. I will
> > > > > definitely try out Rodrigo's suggestions. Thank you very much
> > > > > for the advice!
> > > > >
> > > > > Best regards,
> > > > > Julia
> > > > >
> > > > >
> > > > > > --- Original Message ---
> > > > > > From: "Schaffer, Mark E" <[email protected]>
> > > > > > To: <[email protected]>
> > > > > > Subject: st: RE: Hausman taylor
> > > > > > Date: Fri, 28 Apr 2006 22:51:29 +0100
> > > > > >
> > > > > > Julia,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: [email protected]
> > > > > > > [mailto:[email protected]] On
> > > Behalf Of Julia
> > > > > > > Spies
> > > > > > > Sent: 28 April 2006 12:48
> > > > > > > To: [email protected]
> > > > > > > Subject: st: Hausman taylor
> > > > > > >
> > > > > > > Dear all,
> > > > > > >
> > > > > > > I'm quite a beginner with Stata and I'm trying to run a
> > > > > > > Hausman-Taylor regression. However, taking some (plausible)
> > > > > > > time-invariant variables as endogenous results in
> > > > > > > outrageous parameter estimates for these variables.
> > > > > > > Nevertheless, the over-identification test suggests that
> > > > > > > instrumenting these variables has improved the model.
> > > > > >
> > > > > > This sounds odd ... what do you mean by "improving the
> > > > > > model"?
> > > > > >
> > > > > > --Mark
> > > > > >
> > > > > > > Does anyone have an idea what the problem could be? I
> > > > > > > understand there is no option to correct for
> > > > > > > heteroskedasticity and autocorrelation.
> > > > > > > Does anyone know how to do it manually?
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Julia