Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Regression with about 5000 (dummy) variables
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Regression with about 5000 (dummy) variables
Date: Thu, 19 Apr 2012 11:16:29 -0400
John Antonakis <[email protected]>:
The poster asked about multiple dimensions of fixed effects. How does
the advice below relate to that?
The approach shown actually adds to the size of the matrix to be inverted,
since it adds one cluster-mean regressor for each time-varying x.
You assert that
"This will save you on degrees of freedom and computational requirements."
Can you clarify that claim?
Your
xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)
gives nearly the same coefficients on x1-x4 as
xtreg y x1-x4, fe robust
right? Note that inference is not identical, as the RE estimator
does not "know" the means are estimated.
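To make the comparison concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data-generating process, the panel dimensions, and the variable names (id, t, u, x1, x2, y) are made up and are not from the original poster's data. It only sets the two estimators side by side; it says nothing about the case with multiple dimensions of fixed effects.

```stata
* Sketch only: simulated balanced panel, hypothetical variable names
clear
set seed 12345
set obs 200
gen id = _n
gen u = rnormal()                  // unobserved unit effect
expand 5                           // 5 periods per unit
bysort id: gen t = _n
xtset id t

gen x1 = rnormal() + 0.5*u         // regressor correlated with the unit effect
gen x2 = rnormal()
gen y  = 1 + 2*x1 - x2 + u + rnormal()

* Cluster means of the time-varying regressors (the Mundlak terms)
foreach var of varlist x1 x2 {
    bysort id: egen cl_`var' = mean(`var')
}

* RE with cluster means, then a joint test of the mean terms
xtreg y x1 x2 cl_x1 cl_x2, re vce(cluster id)
testparm cl_x1 cl_x2

* FE for comparison: look at the coefficients on x1 and x2
xtreg y x1 x2, fe vce(cluster id)
```

In a balanced panel like this one, the coefficients on x1 and x2 should agree across the two models, while the reported standard errors and degrees of freedom can differ. With unbalanced panels I would expect the match to be only approximate.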
On Thu, Apr 19, 2012 at 10:57 AM, John Antonakis <[email protected]> wrote:
> Hi:
>
> Let me share a trick that is relatively unknown.
>
> One way around the problem of a huge number of dummy variables is to use the
> Mundlak procedure:
>
> Mundlak, Y. (1978). Pooling of Time-Series and Cross-Section Data.
> Econometrica, 46(1), 69-85.
>
> For an intuitive explanation, see:
>
> Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making
> causal claims: A review and recommendations. The Leadership Quarterly,
> 21(6), 1086-1120. http://www.hec.unil.ch/jantonakis/Causal_Claims.pdf
>
> Basically, for each time-varying independent variable (x1-x4), take the
> cluster mean and include that in the regression. That is, do:
>
> foreach var of varlist x1-x4 {
>     bys panelvar: egen cl_`var' = mean(`var')
> }
>
> Then, run your regression like this:
>
> xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)
>
> The Hausman test for fixed- versus random-effects is:
>
> testparm cl_x1-cl_x4
>
> This will save you on degrees of freedom and computational requirements.
> This estimator is consistent. Try it out with a subsample of your dataset
> to see. Many econometricians have been amazed by this.