Hi David and Vince,
Thanks for your insights and helpful comments. This was a good learning
experience.
Anirban
______________________________________
ANIRBAN BASU
Doctoral Student
Harris School of Public Policy Studies
University of Chicago
(312) 563 0907 (H)
________________________________________________________________
On Wed, 26 Jun 2002, Vince Wiggins, StataCorp wrote:
> I have one additional comment in the continuing thread comparing the results
> of -regress-, -xtreg, fe-, and -xtreg, re-.
>
> While I agree with the comparisons between the models presented by Mark
> Schaffer <[email protected]> and David Drukker <[email protected]>, there
> is a more mundane reason why the example presented by Anirban Basu
> <[email protected]> elicits virtually identical estimates from
> -regress-, -xtreg, fe-, and -xtreg, re-. The short answer is they have to be
> identical, at least to machine precision of the computations.
>
> Anirban Basu asks us to generate data in the following manner,
>
> . mat C= (1, 0.6, 0.6, 0.6 \ 0.6, 1, 0.6, 0.6 \ 0.6, 0.6, 1, 0.6 \ /*
> */ 0.6, 0.6, 0.6, 1)
> . drawnorm y1 y2 y3 y4, n(1000) means(1 3 4 7) corr(C)
> . gen id=_n
> . reshape long y , i(id) j(time)
>
> Anirban is using -drawnorm- to create 4 correlated variables and then
> -reshape- to turn these into panel data with 4 values of a single variable y
> per panel. This is a fine way to create data with a random effect. Here are
> the first three panels:
>
> . list in 1/12
>
>            id   time           y
>   1.        1      1   -.0939699
>   2.        1      2    2.265574
>   3.        1      3    2.323656
>   4.        1      4    6.053069
>   5.        2      1    1.367081
>   6.        2      2    3.062155
>   7.        2      3    4.830178
>   8.        2      4    7.105754
>   9.        3      1    1.145398
>  10.        3      2    4.087784
>  11.        3      3     3.99791
>  12.        3      4    6.942679
>
>
> Anirban then asks us to try the OLS, fixed-effects, and random-effects
> estimators on this data by typing,
>
> . regress y time
>
> . xtreg y time , i(id) fe
> and,
> . xtreg y time , i(id) re
>
> What is unusual about this model is that we are including -time- as a
> regressor. Note that we have perfectly balanced panels of 4 observations
> each, and that the variable -time- exactly repeats itself -- counting 1, 2, 3,
> 4 in each panel.
>
> What does this mean for the fixed-effects (FE) transformation? The FE
> transformation just subtracts the panel mean for each variable (dependent and
> independent) from each value. The panel mean for time is 2.5 in every panel.
> This means the FE transformation just subtracts a constant value from
> -time-. Subtracting a constant from a regressor does not have any effect on
> its estimated coefficient.
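
A minimal do-file sketch of this step, in case it helps to see it run. It
rebuilds the example data and applies the FE transformation to -time- by hand.
The seed, the means matrix m, and the names time_bar, time_w, b_raw and
b_shift are my own choices for illustration, not part of the original example.

```stata
* Rebuild the example data (seed is arbitrary, chosen only for reproducibility)
clear
set seed 12345
matrix C = (1, .6, .6, .6 \ .6, 1, .6, .6 \ .6, .6, 1, .6 \ .6, .6, .6, 1)
matrix m = (1, 3, 4, 7)
drawnorm y1 y2 y3 y4, n(1000) means(m) corr(C)
generate id = _n
reshape long y, i(id) j(time)

* Panel mean of time: it should be 2.5 in every panel
bysort id: egen double time_bar = mean(time)
summarize time_bar
assert time_bar == 2.5

* FE-transformed time is just time minus a constant
generate double time_w = time - time_bar

* The slope on time is unchanged by subtracting that constant
regress y time
scalar b_raw = _b[time]
regress y time_w
scalar b_shift = _b[time_w]
display "difference in slopes: " b_raw - b_shift
```

Only the intercept should move between the two regressions; the displayed
difference in slopes should be zero up to rounding.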
>
> But wait, we also subtracted the panel means from the dependent variable y and
> those means were not the same for each panel. As it turns out, when panels
> are balanced, regressing a variable on its own FE-transformed version gives a
> regression coefficient of exactly 1. Thus, the relationship between y and a
> variable that has effectively not been transformed (like -time-, which had
> only a constant subtracted) remains exactly the same.
>
> So, with only a single independent variable that repeats exactly in each
> balanced panel, OLS and fixed-effects regression will produce the same
> estimate of the coefficient on the regressor (within machine tolerance of the
> different computations performed).
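
For anyone who wants to check this claim directly, here is a small sketch
that compares the two point estimates on freshly simulated data. The seed,
the matrix m, and the scalar names b_ols and b_fe are assumptions of mine.

```stata
* Rebuild the example data (seed is arbitrary)
clear
set seed 12345
matrix C = (1, .6, .6, .6 \ .6, 1, .6, .6 \ .6, .6, 1, .6 \ .6, .6, .6, 1)
matrix m = (1, 3, 4, 7)
drawnorm y1 y2 y3 y4, n(1000) means(m) corr(C)
generate id = _n
reshape long y, i(id) j(time)

* OLS coefficient on time
regress y time
scalar b_ols = _b[time]

* Fixed-effects coefficient on time
xtreg y time, i(id) fe
scalar b_fe = _b[time]

display "OLS: " b_ols "   FE: " b_fe "   difference: " b_ols - b_fe
```

The comparison is about the coefficient on -time- only. The standard errors
need not agree, because the two estimators use different residual variances.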
>
> Side-note: While I was aware of the behaviour of variables that repeat within
> panel for balanced panels, I hadn't previously considered why the FE
> transformation of the dependent variable has no effect. A little scribbling
> on the white board from Bobby Gutierrez <[email protected]> shows that when
> the FE transformation is expressed in matrix form it is idempotent for balanced
> panels. That causes the transformation to essentially fall out of the
> regression of y on transformed y, leaving a coefficient of 1.
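
The coefficient of 1 is easy to check numerically. The sketch below demeans
y within panel by hand and regresses y on the result. The variable names
y_bar and y_w, the scalar b_yw, the matrix m, and the seed are hypothetical
names I picked for this illustration.

```stata
* Rebuild the example data (seed is arbitrary)
clear
set seed 12345
matrix C = (1, .6, .6, .6 \ .6, 1, .6, .6 \ .6, .6, 1, .6 \ .6, .6, .6, 1)
matrix m = (1, 3, 4, 7)
drawnorm y1 y2 y3 y4, n(1000) means(m) corr(C)
generate id = _n
reshape long y, i(id) j(time)

* FE transformation of y: subtract the panel mean
bysort id: egen double y_bar = mean(y)
generate double y_w = y - y_bar

* Regress untransformed y on transformed y; the slope should be 1
regress y y_w
scalar b_yw = _b[y_w]
display "slope of y on transformed y: " b_yw
```

Note the direction: it is y regressed on transformed y that gives 1. The
reverse regression gives the within share of the total variation in y, which
is generally less than 1.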
>
> What about the random-effects (RE) estimator? The GLS random-effects
> estimator is just a matrix-weighted combination of the FE estimator and the
> between-effects (BE) estimator. The BE estimator is a regression on the
> panel-level means of each variable (again, dependent and independent). As we
> saw above, the panel-level mean for -time- is a constant 2.5 in every panel
> and thus is collinear with the constant. This means that the between
> estimator cannot estimate B_time and provides no additional information for
> this coefficient. It has no contribution to the RE estimator. So, the RE
> estimator must be identical to the FE estimator in a model with a single
> covariate that repeats exactly within each balanced panel.
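
A rough sketch of both halves of this argument: first that RE and FE give the
same coefficient on -time-, then that the panel means used by the between
estimator carry no variation in -time-. I do the between step by hand with
-collapse- rather than relying on how -xtreg, be- reports the collinearity.
The seed, the matrix m, and the scalar names are my assumptions.

```stata
* Rebuild the example data (seed is arbitrary)
clear
set seed 12345
matrix C = (1, .6, .6, .6 \ .6, 1, .6, .6 \ .6, .6, 1, .6 \ .6, .6, .6, 1)
matrix m = (1, 3, 4, 7)
drawnorm y1 y2 y3 y4, n(1000) means(m) corr(C)
generate id = _n
reshape long y, i(id) j(time)

* FE and RE coefficients on time
xtreg y time, i(id) fe
scalar b_fe = _b[time]
xtreg y time, i(id) re
scalar b_re = _b[time]
display "FE: " b_fe "   RE: " b_re "   difference: " b_fe - b_re

* Between step by hand: one observation per panel holding the panel means
preserve
collapse (mean) y time, by(id)
summarize time
scalar sd_time_bar = r(sd)
display "std. dev. of panel-mean time: " sd_time_bar
restore
```

A standard deviation of zero for the panel-mean of -time- is the collinearity
with the constant described above, so the between regression has nothing to
say about B_time.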
>
>
> -- Vince
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>