Sandra Kalev asked why she was getting different standard errors for parameter
estimates with -xtreg, fe- and -areg, absorb()-. She also asked about differences in
methods between -xtreg, fe-, -areg, absorb()- and -regress- with dummy (indicator)
variables for units.
----------begin excerpt from Sandra's post-------------
I am analyzing pooled data of organizations' workforce composition, using
STATA 7. I need to run robust fixed effects regression. I am thinking of
using AREG for that.
I have two - related - questions:
1. Why AREG and XTREG (FE) don't produce the same results? The
standard errors produced by AREG are smaller than those produced by XTREG.
2. What is the difference between running fixed effects using dummies for
each unit (e.g. xi: regress depvar indepvars i.org_name) and running fixed
effects using differences from the mean (as i think XTREG, FE does)?. Is
there a substantive difference or is it just about playing around with the
algebra? What does AREG use?
------------end excerpt--------------
I don't know why Sandra gets different standard errors, but it might be due to subtle
inconsistencies in the way she has set up the model in each case. Without the robust
option, the two commands produce identical coefficients and standard errors in two
simulated datasets: one with no within-unit correlation and one with a within-unit
correlation of about 0.7. (See the do-file below.)
As to the second question, I believe that StataCorp has a FAQ on this. Fitting an OLS
regression with dummy (indicator) variables for the units gives the same results as
either -areg, absorb()- or -xtreg, fe-. Taking differences from the unit means, as
-xtreg, fe- does, is algebraically equivalent to including the dummies, so I don't think
there is a substantive difference among the three commands. I believe that all three
use essentially the same method, which is the same method as randomized-blocks ANOVA
(-anova depvar indvar blocks-).
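A minimal sketch of the "differences from the mean" calculation that Sandra asks
about, assuming the long-format data built in the do-file below (dep, tim, org_name;
40 units by 3 time points) are in memory. The m_ and w_ variable names are made up for
illustration. The slopes from the final -regress- should match those from -xtreg, fe-,
but its standard errors are too small as printed, because -regress- does not know that
40 unit means were removed. The -display- lines rescale them to the residual degrees
of freedom that the dummy-variable regression uses.
----------begin sketch----------
* create the time indicators _Itim_2 and _Itim_3
xi i.tim
* subtract each unit's mean from the outcome and from each indicator
foreach v of varlist dep _Itim_2 _Itim_3 {
    egen double m_`v' = mean(`v'), by(org_name)
    generate double w_`v' = `v' - m_`v'
}
* OLS on the demeaned data; no constant, because every mean is now zero
regress w_dep w__Itim_2 w__Itim_3, noconstant
* naive residual df is N - 2; the correct df is N - 40 - 2 (40 units assumed)
display _se[w__Itim_2]*sqrt(e(df_r)/(e(N) - 40 - 2))
display _se[w__Itim_3]*sqrt(e(df_r)/(e(N) - 40 - 2))
-----------end sketch-----------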
Joseph Coveney
------------begin do file-------------
set more off
*
* uncorrelated case
*
clear
set obs 40
set seed 20021119
generate byte org_name=_n
forvalues tim=1/3 {
generate int dep`tim'= /*
*/round(100.0+15.0*invnorm(uniform()), 1)
}
reshape long dep, i(org_name) j(tim)
xi: xtreg dep i.tim, i(org_name) fe
xi: areg dep i.tim, absorb(org_name)
quietly tabulate org_name, generate(dum)
drop dum40
xi: regress dep i.tim dum*
test _Itim_2 _Itim_3
anova dep tim org_name
xi: areg dep i.tim, absorb(org_name) robust
xi: regress dep i.tim dum*, robust
*
* correlated case
*
drawnorm dep1 dep2 dep3, n(40) /*
*/ means(100 100 100) sd(15 15 15) /*
*/ corr(1.0, 0.7, 0.7 \ 0.7, 1.0, 0.7 \ 0.7, 0.7, 1.0) /*
*/ clear
generate byte org_name=_n
reshape long dep, i(org_name) j(tim)
xi: xtreg dep i.tim, i(org_name) fe
xi: areg dep i.tim, absorb(org_name)
quietly tabulate org_name, generate(dum)
drop dum40
xi: regress dep i.tim dum*
test _Itim_2 _Itim_3
anova dep tim org_name
xi: areg dep i.tim, absorb(org_name) robust
xi: regress dep i.tim dum*, robust
exit
-------------end do file--------------