Dear List Members -
I have panel data on student achievement. I want to estimate a model that
includes fixed effects for students, schools and time:
Yijt = ai + bj + ct + dXijt + eijt
where i indexes students, j indexes schools and t indexes time. The number
of time periods is small so I can include explicit time dummies to control
for time:
(1) Yijt = ai + bj + c1T1 + ... + cnTn + dXijt + eijt
However, the numbers of students and schools are both large, so running
OLS with a full set of dummy variables is not feasible.
According to Greene (Econometric Analysis, 2nd ed., pp. 468-469) a solution
to the problem is to estimate the following:
(2) Y*ijt = f1T1* + ... + fnTn* + gX*ijt + e*ijt
where Y*ijt = Yijt - (student's mean Y over time) - (school's mean Y over
all students) + (mean Y averaged over all students and schools). Likewise
for T1*...Tn* and X*.
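Written out in full, the transformation behind (2) looks like the following. This is only a restatement of the description above; the bar-and-dot notation is shorthand introduced here for illustration and is not taken from Greene.

```latex
% Double-demeaning of Y (the same transformation is applied to X and to each T_k).
% \bar{Y}_{i\cdot\cdot}     : mean of Y over all observations on student i
% \bar{Y}_{\cdot j\cdot}    : mean of Y over all observations in school j
% \bar{Y}_{\cdot\cdot\cdot} : mean of Y over all observations
\[
  Y^{*}_{ijt} = Y_{ijt} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot}
                + \bar{Y}_{\cdot\cdot\cdot}
\]
% Equation (2) in the transformed variables:
\[
  Y^{*}_{ijt} = f_{1} T^{*}_{1,ijt} + \dots + f_{n} T^{*}_{n,ijt}
                + g X^{*}_{ijt} + e^{*}_{ijt}
\]
```

In the standard two-way case, where units are fully crossed with time periods in a balanced panel, this transformation removes both sets of effects exactly. Whether the same holds when the two sets of effects are students and schools is what the comparison below is meant to check.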
It would seem that this approach could be implemented in Stata in either of
the following ways:
(a) explicitly calculate the de-meaned variables, Y*, T1*...Tn* and X*, and
run reg using these de-meaned variables
(b) take the difference between each observation and the school mean (i.e.,
Yijt - (school mean over all students), etc.) and run xtreg or areg with
student fixed effects.
I have run both models (a) and (b) on a small data set where I can also
estimate the model with explicit student, school and time dummies.
Both methods (a) and (b) yield coefficient estimates that are different
from one another and different from the model with explicit dummy variables
for all three effects. Bob Bifulco (U. Conn.) has been working on the same
problem with a different data set and finds the same discrepancies.
A copy of my .do file and results follows. Any suggestions would
be greatly appreciated.
Tim
. * ******************************************************* ;
. * Set Panel Variables ;
. * ******************************************************* ;
. tsset student year ;
panel variable: student, 114 to 872489
time variable: year, 1999 to 2001, but with gaps
. * ******************************************************* ;
. * Create Differenced Variables ;
. * ******************************************************* ;
. * Determine obs. where one or more model variables are missing;
. egen nmiss = rmiss(nrtrgain charter nschools chgschl student instid) ;
. * Create student group means;
. bysort student:egen nrtrgain_m = mean(nrtrgain) if nmiss==0;
. bysort student:egen chgschl_m = mean(chgschl) if nmiss==0;
. bysort student:egen t2001_m = mean(t2001) if nmiss==0;
. * Create school group means;
. bysort instid:egen nrtrgain_n = mean(nrtrgain) if nmiss==0;
. bysort instid:egen chgschl_n = mean(chgschl) if nmiss==0;
. bysort instid:egen t2001_n = mean(t2001) if nmiss==0;
. *Create overall mean;
. egen nrtrgain_m2 = mean(nrtrgain) if nmiss==0;
. egen chgschl_m2 = mean(chgschl) if nmiss==0;
. egen t2001_m2 = mean(t2001) if nmiss==0;
. *Demean variables (from student mean);
. gen de_std_nrtrgain = nrtrgain - nrtrgain_m + nrtrgain_m2;
. gen de_std_chgschl = chgschl - chgschl_m + chgschl_m2;
. gen de_std_t2001 = t2001 - t2001_m + t2001_m2;
. *Demean variables (from school mean);
. gen de_sch_nrtrgain = nrtrgain - nrtrgain_n + nrtrgain_m2;
. gen de_sch_chgschl = chgschl - chgschl_n + chgschl_m2;
. gen de_sch_t2001 = t2001 - t2001_n + t2001_m2;
. *Demean variables (from school mean -- excluding overall mean);
. gen de_sch2_nrtrgain = nrtrgain - nrtrgain_n;
. gen de_sch2_chgschl = chgschl - chgschl_n;
. gen de_sch2_t2001 = t2001 - t2001_n;
. *Demean variables (from both student and school means);
. gen de_stdsch_nrtrgain = nrtrgain - nrtrgain_m - nrtrgain_n + nrtrgain_m2;
. gen de_stdsch_chgschl = chgschl - chgschl_m - chgschl_n + chgschl_m2;
. gen de_stdsch_t2001 = t2001 - t2001_m - t2001_n + t2001_m2;
. * ******************************************************* ;
. * Student, Time and School Fixed Effects ;
. * ******************************************************* ;
. * use dummies for student, time and school;
. xi: reg nrtrgain chgschl t2001
i.student i.instid if nmiss==0;
Number of obs = 2484
F(1567, 916) = 0.71
Prob > F = 1.0000
R-squared = 0.5481
Adj R-squared = -0.2249
------------------------------------------------------------------------------
nrtrgain | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
chgschl | -12.92698 3.263805 -3.96 0.000 -19.33238 -6.521578
t2001 | -7.521005 2.609728 -2.88 0.004 -12.64274 -2.399266
_cons | 167.5983 36.31196 4.62 0.000 96.33398 238.8626
------------------------------------------------------------------------------
. * demean all variables (including time dummies);
. * with respect to student means and school means ;
. * and run reg;
. reg de_stdsch_nrtrgain de_stdsch_chgschl de_stdsch_t2001
if nmiss==0;
Number of obs = 2484
F( 2, 2481) = 215.09
Adj R-squared = 0.1471
------------------------------------------------------------------------------
de_stdsch_~n | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
de_stdsch_~l | -20.81318 1.072797 -19.40 0.000 -22.91685 -18.70951
de_stds~2001 | -12.51805 .9687572 -12.92 0.000 -14.41771 -10.6184
_cons | 2.93e-07 .4002361 0.00 1.000 -.7848309 .7848315
------------------------------------------------------------------------------
. * demean all variables (including time dummies);
. * with respect to school means (but don't add in overall mean) ;
. * and run areg;
. areg de_sch2_nrtrgain de_sch2_chgschl de_sch2_t2001
if nmiss==0, absorb(student);
Number of obs = 2484
F( 2, 919) = 14.33
Prob > F = 0.0000
R-squared = 0.4858
Adj R-squared = -0.3894
------------------------------------------------------------------------------
de_sch2_nr~n | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
de_sch2_ch~l | -11.81104 2.247681 -5.25 0.000 -16.22222 -7.399856
de_sch2~2001 | -6.620663 1.791643 -3.70 0.000 -10.13685 -3.104476
_cons | -1.82e-07 .6314745 -0.00 1.000 -1.2393 1.239299
-------------+----------------------------------------------------------------
student | F(1562, 919) = 0.523 1.000 (1563 categories)
Tim R. Sass
Professor Voice: (850)644-7087
Department of Economics Fax: (850)644-4535
Florida State University E-mail: [email protected]
Tallahassee, FL 32306-2180 Internet: http://garnet.acns.fsu.edu/~tsass