Re: st: RE: questions about Fixed Effect models
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: RE: questions about Fixed Effect models
Date: Fri, 25 Feb 2011 16:51:18 -0500
Jeff--
One problem when computing the variance of the teacher effects is that
these are noisily estimated; see e.g. the discussion in -fese- on SSC:
http://fmwww.bc.edu/repec/bocode/f/fese.html
"
[FE] standard errors are not usually computed in a fixed-effects
regression, but we may need them. One example takes student test
scores as the dependent variable and teacher assignments as the
explanatory variables, as in Rothstein (2007), where the fixed effects
measure the assumed additive effect of a teacher on her students' test
scores. The variance of estimated fixed effects captures both the
variance of true fixed effects and the variance of the estimator: the
variance of true fixed effects (i.e. how disparate are teachers'
apparent impacts on students' scores) can be estimated as the observed
variance in estimated fixed effects less the best estimate for the
variance of the estimator, which is the mean of squared standard
errors.
"
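A minimal sketch of that calculation, assuming the estimated teacher
effects and their standard errors have already been saved to a dataset
with one observation per teacher. The variable names b and se are
hypothetical placeholders, not names that -fese- or any other command
is known to create:

```stata
* Sketch only. Assumes one observation per teacher, with
*   b  = estimated teacher effect       (hypothetical variable name)
*   se = standard error of that effect  (hypothetical variable name)

* Squared standard errors, one per teacher
generate double se2 = se^2

* Observed variance of the estimated effects
quietly summarize b
scalar v_est = r(Var)

* Estimate of the estimator's variance: mean of squared standard errors
quietly summarize se2
scalar v_noise = r(mean)

* Implied variance of the true effects = observed variance - noise
display "Variance of estimated effects:    " v_est
display "Mean squared standard error:      " v_noise
display "Implied variance of true effects: " v_est - v_noise
```

If the last number comes out negative, the estimated effects are no
more dispersed than sampling noise alone would suggest.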
Putting in lagged achievement (test score) is not in general a good
idea, since it is measured with error. If you instrument for lagged
achievement you will get a coefficient near one, whereas if you treat
it as measured without error you get a coefficient nearer 0.6, which
is presumably biased downward by classical measurement error. The
equation y_t = .6 y_{t-1} + Xb may be subtracting off the wrong
quantity, effectively regressing y_t - .6 y_{t-1} on X, which can
introduce bias in estimates of b (the additional .4 times lagged
achievement left in the error may be correlated with X). On the other
hand, some would argue that "decay" means that the true coefficient on
lagged achievement should not really be one.
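The bias argument can be written out as follows. This is only a sketch
under simplifying assumptions that are mine: the true coefficient on
lagged achievement is exactly one, X is a single regressor, and X is
uncorrelated with the structural error u.

```latex
% Assumed true model: coefficient of 1 on lagged achievement
y_t = y_{t-1} + X b + u

% Using 0.6 instead leaves 0.4 y_{t-1} in the error term
y_t - 0.6\, y_{t-1} = X b + \left( 0.4\, y_{t-1} + u \right)

% OLS of the left-hand side on X alone, with Cov(X, u) = 0
\operatorname{plim} \hat{b}
  = b + 0.4\, \frac{\operatorname{Cov}(X, y_{t-1})}{\operatorname{Var}(X)}
```

So the estimate of b is consistent only if X is uncorrelated with
lagged achievement, which is unlikely when X contains teacher
assignments.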
On Fri, Feb 25, 2011 at 4:27 PM, Wooldridge, Jeffrey <[email protected]> wrote:
> Because I've been doing some work estimating teacher value added, I'll take a crack at this. First is an issue of terminology. While it is common to say things like "teacher fixed effects" when using student-level data, I'm not sure using the fixed effects commands in Stata (xtreg) is the right way to go. In fact, mechanically I'm not sure how you're doing it. Aren't the teacher effects the main quantities of interest? If so, you should just be using pooled OLS, putting in the lagged proficiency, and then including a full set of teacher dummy variables (with, presumably, a base teacher represented by the constant).
>
> None of the statistics that you mention would be relevant except perhaps the error variances. An interesting calculation is to compute the usual variance of the OLS residuals and then also compute the variance of the teacher effects. (You might have to export them or put them into a Stata matrix to do this.) This would tell you how important the teacher effect is relative to the overall variance in student proficiency.
>
> My Stata session would look something like this:
>
> xtset studentid year
> gen score_1 = l.score
> reg score score_1 i.teacherid i.year, cluster(studentid)
>
> (or just include a full set of teacher dummies if you have created them).
>
> It is also common to use student fixed or random effects on the differenced score:
>
> gen dscore = d.score
> xtreg dscore i.teacherid i.year, fe cluster(studentid)
>
> Jeff W.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stata Email
> Sent: Friday, February 25, 2011 3:05 PM
> To: [email protected]
> Subject: st: questions about Fixed Effect models
>
> Dear Statalist members
>
> I am new to panel data and I am working with fixed effect models. I
> would like to confirm if I am doing the right thing.
>
> When working with panel data, the data set is such that we have
> information about individuals i and we observe these individuals
> through different time periods t. My questions are
>
> 1) Which part of the Stata output shows me that the fixed effect is important?
> 2) What exactly do R-sq within and R-sq between mean?
> 3) If I run a fixed effect model, the sigma-u is the std dev of the
> residuals inside (within) each group of individuals i. So a higher
> number means that I have more variability inside each group?
> 4) sigma-e shows the std dev of the residuals after excluding the
> variability inside each group i? If that is true, a higher number
> means that I have a big variability among groups i and therefore the
> fixed effect is important?
>
> Now let me explain what kind of data set I have. I have a data set
> with the proficiency level of students, followed for 5 years. But I
> know who is the teacher for every student in all 5 years. I want to
> calculate a teacher fixed effect (and I control for the proficiency
> level from the previous year instead of having a student fixed
> effect). My other questions are
>
> 5) My individuals i here are the teachers and, instead of having a
> time t, I have students s with the same teacher
> 6) All within statistics will refer to the differences among students
> with the same teacher?
>
> I really appreciate any comment
> Isabel