Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: questions about Fixed Effect models
From: Stata Email <[email protected]>
To: [email protected]
Subject: Re: st: RE: questions about Fixed Effect models
Date: Thu, 3 Mar 2011 13:41:18 -0300
Dear Jeff and Austin,
Thanks for your comments, they were really insightful.
I tried the specification Jeff mentioned. My question now is: after
estimating the model with and without the teacher dummies, I should
run "predict newvar, residuals" for each model and then compare the
variances of the two sets of residuals. Is that correct? The smaller the
variance of the residuals in the model with teacher dummies (compared
with the variance in the model without dummies), the more important the
teacher effect. Is that correct?
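A minimal sketch of this comparison, using simulated data in place of the real panel. The variable names follow Jeff's example quoted below; the sample sizes, the 0.6 slope and the standard deviations are assumptions chosen only for illustration. Because adding regressors can never raise the in-sample residual variance, the size of the drop and a joint test of the teacher dummies say more than the direction of the change alone:

```stata
* Hypothetical data: 40 teachers with 50 students each (all numbers assumed)
clear
set seed 2011
set obs 40
gen teacherid = _n
gen double teff = rnormal(0, 0.3)          // true teacher effect
expand 50
gen double score_1 = rnormal()             // lagged score
gen double score = 0.6*score_1 + teff + rnormal(0, 0.8)

* Model without teacher dummies
regress score score_1
predict double res0, residuals

* Model with a full set of teacher dummies
regress score score_1 i.teacherid
predict double res1, residuals
testparm i.teacherid                       // joint F test of the teacher dummies

* Residual variances of the two models
quietly summarize res0
scalar v0 = r(Var)
quietly summarize res1
scalar v1 = r(Var)
scalar share = (v0 - v1)/v0
display "residual variance without dummies: " v0
display "residual variance with dummies:    " v1
display "share removed by teacher dummies:  " share
```

On real data the two regressions should be run on the same estimation sample, and clustering by student can be added as in Jeff's command.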
In fact, I have tried the model with dummies before (but without
clustering by student). But I was having problems because:
- I had too many teachers in my sample
- I want to interact the teacher dummies with another four dummies
(that differentiate students according to their previous proficiency
level). This is the main purpose of the paper: I want to analyze to
what extent these "teacher effects" differ for different kinds
of students
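
One way the interaction described above might be written with factor-variable notation, shown on simulated data. The variable pgroup (prior-proficiency group, 1 to 4) and all numeric values are hypothetical; in this made-up example the teacher effect is twice as large for group 1. With very many teachers a regression like this can become slow or run into size limits, so it is a sketch rather than a recommendation:

```stata
* Hypothetical data: 20 teachers, 200 students each, 4 prior-proficiency groups
clear
set seed 2011
set obs 20
gen teacherid = _n
gen double teff = rnormal(0, 0.3)          // true teacher effect (assumed)
expand 200
gen pgroup = ceil(4*runiform())            // prior-proficiency group, 1 to 4
gen double score_1 = rnormal()             // lagged score
gen double score = 0.6*score_1 + teff*cond(pgroup == 1, 2, 1) + rnormal()

* Teacher and group main effects plus all teacher-by-group interactions
regress score score_1 i.teacherid##i.pgroup

* Do the teacher effects differ across proficiency groups?
testparm i.teacherid#i.pgroup
```

The joint test on the interaction terms asks whether teacher effects vary by prior proficiency, which is the question the separate regressions by group address less directly.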
Since I was not able to estimate the model as described above, I
ran separate regressions for 4 groups of students (based on their
position in the proficiency distribution in t-1), using
xtreg with teacher fixed effects.
But I believe that what Jeff proposed may help me to interpret the results.
Jeff, if you could send me the paper on VAM that you are working on, I
would really appreciate it.
Thanks again for your help.
P.
On Fri, Feb 25, 2011 at 7:05 PM, Wooldridge, Jeffrey <[email protected]> wrote:
> Hi Austin:
>
> The situation here is a bit different: unlike in the case of using the sample variance for, say, student fixed effects based on five time periods -- the usual situation -- the teacher effects are hopefully estimated using more like 100 students. This makes them much more precise. It's true that in the usual case the naïve sample variance is systematically biased. I agree that there is an adjustment that can be used in the teacher effect case, but it will be less important.
>
> I've been doing a lot of simulations with co-authors on VAM estimation, and, although the setting is necessarily simplified, by far the most robust method for estimating the teacher effects is to use dynamic regression. Even when the theoretically best method is random effects on the first difference, the dynamic OLS estimator does almost as well. In other words, it is theoretically inconsistent but does a good job with the teacher effects. Jesse's work does not recognize several important points:
>
> 1. Measurement error in the lagged y does not necessarily mean the teacher effects are badly estimated. My simulations with Cassie Guarino and Mark Reckase suggest otherwise.
>
> 2. In terms of the assignment of teachers to students, principals may very well be using the lagged observed score, which makes it the right thing to control for.
>
> 3. Jesse's argument is more from the perspective of the structural production function literature. My simulation work indicates this is much too limited. Even methods such as Arellano and Bond work much less well in certain simulations for estimating the teacher effects than just dynamic OLS.
>
> Our paper got rejected from the QJE and we are currently revising it. I'd be happy to send you an older version.
>
> Cheers, Jeff
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Austin Nichols
> Sent: Friday, February 25, 2011 4:51 PM
> To: [email protected]
> Subject: Re: st: RE: questions about Fixed Effect models
>
> Jeff--
> One problem when computing the variance of the teacher effects is that
> these are noisily estimated; see e.g. the discussion in -fese- on SSC:
> http://fmwww.bc.edu/repec/bocode/f/fese.html
> "
> [FE] standard errors are not usually computed in a fixed-effects
> regression, but we may need them. One example takes student test
> scores as the dependent variable and teacher assignments as the
> explanatory variables, as in Rothstein (2007), where the fixed effects
> measure the assumed additive effect of a teacher on her students' test
> scores. The variance of estimated fixed effects captures both the
> variance of true fixed effects and the variance of the estimator: the
> variance of true fixed effects (i.e. how disparate are teachers'
> apparent impacts on students' scores) can be estimated as the observed
> variance in estimated fixed effects less the best estimate for the
> variance of the estimator, which is the mean of squared standard
> errors.
> "
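
A minimal sketch of the correction described in the quoted passage, on simulated data. It uses simple teacher means as the estimated effects rather than calling -fese-, and every number (50 teachers, 100 students each, the two variances) is an assumption made for illustration:

```stata
* Hypothetical data: 50 teachers, 100 students each (all numbers assumed)
clear
set seed 2011
set obs 50
gen teacherid = _n
gen double teff = rnormal(0, 0.2)          // true teacher effect, variance 0.04
expand 100
gen double score = teff + rnormal()        // student-level noise, variance 1

* Estimated effect = teacher mean; squared SE = s^2/n within teacher
collapse (mean) est = score (sd) s = score (count) n = score, by(teacherid)
gen double se2 = s^2/n

quietly summarize est
scalar v_naive = r(Var)                    // variance of estimated effects
quietly summarize se2
scalar v_noise = r(mean)                   // mean of squared standard errors
scalar v_corr = v_naive - v_noise          // estimate of true-effect variance
display "naive: " v_naive "   noise: " v_noise "   corrected: " v_corr
```

With about 100 students per teacher the noise term is small relative to the naive variance, which is consistent with Jeff's remark that the adjustment matters less in the teacher-effect case.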
> Putting in lagged achievement (test score) is not in general a good
> idea, since this is measured with error--if you instrument for lagged
> achievement you will get a coef near one, whereas if you treat it as
> measured without error, you get a coef nearer 0.6 which is presumably
> biased downward by classical measurement error. The equation y_t = .6
> y_{t-1} + Xb may be subtracting off the wrong quantity, effectively
> regressing y_t - .6 y_{t-1} on X, which can introduce bias in
> estimates of b (the additional .4 times lagged achievement left in the
> error may be correlated with X). On the other hand, some would argue that "decay"
> means that the true coef on lagged achievement should not really be
> one.
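
The attenuation argument above can be illustrated with a small simulation. This is a sketch under assumed values: the true coefficient on lagged achievement is set to 1, the measurement-error variance is chosen so that OLS lands near 0.6, and the instrument z1 is a hypothetical second noisy measure of lagged achievement (the thread does not say which instrument is meant):

```stata
* Hypothetical data; all variances are assumed for illustration
clear
set seed 2011
set obs 5000
gen double a0 = rnormal()                  // true lagged achievement
gen double y1 = a0 + rnormal(0, 0.8)       // observed lagged score, with error
gen double z1 = a0 + rnormal(0, 0.8)       // second noisy measure, used as instrument
gen double y  = a0 + rnormal(0, 0.5)       // current score; true coefficient is 1

* OLS on the noisy lag: probability limit is 1/(1 + 0.64), about 0.61
regress y y1

* Instrumenting the noisy lag with the second measure: coefficient near 1
ivregress 2sls y (y1 = z1)
```

The simulation leaves out teacher effects and "decay", so it speaks only to the coefficient on the lagged score, not to how well the teacher effects themselves are estimated.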
>
> On Fri, Feb 25, 2011 at 4:27 PM, Wooldridge, Jeffrey <[email protected]> wrote:
>> Because I've been doing some work estimating teacher value added, I'll take a crack at this. First is an issue of terminology. While it is common to say things like "teacher fixed effects" when using student-level data, I'm not sure using the fixed effects commands in Stata (xtreg) is the right way to go. In fact, mechanically I'm not sure how you're doing it. Aren't the teacher effects the main quantities of interest? If so, you should just be using pooled OLS, putting in the lagged proficiency, and then including a full set of teacher dummy variables (with, presumably, a base teacher represented by the constant).
>>
>> None of the statistics that you mention would be relevant except perhaps the error variances. An interesting calculation is to compute the usual variance of the OLS residuals and then also compute the variance of the teacher effects. (You might have to export them or put them into a Stata matrix to do this.) This would tell you how important the teacher effect is relative to the overall variance in student proficiency.
>>
>> My Stata session would look something like this:
>>
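>> * declare the panel: students observed over years
>> xtset studentid year
>> * lagged score (previous year's proficiency)
>> gen score_1 = l.score
>> * pooled OLS with teacher and year dummies, SEs clustered by student
>> reg score score_1 i.teacherid i.year, cluster(studentid)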
>>
>> (or just include a full set of teacher dummies if you have created them).
>>
>> It is also common to use student fixed or random effects on the differenced score:
>>
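>> * first difference of the score (requires the xtset above)
>> gen dscore = d.score
>> * student fixed effects on the differenced score, SEs clustered by student
>> xtreg dscore i.teacherid i.year, fe cluster(studentid)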
>>
>> Jeff W.
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Stata Email
>> Sent: Friday, February 25, 2011 3:05 PM
>> To: [email protected]
>> Subject: st: questions about Fixed Effect models
>>
>> Dear Statalist members
>>
>> I am new to panel data and I am working with fixed effect models. I
>> would like to confirm that I am doing the right thing.
>>
>> When working with panel data, the data set is such that we have
>> information about individuals i and we observe these individuals
>> through different time periods t. My questions are
>>
>> 1) Which part of the Stata output shows me that the fixed effect is important?
>> 2) What exactly do R-sq within and R-sq between mean?
>> 3) If I run a fixed effect model, is sigma-u the std dev of the
>> residuals inside (within) each group of individuals i? So a higher
>> number means that I have more variability inside each group?
>> 4) Does sigma-e show the std dev of the residuals after excluding the
>> variability inside each group i? If that is true, does a higher number
>> mean that I have a large variability among groups i and therefore the
>> fixed effect is important?
>>
>> Now let me explain what kind of data set I have. I have a data set
>> with the proficiency level of students, followed for 5 years. I also
>> know who the teacher is for every student in all 5 years. I want to
>> calculate a teacher fixed effect (and I control for the proficiency
>> level from the previous year instead of having a student fixed
>> effect). My other questions are:
>>
>> 5) My individuals i here are the teachers and, instead of having a
>> time t, I have students s with the same teacher. Is that right?
>> 6) Will all within statistics refer to the differences among students
>> with the same teacher?
>>
>> I would really appreciate any comments.
>> Isabel
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/