From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Panel data: large number of linear time trends
Date: Thu, 10 May 2012 10:05:22 -0400

ron alfieri <ron.alfieri18@gmail.com>:
You are using different samples in different detrending regressions.
It is easy to constrain samples, though:

clear all
prog mydetrend, rclass byable(recall)
 version 10.1
 syntax varlist [if] [in], DETrend(varname)
 tempvar eps
 marksample touse
 regress `varlist' if `touse'
 predict double `eps' if e(sample), res
 replace `detrend' = `eps' if e(sample)
end

webuse grunfeld
replace invest = . in 4
replace invest = . in 6
replace mvalue = . in 8
replace mvalue = . in 13
replace invest = . in 6
replace invest = . in 7
replace invest = . in 11
replace invest = . in 15
replace invest = . in 21

g i_dtr = .
g mv_dtr = .
g m=mvalue if !mi(invest)
g i=invest if !mi(mvalue)
by company: mydetrend i year, det(i_dtr)
by company: mydetrend m year, det(mv_dtr)
areg mv_dtr i_dtr, abs(company)
reg mvalue c.invest c.year##i.company

On Wed, May 9, 2012 at 8:15 PM, ron alfieri <ron.alfieri18@gmail.com> wrote:
> Thank you Austin! It seems that the differences are due to my panel
> being unbalanced. Using the prior example you can see that both
> methods produce different results when dropping some observations to
> make the panel unbalanced.
>
> clear all
> prog mydetrend, rclass byable(recall)
>  version 10.1
>  syntax varlist [if] [in], DETrend(varname)
>  tempvar eps
>  marksample touse
>  regress `varlist' if `touse'
>  predict double `eps' if e(sample), res
>  replace `detrend' = `eps' if e(sample)
> end
>
> webuse grunfeld
> replace invest = . in 4
> replace invest = . in 6
> replace mvalue = . in 8
> replace mvalue = . in 13
> replace invest = . in 6
> replace invest = . in 7
> replace invest = . in 11
> replace invest = . in 15
> replace invest = . in 21
>
> g i_dtr = .
> g mv_dtr = .
> by company: mydetrend invest year, det(i_dtr)
> by company: mydetrend mvalue year, det(mv_dtr)
> areg mv_dtr invest, abs(company)
> areg mv_dtr i_dtr, abs(company)
> reg mvalue c.invest c.year##i.company
>
>
> If you can run the interacted version, e.g.
> reg mvalue c.invest c.year##i.company
> in the link cited, why wouldn't you?
>
> Because I have too many zip codes to include them all as covariates.
>
> Thanks again.
>
> On Wed, May 9, 2012 at 4:43 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>> ron alfieri <ron.alfieri18@gmail.com>:
>> You don't show what you typed, and it is not clear what you mean by:
>> "an interaction between the fixed effect for each zip code and a
>> linear time trend"
>> --if you mean you interacted a full set of dummies with time, then I
>> would expect the same point estimates in both.
>>
>> Are you neglecting to mention other covariates perhaps?
>>
>> If you can run the interacted version, e.g.
>> reg mvalue c.invest c.year##i.company
>> in the link cited, why wouldn't you?
>>
>> On Wed, May 9, 2012 at 3:26 PM, ron alfieri <ron.alfieri18@gmail.com> wrote:
>>> I am trying to estimate a panel data model with a large number of
>>> unit-specific linear time trends (one for each zip code).
>>>
>>> I am using the method proposed here:
>>>
>>> http://www.stata.com/statalist/archive/2012-02/msg01108.html
>>>
>>> Using a subset of my data, I tried using your method and then compared
>>> the results to the results from a model where I include zip-code
>>> specific time trends by adding as covariates an interaction between
>>> the fixed effect for each zip code and a linear time trend.
>>>
>>> The results are very similar, but not identical.
>>>
>>> This is how I am interpreting the differences. When de-trending the
>>> data for one zip-code at a time your code uses only the data points
>>> from that zip code. However, all data points are used when estimating
>>> zip-code specific trends by adding as covariates the interactions
>>> between the fixed effect for each zip code and a linear trend (with
>>> “all data points” I mean even the data points where these interactions
>>> take the value of zero that are not used when doing it one zip code at
>>> a time).
>>>
>>> I would appreciate any comments on whether I am interpreting the
>>> differences between these two methods correctly. If anyone has an
>>> insight on whether one of the methods is more “appropriate” than the
>>> other that would be great.
>>>
>>> Aaron

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
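
The reply at the top of this thread says the two approaches differ only because the detrending regressions ran on different samples. What follows is a minimal sketch of one way to check that, not the code from the thread. It marks a single common sample and detrends both variables on it, company by company. The names touse, r, b_detrended and b_interacted are chosen here for illustration, the two missing values are arbitrary, and double storage is an added assumption to limit rounding error.

```stata
* Sketch: detrend on a common sample, then compare point estimates.
webuse grunfeld, clear

* Make the panel unbalanced (arbitrary rows, for illustration only).
replace invest = . in 4
replace mvalue = . in 8

* One marker for rows where BOTH variables are observed.
gen byte touse = !missing(invest, mvalue)

gen double i_dtr  = .
gen double mv_dtr = .

* Remove a company-specific intercept and linear trend from each variable,
* always restricting to the common sample.
levelsof company, local(ids)
foreach c of local ids {
    quietly regress invest year if company == `c' & touse
    quietly predict double r if e(sample), residuals
    quietly replace i_dtr = r if e(sample)
    drop r

    quietly regress mvalue year if company == `c' & touse
    quietly predict double r if e(sample), residuals
    quietly replace mv_dtr = r if e(sample)
    drop r
}

* Slope from the detrended data.
quietly regress mv_dtr i_dtr
scalar b_detrended = _b[i_dtr]

* Slope from the fully interacted model.
quietly regress mvalue c.invest c.year##i.company
scalar b_interacted = _b[invest]

display "detrended:  " b_detrended
display "interacted: " b_interacted
```

By the Frisch-Waugh-Lovell theorem the two slopes should agree up to rounding once the samples match. The standard errors will generally not agree, because the second-stage regression does not account for the degrees of freedom used in detrending.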