Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: xt: unit-specific trends
Date: Thu, 19 Apr 2012 09:48:19 -0400

László Sándor <sandorl@gmail.com>:
No need to run regressions, loop, etc. You can just use a little algebra and by:
http://www.stata.com/statalist/archive/2012-02/msg01108.html
http://www.stata.com/statalist/archive/2008-10/msg00136.html
though it will be faster and more accurate in Mata.
If you decide to move into Mata, see also e.g.
http://www.stata.com/statalist/archive/2009-05/msg00841.html

2012/4/19 László Sándor <sandorl@gmail.com>:
> Quick comments on this:
>
> I forgot to flag that the residual variable needs to exist beforehand
> for -genbump- below; the program only replaces its values.
>
> More importantly: The operation is still far, far from linear in the
> number of individuals (N in the panel; T is fixed). I could again
> finish a 1% subsample in around 10 minutes or so, but my bold attempt
> at 10% overnight still only finished 4 out of the 8 variables to be
> transformed this way in 10 or 11 hours.
>
> Maybe caching and memory are an issue here, but if anybody (StataCorp?)
> had a comment on this otherwise, that would be helpful.
>
> Maybe firing up _regress and _predict all the time is very costly? Or
> the marksample is not fast enough with the by option? (Does the code
> know that once it finished with seven consecutive rows there is
> nothing to check further below "whether" `touse' is 1 anywhere else? I
> guessed byable commands produce efficient subscripting for some
> underlying Mata code…) Or even the byable command does not use MP
> resources efficiently? (Still, even remaining serial, the speed-up
> could be much closer to linear, no?)
>
> I thought individual-specific trends are almost as trendy nowadays as
> fixed effects, so I wonder if they could be done much faster.
>
> Thanks,
>
> Laszlo
>
> 2012/4/18 László Sándor <sandorl@gmail.com>:
>> In case anyone cares, this is what I came up with. (Detrends, demeans,
>> and also allows for a level shift.) And this is faster, as I expected.
>>
>> program define genbump, byable(recall, noheader)
>>     version 11
>>     syntax =/exp [if] [in], trend(varname) bump(varname) resid(varname)
>>     marksample touse, novarlist
>>     tempvar res
>>     quietly {
>>         _regress `exp' `trend' `bump' if `touse'
>>         _predict `res', resid
>>         replace `resid' = `res'+_b[`bump']*`bump' if `touse'
>>     }
>> end
>>
>>
>> 2012/4/18 László Sándor <sandorl@gmail.com>:
>>> Thanks, Nick,
>>>
>>> I left out a crucial part: I need to run it for a number of
>>> observations on the order of 10K (full sample: 400K, but I also try
>>> to sample down).
>>>
>>> I just had the 200 / 4 mins as a measure of speed.
>>>
>>> I would really love to see this speed up.
>>>
>>> So I should make the residual-generation a separate command, and make
>>> it byable (but no egen), then? Any other trick up your sleeve?
>>>
>>> Gratefully, as always,
>>>
>>> Laszlo
>>>
>>> On Wed, Apr 18, 2012 at 7:56 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>> If a total task takes 3-4 minutes, dots to show progress are
>>>> pointless, in my view.
>>>>
>>>> -egen- is for convenience. Writing an -egen- function will not speed
>>>> things up; it will just slow things down.
>>>>
>>>> Nick
>>>>
>>>> 2012/4/19 László Sándor <sandorl@gmail.com>:
>>>>> Or a quick idea: Shall I write an -egen- extension instead? Or would
>>>>> all the benefits come from its byability anyway?
>>>>>
>>>>> 2012/4/18 László Sándor <sandorl@gmail.com>:
>>>>>> Let me get back to this now that I know how fast I am doing using -_dots-.
>>>>>>
>>>>>> Now I know it takes 3-4 minutes to loop through 200 cases while all I
>>>>>> do each time is a trivial regression on 4-7 observations and
>>>>>> predicting the residuals.
>>>>>>
>>>>>> I would greatly welcome suggestions on how to speed this up relative
>>>>>> to the code below. Most likely, avoiding a check of every case for
>>>>>> the -if- condition, when only a few satisfy it and they could come in
>>>>>> blocks after a single sort, would help, but I am out of ideas how to
>>>>>> do that.
>>>>>> Would making the code "byable" at least use some features of MP?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Laszlo
>>>>>>
>>>>>> sum nid, d
>>>>>> _dots 0
>>>>>> forval i = 1/`r(max)' {
>>>>>>     foreach v of varlist assets liabs netassets koejd {
>>>>>>         cap reg `v' year post if nid == `i'
>>>>>>         if _rc == 0 {
>>>>>>             predict resid, resid
>>>>>>             qui replace r`v' = resid + _b[post]*post if e(sample)
>>>>>>             drop resid
>>>>>>         }
>>>>>>     }
>>>>>>     _dots `i' 0
>>>>>> }
>>>>>>
>>>>>> 2012/4/13 László Sándor <sandorl@gmail.com>:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am trying to demean and detrend my panel data allowing for unit
>>>>>>> specific trends (using Stata 11.0 MP for Windows). I found some
>>>>>>> previous posts about this, but I am not satisfied with the speed of
>>>>>>> the solutions. I would be most happy with a "byable" solution, like
>>>>>>> this pseudocode:
>>>>>>>
>>>>>>> bys id: {
>>>>>>>     reg var t
>>>>>>>     pred dtrended_var, res
>>>>>>> }
>>>>>>>
>>>>>>> I know this is not possible. However, looping through my ids and if
>>>>>>> conditions is not feasible either (or I collect them into a local with
>>>>>>> -levelsof-?). Actually, with all the if conditions, it is not
>>>>>>> attractive either, let alone feasible. (Or if I sort by id, I can use
>>>>>>> in conditions in the balanced subset, which I presume to be much
>>>>>>> faster?)
>>>>>>>
>>>>>>> Or shall I just loop over a new id that will be consecutive integers
>>>>>>> if I -egen, group- the old id (or do the same with ins)?
>>>>>>>
>>>>>>> I had some hopes about -xtdata- or -areg-, but to no avail. Still, I
>>>>>>> am looking for some guidance on doing this the right way, if even the
>>>>>>> simple -areg- could have been made faster by "orders of magnitude"
>>>>>>> from Stata 11 to 12…
>>>>>>>
>>>>>>> Thank you for any thoughts,
>>>>>>>
>>>>>>> Laszlo

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
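
A minimal sketch of the "little algebra and by:" route suggested in the reply at the top of this thread, for the plain detrend-and-demean case from the original question. It is not taken from the linked posts; it only uses the textbook closed form for the slope of a regression of y on a constant and one regressor, computed within each unit. It does not include the level-shift term, so it is not a drop-in replacement for -genbump-. The variable names id, year, y, and y_detr are hypothetical, and the sketch assumes no missing values in y or year and at least two distinct years per unit.

```stata
* Sketch only: within-id residuals from regressing y on year and a constant,
* without calling -regress-. All variable names here are hypothetical.
* Assumes no missing values in y or year (sum() treats missing as zero).
sort id

* Unit-specific means: running sum divided by group size, then copy the
* last (complete) value to every observation in the group
by id: generate double ybar = sum(y) / _N
by id: replace ybar = ybar[_N]
by id: generate double tbar = sum(year) / _N
by id: replace tbar = tbar[_N]

* Unit-specific cross-product and sum of squares of the demeaned variables
by id: generate double sty = sum((year - tbar) * (y - ybar))
by id: replace sty = sty[_N]
by id: generate double stt = sum((year - tbar)^2)
by id: replace stt = stt[_N]

* Slope is sty/stt; the residual is demeaned y minus slope times demeaned year.
* If a unit has only one distinct year, stt is 0 and y_detr is missing.
generate double y_detr = (y - ybar) - (sty / stt) * (year - tbar)

drop ybar tbar sty stt
```

Each step is a single pass over the sorted data, so the run time should grow roughly linearly in the number of units, though this has not been timed on the data discussed in the thread.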