Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Slow -rolling- regressions on panel data
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Slow -rolling- regressions on panel data
Date: Mon, 26 Sep 2011 11:13:37 -0400
Nick Cox <[email protected]>:
Except, not necessarily.
The link I provided to
http://www.stata.com/statalist/archive/2008-10/msg00136.html
indicates how you can generate 7 variables, including the regression
coefficients, without using -if- or -in- restrictions.
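For reference, here is a sketch of the closed form behind this approach (standard OLS algebra, not something taken from the linked post). For a window of n observations, all sums running over that window, the slope of a regression of y on x with a constant is the first expression; if the constant is dropped, as with -regress, noconstant-, it reduces to the second.

```latex
\[
  b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
           {n \sum x_i^2 - \left(\sum x_i\right)^2},
  \qquad
  b_{\text{nocons}} = \frac{\sum x_i y_i}{\sum x_i^2}
\]
```

So each rolling coefficient needs only a few moving sums, which can be built from lagged variables for the whole panel at once.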
Suppose the window is 5 periods instead of 16, and try:
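webuse grunfeld, clear
ren mvalue y
set type double
* regressor is the first lag of y, so b is an AR(1) slope
g x=l.y
g xy=x*y
g xx=x^2
* 5-period moving sums built from lags, so no -if- or -in- is needed
g sumxx=xx+l.xx+l2.xx+l3.xx+l4.xx
g sumxy=xy+l.xy+l2.xy+l3.xy+l4.xy
g sumx=x+l.x+l2.x+l3.x+l4.x
g sumy=y+l.y+l2.y+l3.y+l4.y
* OLS slope (with constant) from the window sums, n=5
g b=(5*sumxy-sumx*sumy)/(5*sumxx-sumx^2)
* check: b in obs 6 and 7 should match these two regressions
reg y x in 2/6, nohe
reg y x in 3/7, nohe
l com y x b in 1/7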
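The same idea might carry over to the original problem (16-period window, no constant) roughly as follows. This is only a sketch under stated assumptions: without a constant the slope is sumxy/sumxx, and the window sums are taken as differences of within-firm running sums rather than 16 explicit lag terms. The variable names (x, xy, xx, cxy, cxx, sumxy, sumxx, b) and the smaller 25-firm panel are made up for illustration.

```stata
* simulated panel as in the original post, cut to 25 firms for a quick check
clear
set seed 12345
set obs 2500
egen firm = seq(), from(1) to(25) block(100)
egen date = seq(), from(1) to(100)
generate double eps = 1 + rnormal()
tsset firm date

* regressor and cross-products, computed once for the whole panel
generate double x  = L.eps
generate double xy = x*eps
generate double xx = x^2

* running sums within firm; sum() treats missing values as zero
bysort firm (date): generate double cxy = sum(xy)
bysort firm (date): generate double cxx = sum(xx)

* 16-period window sums as differences of running sums
* (assumption: b is left missing until 16 complete pairs exist, date >= 17)
generate double sumxy = cxy - L16.cxy
generate double sumxx = cxx - L16.cxx

* no-constant slope for the window ending at each date
generate double b = sumxy/sumxx

* spot check against -regress- for firm 1, window ending at date 17
regress eps x if firm == 1 & inrange(date, 2, 17), noconstant noheader
list firm date b if firm == 1 & date == 17
```

Note that -rolling- would also report a first window (dates 1 to 16) with only 15 usable observations; the sketch skips it. Differencing running sums can lose some precision on very long panels, so storing everything as double is deliberate.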
On Mon, Sep 26, 2011 at 10:50 AM, Nick Cox <[email protected]> wrote:
> Whatever you do here is 250,000 regressions. That's the nub.
>
> Something that is going to be slow, almost always, is
>
> if level_firm == `l'
>
> as Stata will just go through all observations testing whether they qualify.
>
> Nick
>
> On Mon, Sep 26, 2011 at 3:37 PM, Richard Herron
> <[email protected]> wrote:
>> I am using -rolling- for rolling regressions on panel data, but it is
>> exceedingly slow. I found a Statalist thread
>> (http://www.stata.com/statalist/archive/2009-09/msg01239.html) with a
>> more manual solution, but it is equally slow (both are too slow to run
>> to completion in a reasonable amount of time).
>>
>> Is -regress- the bottleneck? I only want the AR(1) coefficient; is
>> there a different approach I should take? Are rolling
>> regressions/calculations best done in different software?
>>
>> Thanks!
>>
>> * ----- begin code -----
>> * generate data
>> clear
>> set obs 250000
>> egen firm = seq(), from(1) to(2500) block(100)
>> egen date = seq(), from(1) to(100)
>> generate eps = 1 + rnormal()
>> sort firm date
>> tsset firm date
>>
>> * generate variables for rolling regressions
>> bysort firm (date): generate l_eps = eps[_n - 1]
>> label variable l_eps "One-Quarter Lagged EPS"
>> bysort firm (date): generate end = _n
>> label variable end "Firm-Quarter (for rolling regressions)"
>>
>> * the simple approach is very slow
>> rolling _b, window(16) clear: regress eps l_eps, noconstant
>>
>> * and the approach from an old Statalist thread
>> * (http://www.stata.com/statalist/archive/2009-09/msg01239.html) is
>> * equally slow
>> tempfile tempfile_rr
>> egen level_firm = group(firm)
>> summarize level_firm, meanonly
>> forvalues l = 1/`r(max)' {
>>     rolling if level_firm == `l' ///
>>         , window(16) keep(firm) ///
>>         saving(`tempfile_rr', replace) nodots ///
>>         : regress eps l_eps, noconstant
>>     merge 1:1 firm end using "`tempfile_rr'" ///
>>         , update replace nogenerate keepusing(firm end _b_l_eps)
>> }
>> label variable _b_l_eps "Earnings Persistence"
>> * ----- end code -----