Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: how can i make my loop run faster?
From
Partho Sarkar <[email protected]>
To
[email protected]
Subject
Re: st: how can i make my loop run faster?
Date
Mon, 19 Sep 2011 22:18:47 +0530
Sorry, I made a mistake in that post. -rolling- will only work on one
panel at a time, so you could do:

levelsof firm, local(firms)
foreach j of local firms {
    rolling _b if firm==`j', w(20) saving(tryroll`j'): regress y x
}
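
The loop leaves one results file per firm (tryroll1.dta, tryroll2.dta, and so on). A rough sketch of how these might be stacked into a single dataset follows. It assumes the firm ids are numeric, that the files sit in the current directory, and that the original data are still in memory when -levelsof- runs. The name tryroll_all is only illustrative. Whether each saved file already carries the firm id is worth checking, so the sketch fills it in when it is missing.

```stata
* Sketch only: stack the per-firm -rolling- results into one file.
* Assumes numeric firm ids and files named tryroll<id>.dta in the working directory.
levelsof firm, local(firms)      // run while the original data are still in memory

clear
generate double firm = .         // placeholder so firm exists even if the files lack it

foreach j of local firms {
    capture confirm file "tryroll`j'.dta"
    if _rc == 0 {
        append using "tryroll`j'.dta"
        * label any rows that arrived without a firm id
        replace firm = `j' if missing(firm)
    }
}

save tryroll_all, replace        // tryroll_all is a hypothetical file name
```

Firms for which -rolling- produced no file are simply skipped.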
Partho
On Mon, Sep 19, 2011 at 10:02 PM, Partho Sarkar
<[email protected]> wrote:
> I think the -rolling- time series command can help do this. E.g.
> once you a) tsset the panel as before, and b) sort the dataset by
> -sort panelvar datevar-
>
> rolling _b,w(20) saving(tryroll): regress y x
>
> would divide up your entire time span into overlapping windows of
> width 20, run a regression for each panel in each window, and save the
> panel ids, the start & end of each window, and the regression
> coefficients, in a Stata data file called "tryroll".
>
> See -help rolling- and the manual entry for details & examples. Given
> your special requirements, you will probably have to do this in 2 or
> more steps, and manipulate the results further to get exactly what you
> want.
>
> Partho
>
> On Mon, Sep 19, 2011 at 8:20 PM, Stefano Rossi <[email protected]> wrote:
>> Partho,
>>
>> Many thanks for this, it is very helpful.
>>
>> This raises one question, though: a crucial part of my procedure is that I need to run regressions only on 12 observations for each firm-period pair; that is, if a firm i has data back to period t=-50, say, I still have to run the regression only on the 12 observations from -1 to -12, ignoring all others. This worked well with my loop, but I do not see readily how to do this with statsby. Can you please advise?
>>
>> Best,
>>
>> Stefano
>>
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Partho Sarkar
>> Sent: Monday, September 19, 2011 1:06 AM
>> To: [email protected]
>> Subject: Re: st: how can i make my loop run faster?
>>
>> Stefano
>>
>> You don't seem to be actually making any use of the panel structure of
>> the data. Stata has very neat built-in procedures for dealing with
>> such data.
>>
>> Very briefly, 2 pointers (I am ignoring the special wrinkle in your
>> problem that you want to run 20 separate regressions for each "firm
>> i-period t" pair; you would have to adapt the procedure accordingly):
>>
>> A. I would use -tsfill, full- to fill in the time values and balance the panel.
>>
>> B. If you use tsset panelvar datevar (or xtset), where panelvar is
>> your panel identifier, and datevar the date variable, you can use:
>>
>> statsby _b _se, by(panelvar): regress y x
>>
>> to do all the regressions in one go (assuming a single regression for
>> each "firm i-period t" pair), rather than separately within a long
>> loop. You can collect the stored estimation results (e-class, since
>> -regress- is an estimation command), as with _b & _se above. See
>> -help statsby-
>>
>> Having said all that, I have never tried to run a set of regressions
>> with 30,000 firms & 200 time periods in a single run of a program!!!
>> I suspect this will be painfully slow no matter how efficient your
>> code. An obvious alternative would be to split the firms into, say, 10
>> subsets, do the regression for each subset, and put all the results
>> together.
>>
>> Hope this helps
>>
>> Partho Sarkar
>> Consultant Econometrician
>> Indicus Analytics
>> New Delhi, India
>>
>>
>> On Mon, Sep 19, 2011 at 5:22 AM, Stefano Rossi <[email protected]> wrote:
>>> Dear Statalist Users,
>>>
>>> I wonder if you can help me make a faster loop?
>>> I have an unbalanced panel of about 30,000 firms and 200 periods, and for each "firm i-period t" pair I need to run 10 regressions on the 12 observations from t-1 to t-12 of the same firm i, and another 10 regressions on the observations from t+1 to t+12 of the same firm i. I have come up with the following program, which works well as it does what it should do, but it is very slow (due to the many ifs I suspect) - here's a simplified version of it with just two regressions:
>>>
>>> forval z = 1/30000 {
>>>     levelsof period if firm==`z', local(sample)
>>>     foreach j of local sample {
>>>         local k = `j' - 13
>>>         capture reg y x if firm ==`z' & period<`j' & period>`k' & indicator==1
>>>         if _rc==0 {
>>>             predict y_hat, xb
>>>             replace before = y_hat[_n-1] if firm == `z' & period == `j'
>>>             drop y_hat
>>>         }
>>>         local w = `j' + 13
>>>         capture reg y x if firm ==`z' & period>`j' & period<`w' & indicator==1
>>>         if _rc==0 {
>>>             predict y_hat, xb
>>>             replace after = y_hat[_n+1] if firm == `z' & period == `j'
>>>             drop y_hat
>>>         }
>>>     }
>>> }
>>>
>>> Right now, it takes several minutes for each firm, so if I run it for the whole sample it would take weeks.
>>> Is there any way to make it (a lot) faster?
>>>
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
>>>