Kit and unnamed correspondent:
It will be even faster to use the -by: gen- construct, since that is
written in very fast C code. If you want an SD over a five-period
window within each firm, just do something like:
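* declare the panel (i = firm, t = period) so lag operators work
tsset i t
sort i t
* five-period mean: current value plus four lags
by i: g m=(y+l.y+l2.y+l3.y+l4.y)/5
* sum of squared deviations from that mean
by i: g v=(y-m)^2+(l.y-m)^2+(l2.y-m)^2+(l3.y-m)^2+(l4.y-m)^2
* sample SD with n-1 = 4 in the denominator
g sd=sqrt(v/4)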
for some existing variable y (the last three commands can easily be
condensed into one to further increase speed at some small cost in
readability). Or am I misunderstanding the nature of the problem?
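
As a rough sketch of that condensed form, the mean and the sum of
squared deviations can be folded into a single -generate- using the
identity sum((x-m)^2) = sum(x^2) - (sum(x))^2/5. This is an
illustration under the same assumptions as above (panel variable i,
time variable t, existing variable y), not a tested replacement. The
shortcut formula can lose precision when y is large relative to its
variation, hence the double storage type.

```stata
* Sketch only: i, t and y are the placeholder names used above.
* The /// line continuation works in a do-file, not interactively.
tsset i t
* Sum of squares minus (sum)^2/5 is the sum of squared deviations
* over the five-period window; dividing by 4 gives the sample variance.
gen double sd = sqrt(((y^2 + L.y^2 + L2.y^2 + L3.y^2 + L4.y^2) ///
    - (y + L.y + L2.y + L3.y + L4.y)^2/5)/4)
```

As in the three-step version, sd is missing wherever any of the four
lags is missing, for example in the first four periods of each firm.
If rounding pushes the variance slightly below zero for a constant
series, sqrt() returns missing rather than zero.
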
On 8/5/08, Kit Baum wrote:
> mvsumm is written in ado-file code. It probably should be rewritten to take
> advantage of Mata. Since -mvsumm- was implemented, Stata added the rolling:
> prefix. It might be faster to use -rolling- (which creates a separate
> dataset of summary statistics when combined with -summarize-) in this case.
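
For illustration, the -rolling- approach mentioned above might look
roughly like the following. The names firm and year are assumed
placeholders for the panel and time variables, and rev5ysd is used
here only as a hypothetical file name. Whether this actually runs
faster than -mvsumm- on a panel of this size is not established in
this thread.

```stata
* Sketch only: firm and year are assumed variable names.
tsset firm year
* For each firm, run -summarize- over every 5-period window and
* save r(sd) to a separate dataset, leaving the data in memory intact.
rolling sd=r(sd), window(5) saving(rev5ysd, replace): summarize Revenue
```

The saved dataset holds one observation per firm and window, with
start and end variables marking the window boundaries, so it would
still need to be merged back onto the original data.
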
>
>
> On Aug 5, 2008, at 02:33 , statalist-digest wrote:
>
> > I have calculated the standard deviation of firm-level revenue using the
> > recommended mvsumm command such as:
> >
> > mvsumm Revenue, stat(sd) win(5) gen(rev5ysd) end
> >
> > I have the 64-bit version of Stata 10 SE (and a 64-bit computer). My
> > sample size is over 1.1 million observations covering over 200,000 firms
> > over 6 years. It took my computer about 24 hours to compute this statistic
> > (although it worked just as advertised and gave me exactly the result I
> > needed).
> >
> > Does anyone have any recommendations to speed up computing time since I
> > need to compute about 8 more similar commands and don't want to tie up my
> > Stata for over a week? Or do I just need to accept the calculation time
> > since my data is so large?