Re: st: speed question: -collapse- vs -egen-
From: "Michael Blasnik" <[email protected]>
To: <[email protected]>
Subject: Re: st: speed question: -collapse- vs -egen-
Date: Sat, 26 Apr 2008 10:15:54 -0400
...
You can gain some speed in regular Stata code by not generating a separate
variable just to count the number of nonmissing values. Because sum() is a
running sum, the last observation in each group holds the group mean:
    bysort rep78: gen mean=sum(price)/sum(price<.)
    by rep78: keep if _n==_N
On my machine, this reduces the run time of Stas's corrected code from
17.3 to 13.8 s.
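As a minimal sketch of why the two lines above work, the do-file below compares the running-sum result with -collapse- on the unexpanded auto.dta. Dropping observations with missing rep78 and the names ref and meanprice are assumptions made for this check, not part of the code being timed in this thread.

```stata
* Sketch: check the running-sum approach against -collapse- on auto.dta
sysuse auto, clear
drop if missing(rep78)      // assumption: ignore cars with no repair record

* Reference result from -collapse-, saved to a temporary file
preserve
collapse (mean) meanprice = price, by(rep78)
tempfile ref
save `ref'
restore

* sum() is a running sum, so within each group the last observation
* holds (total of price) / (count of nonmissing price)
bysort rep78: gen mean = sum(price)/sum(price<.)
by rep78: keep if _n==_N
keep rep78 mean

* Both results should agree up to float precision
merge 1:1 rep78 using `ref', assert(match) nogenerate
assert reldif(mean, meanprice) < 1e-6
list rep78 mean meanprice, noobs
```

If either -assert- fails, the do-file stops with an error, so reaching the final -list- means the two approaches agree on this data.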
Michael Blasnik
----- Original Message -----
From: "Sergiy Radyakin" <[email protected]>
To: <[email protected]>
Sent: Friday, April 25, 2008 9:12 PM
Subject: Re: st: speed question: -collapse- vs -egen-
Hello All!
Jeph asked about an efficient way of creating a dataset with the means
of one variable over the categories of another variable. He suggested
two possible solutions, and Stas added a third.
Below I report the performance of each of these methods and compare
them with a fourth: a plugin.
I use an expanded version of auto.dta and tabulate the mean of price by
the levels of rep78.
1. All methods resulted in the following table of results*
    meanprice   rep78
       4564.5       1
     5967.625       2
     6429.233       3
       6071.5       4
         5913       5
2. The timing is as follows (Stata SE, Windows Server 2003, 32-bit):
    1: 33.80 / 1 = 33.7960
    2: 31.22 / 1 = 31.2190
    3: 21.33 / 1 = 21.3280
    4:  5.58 / 1 =  5.5780
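The lines above look like the output of -timer list- (timer number, total time, number of runs, average). A minimal sketch of such a timing harness follows, under stated assumptions: the expansion factor of 1000, the timer numbers 11 to 13, and the exact form of each method are illustrative and are not taken from the snipped code, so the timers do not correspond to methods 1 to 4 above. The plugin is not reproduced.

```stata
* Sketch of a timing harness; all specifics here are assumptions
sysuse auto, clear
expand 1000                 // hypothetical expansion factor
tempfile big
save `big'

timer clear

* Timer 11: -collapse-
use `big', clear
timer on 11
collapse (mean) meanprice = price, by(rep78)
timer off 11

* Timer 12: -egen-, then keep one row per group
use `big', clear
timer on 12
egen meanprice = mean(price), by(rep78)
bysort rep78: keep if _n == 1
timer off 12

* Timer 13: running sums, keeping the last row per group
use `big', clear
timer on 13
bysort rep78: gen meanprice = sum(price)/sum(price<.)
by rep78: keep if _n == _N
timer off 13

* Prints one line per timer: total time / number of runs = average
timer list
```

Reloading the saved data before each method keeps the comparison fair, since every method reduces the dataset to one row per group.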
<snip>