
Re: st: speed question: -collapse- vs -egen-

From	Jeph Herrin <[email protected]>
To	[email protected]
Subject	Re: st: speed question: -collapse- vs -egen-
Date	Sat, 26 Apr 2008 11:08:42 -0400

Thanks to Stas, Sergiy and Michael for the tips on
speeding things up. Sergiy's suggestion of a plugin
runs into the limitation of plugins that Kit points
out: I'm running 64-bit Stata on 64-bit XP, so
his plugin won't help me.

In fact, since I develop a lot of code on my 32-bit desktop
before running on the 64-bit machine, it's hardly worth
my while to write my own plugins, even as a former C
programmer.

However, Kit has inspired me to try my hand at a Mata
solution.

thanks to all,
Jeph




Michael Blasnik wrote:

...

You can gain some speed in regular Stata code by not generating a separate variable just to count the number of non-missings:

bysort rep78: gen mean=sum(price)/sum(price<.)
by rep78: keep if _n==_N

On my machine, this reduces the time required for the corrected Stas code from 17.3 to 13.8 s.

Michael Blasnik
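
A note on why the two lines above work: in Stata, sum() returns a running sum and treats missing values as zero, while (price<.) is 1 for nonmissing values and 0 otherwise. Within each by-group, the last observation therefore holds the group total divided by the nonmissing count, which is the group mean, and keeping only _n==_N leaves one row per group. A minimal sketch for checking this against -collapse- on the shipped auto.dta follows. The variable name meanprice and the double storage type are assumptions of this sketch, not part of the posted code.

```stata
* Sketch only: compare -collapse- with the running-sum approach on auto.dta.
sysuse auto, clear

* Reference result from -collapse-
preserve
collapse (mean) meanprice=price, by(rep78)
list rep78 meanprice, clean
restore

* Running-sum approach: sum() accumulates within each by-group and
* treats missing values as zero; (price<.) counts nonmissing values.
* "double" is an assumption made here to match -collapse- precision.
bysort rep78: gen double meanprice = sum(price)/sum(price<.)

* The last observation of each group holds total/count, i.e. the mean.
by rep78: keep if _n==_N
keep rep78 meanprice
list rep78 meanprice, clean
```

Both listings should show one row per value of rep78. If every price in a group were missing, the division would be 0/0 and the result for that group would be missing.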

----- Original Message -----
From: "Sergiy Radyakin" <[email protected]>
To: <[email protected]>
Sent: Friday, April 25, 2008 9:12 PM
Subject: Re: st: speed question: -collapse- vs -egen-

Hello All!

Jeph has asked about an efficient way of creating a dataset with means
of one variable over the categories of another variable. He suggested
two possible solutions and Stas added a third one.

Below I report the performance of each of these methods and compare
them with a fourth: a plugin.

I use an expanded version of auto.dta and tabulate the mean of {price}
by levels of {rep78}.

1. All methods resulted in the following table of results*

   meanprice   rep78
      4564.5       1
    5967.625       2
    6429.233       3
      6071.5       4
        5913       5

2. The timing is as follows (Stata SE, Windows Server 2003, 32-bit)

  1:     33.80 /        1 =      33.7960
  2:     31.22 /        1 =      31.2190
  3:     21.33 /        1 =      21.3280
  4:      5.58 /        1 =       5.5780
<snip>
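
The timing lines above have the layout printed by Stata's -timer list- (total seconds / number of runs = average). A rough sketch of such a harness is given below. It is not the code that was snipped: the expansion factor of 1000 is an arbitrary assumption, since the thread does not state one, only two approaches are timed, and the timer numbers here do not correspond to the method numbers in the table above.

```stata
* Sketch only: time two ways of computing group means on an enlarged auto.dta.
* Assumption: the expansion factor (1000) is arbitrary, not taken from the thread.
sysuse auto, clear
expand 1000

timer clear

* Timer 1: -collapse-
preserve
timer on 1
collapse (mean) meanprice=price, by(rep78)
timer off 1
restore

* Timer 2: running sums within by-groups, keeping the last row of each group
preserve
timer on 2
bysort rep78: gen double meanprice = sum(price)/sum(price<.)
by rep78: keep if _n==_N
timer off 2
restore

* Prints one line per timer: total seconds / runs = average
timer list
```

The -preserve- and -restore- calls sit outside the timed regions so that each method starts from the same data without the copy being counted in its time.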

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

