Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Richard Herron <richard.c.herron@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Equivalent of Excel's COUNTIF |
Date | Wed, 16 Nov 2011 14:30:39 -0500 |
The -reshape- solution is faster than the loop solution (timer 1 vs.
timer 2 below, in seconds, with 10,000 individuals). With 1e5
individuals the loop solution was beyond my patience.

timer list
   1:      0.12 /        1 =       0.1240
   2:     13.42 /        1 =      13.4210

I had to modify our solutions a little.

* begin code
timer clear

* timer 1: -reshape- solution
timer on 1
clear
set obs 10000
set seed 10101
generate long id = _n
generate datein = runiform()*5000
generate dateout = datein + runiform()*15
* stack datein/dateout into one date variable, tagged "in" or "out"
reshape long date, i(id) j(inout) string
sort date
* +1 for each arrival, -1 for each departure
tempvar change
generate int `change' = cond(inout == "in", 1, -1)
* running count of individuals present, excluding the current one
generate int total = sum(`change') - 1
timer off 1
timer list

* timer 2: loop solution
timer on 2
clear
set obs 10000
set seed 10101
generate arrival = runiform()*5000
generate discharge = arrival + runiform()*15
gen long npatients = .
gen long _num_discharged = .
sort arrival
forvalues k=1/`=_N' {
    * running count of discharges on or before the k-th arrival
    quietly replace _num_discharged = sum( discharge <= arrival[`k'] )
    * earlier arrivals minus those already discharged
    quietly replace npatients = (_n-1) - _num_discharged[`k'] in `k'
}
timer off 2
timer list
* end code

On Wed, Nov 16, 2011 at 11:44, Stas Kolenikov <skolenik@gmail.com> wrote:
> On Wed, Nov 16, 2011 at 11:04 AM, Richard Herron
> <richard.c.herron@gmail.com> wrote:
>> Here is an alternative solution with -reshape-, -cond-, and -sum-.
>
> Cute solution!
>
>> The last two functions should be fast at any scale, but I don't have
>> enough experience with Stata to know if -reshape- is faster than a
>> loop.
>
> That's easy to check: set obs 10M instead of 10, and see what will be
> faster (and whether the -reshape- will start breaking down with large
> data sets; it might or it might not). -reshape- appears to be using a
> lot of I/O with explicit -use-, -save- and -merge- in the code; I
> thought this would have been written in C or Mata -- there's a
> reshape() function in Mata.
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
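On the I/O cost of -reshape- raised in the quoted message: the same +1/-1 running-sum idea can probably be written without -reshape- by stacking the data with -expand-. The following is only a sketch and has not been timed against either solution above. The variable names change and date, and the choice to store the dates as double, are assumptions made for this illustration and are not part of the code in this thread.

```stata
* Sketch only: +1/-1 running sum, stacking with -expand- instead of -reshape-
clear
set obs 10000
set seed 10101
generate long id = _n
* double storage is an assumption here, to make tied dates less likely
generate double datein = runiform()*5000
generate double dateout = datein + runiform()*15
* two rows per individual: first copy is the arrival, second the departure
expand 2
bysort id: generate byte change = cond(_n == 1, 1, -1)
generate double date = cond(change == 1, datein, dateout)
sort date
* running count of individuals present, excluding the current one
generate long total = sum(change) - 1
```

As in the -reshape- version, the rows with change == 1 carry the number of other individuals present at each arrival, so -keep if change == 1- would return one row per individual.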
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/