Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Overriding a loop if 0 observations using tabstat
From
Jeph Herrin <[email protected]>
To
[email protected]
Subject
Re: st: Overriding a loop if 0 observations using tabstat
Date
Tue, 27 Apr 2010 14:59:31 -0400
This is 64bit MP 2 on Windows 7 with 8G ram.
The processor is an AMD Phenom II with 3.20GHz clock speed.
cheers,
J
Martin Weiss wrote:
Jeph, out of curiosity, what kind of equipment is it that throws up these
numbers? Mine is 64 bit MP 4 on Windows 7 with 4G Ram.
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jeph Herrin
Sent: Tuesday, 27 April 2010 20:27
To: [email protected]
Subject: Re: st: Overriding a loop if 0 observations using tabstat
t=48.90; t=60.45; t=72.30. :>
Martin Weiss wrote:
t=100.28; t=207.58; t=241.55. :-)
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, 27 April 2010 19:08
To: [email protected]
Subject: RE: st: Overriding a loop if 0 observations using tabstat
Good question. I decided to do some timings to support -- or rebut -- my
feeling that -count-, which just counts, should be faster than -summarize,
meanonly-, which does other stuff too, and that in turn faster than
-summarize-, which does still more. But although that is the order, the
timings are closer than I guessed. Still, doing anything the quickest way
does no harm and may give valuable speed-up for large problems.
Here is one test script. Compare your experiences:
clear
set obs 100000
set seed 2803
gen y = runiform()
set rmsg on
qui forval i = 1/10000 {
count if y > 0.5
}
qui forval i = 1/10000 {
su y if y > 0.5, meanonly
}
qui forval i = 1/10000 {
su y if y > 0.5
}
My timings were t=187.49; 254.49; 313.38, which no doubt shows up the
Mesolithic age of my machine.
Nick
[email protected]
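A variant of the test script above, sketched with Stata's -timer- command instead of -set rmsg on-, so that the three elapsed times are collected under separate timer numbers and listed together at the end. The timer numbers 1 to 3 and the reduced iteration count of 1000 are choices made for this sketch, not part of the original script, so the absolute times will not be comparable with those quoted in the thread.

```stata
* Same data as the original test script
clear
set obs 100000
set seed 2803
gen y = runiform()

timer clear

* Timer 1: -count-
timer on 1
quietly forvalues i = 1/1000 {
    count if y > 0.5
}
timer off 1

* Timer 2: -summarize, meanonly-
timer on 2
quietly forvalues i = 1/1000 {
    summarize y if y > 0.5, meanonly
}
timer off 2

* Timer 3: plain -summarize-
timer on 3
quietly forvalues i = 1/1000 {
    summarize y if y > 0.5
}
timer off 3

* Show elapsed seconds for each timer; also left in r(t1), r(t2), r(t3)
timer list
```

Only the ordering and the ratios between the three timers are of interest; the absolute numbers depend on the machine and the Stata flavor, as the replies in this thread show.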
Martin Weiss
" As a small detail of efficiency, I would always recommend -count- rather
than -summarize- for the purpose here."
My earlier code did use -count-... What makes this thing more efficient,
though? Both are built-in, so they probably enjoy a big advantage over
everybody else anyway. So I guess the reason for your preference is the
fact that -count- calculates fewer results than -su, mean-?
Nick Cox
A secondary theme here is that this kind of code gets very difficult to
read, which makes it difficult to maintain and debug.
I note that the condition
intab1 == 1 & admit_ic == 1 & bwtg < .
is common to all the -summarize- and -tabstat- commands. That being so,
you could get that out of the way like this
preserve
keep if intab1 == 1 & admit_ic == 1 & bwtg < .
<stuff>
restore
Your -tabstat- options that are constant can be put in a little bag:
local opts stat(n mean median p25 p75 min max) col(stat) f(%9.0g) notot nosep
Now <stuff> can be rewritten
forv i = 0/5 {
    foreach y in male singlet {
        forv s = 0/1 {
            di "myga==`i' & `y'==`s'"
            qui su bwtg if myga==`i' & `y'
            if r(N) != 0 {
                tabstat bwtg if myga==`i', `opts' by(`y')
            }
        }
    }
}
Now it is easier to see what is going on. I added some cosmetic changes
too, which this horrible mailer may well reverse.
One puzzle: Did you mean to add the condition "& `y'" to the -summarize-?
It means the same as
& `y' != 0
-- which may or may not be what you want.
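To illustrate the point about a bare variable in an -if- condition, here is a small sketch on the auto dataset shipped with Stata (the dataset and the variables foreign and rep78 are chosen for this sketch, not taken from the poster's data). It also shows a related trap: a missing value is nonzero, so it counts as true.

```stata
sysuse auto, clear

* A bare variable in -if- is true whenever it is not zero
quietly count if foreign
local n_bare = r(N)
quietly count if foreign != 0
assert r(N) == `n_bare'

* Missing values are nonzero, so they pass the bare test as well
quietly count if rep78
local n_with_missing = r(N)
quietly count if rep78 != 0 & rep78 < .
display "bare test: `n_with_missing'   excluding missings: " r(N)
```

If the grouping variable can be missing, an explicit condition such as `y' == 1 is the safer thing to write.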
As a small detail of efficiency, I would always recommend -count- rather
than -summarize- for the purpose here.
Nick
[email protected]
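The -count- recommendation above is not spelled out in code in the thread. A minimal sketch of what it might look like, using the auto dataset shipped with Stata rather than the poster's data (mpg, rep78 and foreign stand in for bwtg, myga and the grouping variable; the skip message is an addition for this sketch):

```stata
sysuse auto, clear

* Constant -tabstat- options kept in one local, as suggested above
local opts stat(n mean) col(stat) f(%9.0g) notot nosep

forvalues i = 0/5 {
    * -count- only counts; it leaves the number of matches in r(N)
    quietly count if rep78 == `i' & mpg < .
    if r(N) > 0 {
        tabstat mpg if rep78 == `i', `opts' by(foreign)
    }
    else {
        display "rep78 == `i': no observations, skipping"
    }
}
```

In the auto data no car has rep78 equal to 0, so the first pass through the loop takes the skip branch instead of stopping with a "no observations" error from -tabstat-.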
sara khan
Many thanks Maarten for your advice. I managed to resolve it with the
following code:
forv i=0/5 {
    foreach y in male singlet {
        forv s=0/1 {
            di "myga==`i' & `y'==`s'"
            qui su bwtg if myga==`i' & intab1==1 & admit_ic==1 & bwtg<. & `y'
            if r(N)!=0 {
                tabstat bwtg if myga==`i' & intab1==1 & admit_ic==1 & bwtg<., ///
                    stat(n mean median p25 p75 min max) by(`y') col(stat) ///
                    f(%9.0g) notot nosep
            }
        }
    }
}
On Tue, Apr 27, 2010 at 12:56 PM, Maarten buis <[email protected]> wrote:
--- On Tue, 27/4/10, sara khan wrote:
I just tried this but the output only shows the display
results and nothing from tabstat.
<snip>
-capture- works for me:
*----------------- begin example ---------------------
sysuse auto, clear
forvalues i = 0/5 {
    capture noisily tabstat mpg if rep78== `i', ///
        s(n mean) by(foreign)
}
*-------------------- end example -------------------
In order to debug your loop I would build it step by step:
step 1: no looping, no locals, no -if-, just a single -tabstat- command
step 2: add -capture noisily-
step 3: add some -if- conditions
step 4: build a single loop (e.g. over i but not over y)
etc. etc.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/