Yes, you are right; I stand corrected. Using an -if- qualifier inside the
-while- loop, as I was doing, is very inefficient. Michael Blasnik pointed
out a better way to write the -while- loop, which I have adopted.
----- Original Message -----
From: "Christopher Baum" <[email protected]>
To: <[email protected]>
Sent: Sunday, June 27, 2004 6:36 AM
Subject: st: by vs while
> In a recent posting Subhankar said
>
> The -by- command is so much faster than the -while- command...
>
> If I compare
>
> by month: regress returns factor
>
> vs.
>
> local i = 1
> while `i' <= 1000 {
> regress returns factor if `i' == month
> local i = `i' + 1
> }
>
> I find that the -by- command is at least 15-20 times faster than the
> -while- loop.
>
>
>
> The speed differential here has nothing to do with by vs while. The
> clumsy part of your code is the if i==month. Stata must examine EACH
> observation in the dataset for EVERY pass through this loop. Let us say
> that you know that there are a certain number of observations per month.
> Then replacing the if with an in first/last will speed this up immensely.
> If the number of obs per month is constant, then this could be done with
> a simple counter. If the number of obs per month varies, then it is worth
> it to pass through the dataset ONCE and set up two integer sequences
> containing the first and last obs for that month, and reference those in
> the in statement. That fix will, I imagine, remove most of the speed
> differential between these two methods.
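
A minimal sketch of the in-range idea described above, not taken from the
original posts. It assumes the variables returns, factor, and month from the
quoted example, and it is a variation on the suggestion: one pass stores the
size of each month's block, and the first and last observation numbers are
derived from that inside the loop. The variable name nobs and the local macro
names are hypothetical.

```stata
* Sketch only: assumes returns, factor, and month exist in memory.
* Sorting puts each month's observations in one contiguous block.
sort month

* One pass through the data: store the size of each month's block.
by month: generate long nobs = _N

* Walk the blocks by observation number instead of testing month
* on every observation in every pass.
local first = 1
while `first' <= _N {
    local last = `first' + nobs[`first'] - 1
    * capture noisily keeps the loop going if a block has too few
    * observations for the regression.
    capture noisily regress returns factor in `first'/`last'
    local first = `last' + 1
}

* Observations with missing month sort last and form a final block.
drop nobs
```

With the in range, each regression reads only the observations of its own
block, so the full dataset is scanned once for the setup rather than once
per month.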
>
> Bottom line: in a large (esp. panel) dataset, never use the if
> qualifier--especially when you're doing some sort of loop over chunks of
> the data. It is horribly inefficient.
>
> Kit
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>