Nick Winter replied to Alan Feiveson
> >
> > I often just type "list in 101/123" or some such other numbers with
> > limited range - even though I may have 50000 observations. I find this
> > less bothersome than going to the editor and searching for
> > observations 101/123. Will list8 also be slow if you give it a limited
> > range even though there are 10000's of observations?
>
> I just tried this out, and the delay before the list begins seems to be
> related to the number of observations to be listed. Thus,
>
> . list in 50000/50010
>
> takes 0.01 seconds in my test database, whereas
>
> . list
>
> takes over 4 seconds to begin.
>
> The following takes about 0.17 seconds:
>
> . list if inrange(_n,20000,20035)
>
> compared with:
>
> . generate rand=uniform()
> . list if rand>.9999
>
> which takes about 0.18 seconds to list in my sample dataset, and
> produces 35 observations. So there appears to be no penalty (beyond
> that associated with -if-) for non-contiguous observations...the delay
> in listing is purely a function of the number of observations to be
> listed. This makes sense, given the cause of the delay, as
> explained earlier on this thread.
This points up a general principle, explored
in some detail by Michael Blasnik a while
ago, that

    ... in 101/123

is typically (perhaps always) going to be much faster
than the equivalent with -if-, so -- if you have a
choice -- go for -in-. Forget your human intelligence, which
tells you that

    ... if inrange(_n, 101, 123)

is obviously false outside the range
specified. Stata, I believe, has no sense
of "obvious" in this case: it tests the -if-
condition for every observation, whereas -in-
goes straight to the range requested.
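
For anyone who wants to check this on their own machine, here is a
minimal sketch using -timer-. The dataset is made up for illustration
(50000 observations and a variable x, neither from the thread), and
-count- stands in for -list- so that only the cost of the qualifier is
being timed. It assumes a Stata recent enough to have runiform(); in
older versions uniform() plays the same role. Timings will differ from
machine to machine, so none are quoted here.

```stata
* Hypothetical test data: 50000 observations, one random variable
clear
set obs 50000
generate x = runiform()

* Timer 1 accumulates time for -in-, timer 2 for the equivalent -if-
timer clear
forvalues i = 1/200 {
    timer on 1
    quietly count in 101/123
    timer off 1

    timer on 2
    quietly count if inrange(_n, 101, 123)
    timer off 2
}

* Show accumulated seconds for each timer
timer list
```

Both commands select the same 23 observations; the comparison is only
about how long Stata takes to find them.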
Nick
[email protected]