Dear All,
in a different thread Dan Blanchette asked about the interaction of -in-
and -if-. I have asked myself a slightly different question: can an
if-condition always substitute for an in-condition? E.g.,
instead of "in #A/#B" one can type "if inrange(_n,#A,#B)".
There seems to be a bug in -use- that gets confused by such a
condition. My colleague has suggested that this might happen because
Stata evaluates _n against the dataset currently in memory, but
evaluates the if-condition against the dataset being loaded. However,
I was able to come up with an example where it gets confused regardless
of the dataset currently in memory. It seems that the "greater than"
condition is not evaluated properly in this case.
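*** bug with use ... if F(_n)
*** N(auto.dta)=74
sysuse auto, clear
* remember the full path of auto.dta as returned by -sysuse-
local fullauto `r(fn)'
* first half, selected with -in-
use `"`fullauto'"' in 1/37, clear
count
assert (_N==37)
* second half, selected with -in-
use `"`fullauto'"' in 38/74, clear
count
assert (_N==37)
* first half, selected with -if- on _n
use `"`fullauto'"' if _n<=37, clear
count
assert (_N==37)
* second half, selected with -if- on _n: the "greater than" case
use `"`fullauto'"' if _n>37, clear
count
assert (_N==37)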
It is hard to understand what Stata takes _n to be while loading
data, but it is definitely not the observation number.
Strangely, the condition inrange(_n,1,20) loads 20 (twenty)
observations, but inrange(_n,2,20) loads 0 (zero).
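A possible workaround, as a minimal sketch only: load the whole file first and apply the condition on _n afterwards, once the data are in memory and _n is the observation number again. This assumes the file fits in memory in full.

```stata
* Workaround sketch: apply the _n condition after loading, not during -use-
sysuse auto, clear
local fullauto `r(fn)'
use `"`fullauto'"', clear
* with the data in memory, _n is the observation number
keep if inrange(_n,2,20)
count
assert (_N==19)
```

This of course defeats the purpose when the file is too large to load in full; in that case -in- is the way to slice.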
So if you ever try to work with large datasets in smaller portions,
slice them with an in-condition, not an if-condition!
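For example, a loop along these lines would read a file slice by slice with -in-. This is only a sketch: the slice size of 20 and the local names (chunk, first, last) are made up for illustration, and it assumes that -describe using- leaves the number of observations in r(N).

```stata
* Sketch: read auto.dta in slices of at most 20 observations using -in-
sysuse auto, clear
local fullauto `r(fn)'
* number of observations in the file on disk (assumed to be in r(N))
describe using `"`fullauto'"', short
local N = r(N)
local chunk = 20   // hypothetical slice size
forvalues first = 1(`chunk')`N' {
    * the last slice may be shorter than the others
    local last = min(`first' + `chunk' - 1, `N')
    use `"`fullauto'"' in `first'/`last', clear
    * ... work on this slice here ...
    display "loaded observations `first' to `last': " _N
}
```

With N(auto.dta)=74 the slices are 1/20, 21/40, 41/60 and 61/74.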
Stata MP for Windows, v10.1.551 born 02 Feb 2009 (currently the latest.
This recent update brings some very welcome changes: thank you!)
Best regards, Sergiy Radyakin