So many trying to get the last word on a problem that won't go away.
I read Bill Gould's comments as a bit softer than they used to be, but I
may be mistaken. Bill writes: "... the problem of missing values and the
number line are inherent".
Yes, but does that make it necessary to
- let (x) evaluate to true if x is missing:
. gen y=1 if x
- let (x>100) evaluate to true if x is missing:
. gen y=1 if x>100
Obviously, some decision was needed. The decision made is perfectly
logical, but the following alternative is equally logical and much more
in line with the expectations of ordinary users:
- let (x) evaluate to false if x is missing
- let (x>100) evaluate to false if x is missing
- let (x==100) evaluate to false if x is missing
- let (x<100) evaluate to false if x is missing
- let (x==.) evaluate to true if x is . (missing)
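
For comparison, here is a minimal sketch of how the alternative outcome can be
obtained under the current rules by excluding missing values explicitly. The
toy data and the variable names y1 and y2 are made up for illustration; they
are not from the earlier posts.

```stata
* Toy data, hypothetical and for illustration only
clear
input x
50
100
150
.
end

* Current behaviour: missing sorts above every number, so the
* observation with x missing also gets y1 = 1
generate y1 = 1 if x > 100

* Explicit exclusion: only the observation with x = 150 gets y2 = 1
generate y2 = 1 if x > 100 & !missing(x)

list
```

Writing "x > 100 & x < ." would have the same effect here, because all
missing-value codes sort at or above the system missing value.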
Nick Cox asks: "What do you consider appropriate Stata behaviour for
. list x if x > 42
. regress z y if x > 42"
This is easy: I consider it appropriate to omit observations with x
missing in both situations.
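
As things stand, that result has to be requested explicitly. A sketch using the
auto dataset shipped with Stata, where rep78 stands in for x and the cutoff 3
stands in for 42 (both substitutions are mine, chosen only because rep78
contains missing values):

```stata
sysuse auto, clear

* Without the second condition, cars with rep78 missing are included
list make rep78 if rep78 > 3 & !missing(rep78)

* regress drops observations with missing values in price or mpg,
* but the if qualifier on rep78 still needs the explicit exclusion
regress price mpg if rep78 > 3 & !missing(rep78)
```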
I do not care about the internal value of missing values (this is why I
bought a statistical package), and I see no problem in the way -sort-
handles them.
Bill Gould: "The observations containing missing values need to be easy
to identify and classify". Don't the missing() function and egen's
rowmiss() and rownonmiss() do that perfectly?
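
A short sketch of what I mean, again on the auto dataset; the variable list and
the new variable names nmiss and nobs are arbitrary choices for illustration:

```stata
sysuse auto, clear

* Identify observations with a missing value in one variable
count if missing(rep78)

* Classify observations by how many of the listed variables are missing
egen nmiss = rowmiss(price mpg rep78)
egen nobs  = rownonmiss(price mpg rep78)
tabulate nmiss
```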
Last word? Hardly.
Svend
__________________________________________
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000 Aarhus C, Denmark
Phone: +45 8942 6090
Mobile: +45 2634 7796
Email: [email protected]
__________________________________________