Ronán is right to remind us that logical (true-or-false) conditions can be
combined, and I'll say more about this in another post. But I think
this example would not bite users hard even under a three-way logic
in which such conditions could take on the values True, False,
or Missing. Thus
condition | condition | condition | ...
would surely be treated as True if at least one of the conditions
were True, and as Missing if and only if all the conditions were Missing.
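
As a rough sketch of that rule in current Stata (the data and the names
any_low and any_low3 are made up for illustration, and this is only one
way to do it), one can compute the usual two-way result and then blank
it out where every input is missing:

```stata
* Hypothetical example data; -clear- drops whatever is in memory
clear
input bmi1 bmi2 bmi3
18 .  .
22 .  .
.  .  .
22 23 18.5
end

* Ordinary Stata logic: a missing value is never < 19,
* so the all-missing observation gets 0 (false)
generate byte any_low = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19)

* Three-way variant as described above: the result is missing
* only when every one of the conditions involves a missing value
generate byte any_low3 = any_low
replace any_low3 = . if missing(bmi1) & missing(bmi2) & missing(bmi3)

list
```

The two variables differ only in the third observation, where all three
measurements are missing.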
Nick
Ronan Conroy
There's another consideration too. Logical operators are often found
in complex expressions. Sometimes you have to guard against
missing values: some expressions depend on all variables being
nonmissing, while others do not.
Consider
. gen underweight = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19)
. lab var underweight "At least one body mass index below 19"
The variable thus defined can be calculated even when one or two of
the bmi variables are missing. If that's fine by you, then Stata
should not stand in your way.
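
The reason this works is that Stata treats a numeric missing value as
larger than any number, so a comparison such as bmi1 < 19 is simply
false when bmi1 is missing. A minimal check, using literal values
rather than real data:

```stata
* Missing is not below 19, so this displays 0
display (. < 19)

* Missing counts as larger than any number, so this displays 1
display (. > 19)

* One true condition is enough: displays 1
display ((18 < 19) | (. < 19))

* No true condition: displays 0, not missing
display ((22 < 19) | (. < 19))
```

The second line is the one to watch: a condition such as bmi1 > 30 would
be true for a missing bmi1 unless it is guarded explicitly.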
The user might specify
. egen bmi_missing = rowmiss(bmi1 bmi2 bmi3)
. gen underweight = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19) if bmi_missing < 2
which would allow the expression to be evaluated if there were at
least two BMI measurements. But the choice of how many missing
measurements to tolerate has to be a scientific one.
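
One way to keep that choice visible is to put the tolerance in a local
macro. This is only a sketch: the data, the macro name max_missing and
the variable name n_missing are invented for the example.

```stata
* Hypothetical example data; -clear- drops whatever is in memory
clear
input bmi1 bmi2 bmi3
18 .  .
18 22 .
22 23 .
.  .  .
end

* The scientific decision: how many missing measurements to tolerate
local max_missing 1

* Count missing BMI values in each observation
egen byte n_missing = rowmiss(bmi1 bmi2 bmi3)

* Evaluate the condition only where enough measurements exist;
* elsewhere the new variable is left missing
generate byte underweight = (bmi1 < 19) | (bmi2 < 19) | (bmi3 < 19) ///
    if n_missing <= `max_missing'

* Cross-check the result against the number of missing measurements
tabulate underweight n_missing, missing
```

Because of the /// line continuation this needs to be run from a
do-file. With max_missing set to 1, the first and last observations end
up missing; changing the macro changes the rule in one place.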
For this reason, I think that the user is the only person who knows
under what circumstances a logical expression should evaluate to
missing. It's unfortunate that Stata, SAS and S-PLUS/R have different
ways of handling missing data in logical expressions, but I don't
think that switching to the S philosophy that x < NA evaluates to NA
is going to be any easier.