When David Kantor and I wrote a tutorial on -cond()-

SJ-5-3  pr0016  Depending on conditions: a tutorial on the cond() function
        D. Kantor and N. J. Cox
        Q3/05   SJ 5(3):413--420   (no commands)

we didn't even get to the four-argument case. I agree with Nick Winter
that the on-line help for the four-argument case looks wrong.
I think what Paul wants is well (if not best) coded like this:
gen z = cond(missing(x), ., x > 5)
This has all of a sudden come to be my favourite way to code creation of
dummy/dichotomous/binary/logical/quantal/Boolean variables that could
be 1, 0 or missing. (Any other synonyms?)
It's perhaps simpler at first sight to code like this
gen z = x > 5 if x < .
in which the mapping of missings to missings is tacit, that is, if x is
missing Stata does not use the result of (x > 5) but assigns missing.
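As a quick check that the two codings agree, here is a minimal sketch. The
data are made up to resemble Paul's (x runs 1 to 8, then two missings), and
the names z1 and z2 are just for illustration.

```stata
* Toy data: x = 1..8 in the first eight observations, missing in 9 and 10
clear
set obs 10
gen x = _n in 1/8

* Explicit handling of missings via cond()
gen byte z1 = cond(missing(x), ., x > 5)

* Tacit handling of missings via the -if- qualifier
gen byte z2 = x > 5 if x < .

* The two versions should agree everywhere, missings included
assert z1 == z2
list x z1 z2
```

If the -assert- passes silently, both versions have put 0, 1 and missing in
the same places.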
But then if you have some more complicated definition involving two or
more variables you have to trap all the problems on all the variables:
gen z = x > y if x < . & y < .
This could be
gen z = x > y if !missing(x, y)
but as said I like to turn it round
gen z = cond(missing(x, y), ., (x > y))
That way it's explicit what happens with missings. And it's quite easy
to put in words:
If either x or y is missing, return missing; otherwise evaluate
(x > y).
Yet more variables can be packed into the -missing()-:
gen z = cond(missing(x, y, a, b), ., (x > y) & (a == b))
In all the above, -gen byte z- rather than -gen z- would be more careful
about storage, as a byte is all that 0, 1 and missing need.
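To see why the guard matters with several variables, here is a small sketch
on invented data (the four observations and the variable name naive are
assumptions for illustration only). Without the guard, observations with a
missing value quietly come out as 0 rather than missing.

```stata
* Toy data: observations 2 and 3 each have one missing value
clear
input x y a b
1 2 3 3
4 . 3 3
5 2 . 1
5 2 1 1
end

* Unguarded: 4 > . is false and . == 1 is false, so rows 2 and 3 get 0
gen byte naive = (x > y) & (a == b)

* Guarded: any missing among x, y, a, b maps to missing
gen byte z = cond(missing(x, y, a, b), ., (x > y) & (a == b))

list
```

Here z should be 0, ., ., 1 while naive is 0, 0, 0, 1.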
Nick
Nick Winter
It looks to me like the examples in the help for cond() are either
incorrect or misleading.
The function cond(condition,a,b,c)
returns -a- if -condition- is true; -b- if -condition- is false, and -c-
if -condition- is missing.
Note that last is "the *condition* is missing"; that is, that the
statement evaluates to missing. This is *not* the same as some part of
-condition- evaluating to missing.
So in the example where condition is "x>2", this condition evaluates to
either true or false for all observations, including cases where x=.,
because the condition ".>2" is true under Stata's handling of missing
values.
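This is easy to check interactively with -display-, without any data in
memory. The three lines below are only a sketch of the point being made:
a comparison involving missing still yields 1 or 0, and it takes a
condition that is itself missing to reach the fourth argument.

```stata
* Missing counts as larger than any number, so this comparison is true: shows 1
display . > 2

* The condition is therefore true, and the second argument is returned: shows this
display cond(. > 2, "this", "that", "missing")

* Only a condition that is itself missing reaches the fourth argument: shows missing
display cond(., "this", "that", "missing")
```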
This seems to make the following statement from the help file wrong:
"cond(a>2,"this","that","missing") = "missing" if a > ."
The only way I can think of to trigger the "missing" option would be
something like this:
clear
set obs 10
gen x=_n-1 in 1/8
gen z=cond(x,"true","false","missing")
list
     +-------------+
     | x         z |
     |-------------|
  1. | 0     false |
  2. | 1      true |
  3. | 2      true |
  4. | 3      true |
  5. | 4      true |
     |-------------|
  6. | 5      true |
  7. | 6      true |
  8. | 7      true |
  9. | .   missing |
 10. | .   missing |
     +-------------+
But once you are doing a comparison (x>2), that will always evaluate to
either "true" or "false" in Stata; never to missing.
Visintainer, Paul
> I'm not sure why the "condition" function is not coding z with 2
> missing values. If I'm reading the functions command correctly, z
> should be coded as missing:
>
> cond(a>2,"this","that","missing") = "missing" if a > .
> cond(a>2,"this","that","missing") = "this" if a > 2 and a < .
>
> Any ideas?
>
> Thanks.
>
> . gen z=cond(x>5,1,0,.)
>
> . list
>
>      +-------+
>      | x   z |
>      |-------|
>   1. | 1   0 |
>   2. | 2   0 |
>   3. | 3   0 |
>   4. | 4   0 |
>   5. | 5   0 |
>      |-------|
>   6. | 6   1 |
>   7. | 7   1 |
>   8. | 8   1 |
>   9. | .   1 |
>  10. | .   1 |
>      +-------+