From: "Svend Juul" <[email protected]>
Louis wrote:
I'm not clear about exactly what -missing- does. An example will clarify
my concern. I have a dataset containing 11 variables (the first is -clust-
and the last is -s11aq13-). One of the variables is -trexpcd-, and the
total number of observations is 4872. I gave the following commands and
had the shown output:
count if mi(trexpcd)
4649
count if mi(clust-s11aq13)
82
My understanding of the online -help- is that -missing- evaluates the
number of observations for which any of the arguments is missing. So, for
the second command, since -trexpcd- is one of the arguments, I expected
the result to be a number which is at least equal to 4649.
--------------------------------------------------------------
Hmm! That does look a bit strange. I tried this:
. sysuse auto
(1978 Automobile Data)
. count if missing(rep78)
5
. count if missing(mpg, rep78, headroom)
5
. count if missing(rep78-headroom)
5
. count if missing(mpg-headroom)
0
. count if missing(mpg-rep78)
5
According to the documentation, -missing()- takes a list of arguments
separated by commas. The arguments may be variable names, but not a
variable list, so mpg-headroom ought to be illegal as a variable range.
Apparently missing() accepted mpg-headroom, but examined only mpg and
headroom, not the intervening rep78. I checked this in both Stata 8.2
and 9.1; both versions behave the same way.
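One reading that fits the output above: the arguments to missing() are expressions, and inside an expression the hyphen is the subtraction operator, not a variable-range separator. On that reading, missing(mpg-headroom) asks whether the difference mpg minus headroom is missing, which would explain why rep78 is ignored. A minimal sketch to check this on the auto data (the variable name diff is made up for the illustration):

```stata
sysuse auto, clear

* Assumption being tested: in an expression, mpg-headroom is arithmetic,
* i.e. the difference between the two variables.
generate diff = mpg - headroom
count if missing(diff)

* A difference is missing only when one of its operands is missing,
* so this should agree with the count above; rep78 is never examined.
count if missing(mpg) | missing(headroom)

* The comma-separated form checks each listed variable.
count if missing(mpg, rep78, headroom)
```

If this reading is right, the first two counts should agree with the 0 reported for missing(mpg-headroom), and the third with the 5 reported for missing(mpg, rep78, headroom). It would also account for Louis's result: mi(clust-s11aq13) would count observations where clust minus s11aq13 is missing, without looking at trexpcd.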
-egen-'s rmiss() function (rowmiss() in version 9) takes a variable list
as its argument, and that may be what you need.
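A short sketch of that approach on the auto data, assuming Stata 9 or later so that the function is called rowmiss() (use rmiss() in version 8). The variable name nmiss is made up for the illustration:

```stata
sysuse auto, clear

* rowmiss() takes a varlist, so mpg-headroom here is a variable range
* covering mpg, rep78 and headroom (their order in the dataset).
egen nmiss = rowmiss(mpg-headroom)

* Observations with at least one missing value among those variables.
count if nmiss > 0
```

This should give the same count as missing(mpg, rep78, headroom) above. For Louis's data the corresponding call would be rowmiss(clust-s11aq13).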
Confused? I am a bit myself.
Svend