Nick Cox wrote:
>For consistency with what?
For consistency (in the context of data management) between the way in which
Stata treats missing values when handling a variable as continuous and
the way in which it treats them when handling a variable as
categorical. When handling data as continuous, Stata treats
missing as highest-possible, but when handling as categorical, Stata treats
missing as missing. A quick example: -tabulate, generate()-
and -tabulate, missing generate()- give similar results in that values in
dummy variables for records with missing-value categories (. through .z) are
set to missing (.).
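
To make the contrast concrete, here is a minimal sketch on a made-up five-record dataset. The variable name x and the stub d are hypothetical, chosen only for illustration. It covers the continuous-style comparison and the default -tabulate, generate()- case only.

```stata
* Hypothetical toy data: three nonmissing values, one ., one .a
clear
input x
1
2
2
.
.a
end

* Continuous handling: . and .a rank above every number,
* so both missing records satisfy x > 1.
count if x > 1

* Exclude them explicitly to count only nonmissing values above 1.
count if x > 1 & !missing(x)

* Categorical handling: dummies are built for the nonmissing categories,
* and records with missing x get missing in each dummy.
tabulate x, generate(d)
list x d1 d2
```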
Now that Nick mentions it, -tabulate-'s case is not so clear-cut in that it
does serve both data management and statistical purposes, which probably
explains its choice of default behavior. And the help file for -tabulate-
is explicit as to what its -missing- option does. But perhaps it would have
been better if -tabulate, missing generate()- behaved more like -egen newvar =
group(varlist), missing-, since the context is clearly data management.
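
For comparison, a minimal sketch of the -egen- behavior referred to here, again on made-up data. The names x, g_default and g_missing are hypothetical.

```stata
* Hypothetical toy data: three nonmissing values, one ., one .a
clear
input x
1
2
2
.
.a
end

* Default: records with missing x get a missing group code.
egen g_default = group(x)

* With -missing-, missing values are treated like any other value
* and receive group codes of their own.
egen g_missing = group(x), missing

list x g_default g_missing
```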
Nick's other points are well taken, too, and I wasn't trying to hold SAS up
as an exemplar--Stata's choices for default behavior are agreeable.
Joseph Coveney