I've got to say that while Neil has a good logical case, I also regard
this as a reasonable default behaviour.
My guess is that when a -collapse- or -contract- asks for combinations
that don't exist, it is usually the result of a typo (the user didn't
mean to type what was issued) or of a misconception (the user thought
there were some such observations). So the commands in question issue
error 2000 and leave your dataset as is, which is usually what you should
want to happen. I don't think an option to return results for
non-existent observations (zero counts or missing summary statistics)
would be used enough to justify changing the code, but that's a personal
view, and these are, after all, official commands. There are
work-arounds too.
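Two possible work-arounds are sketched below, using the example from the original post. This is an illustration under assumptions, not something the official commands do for you: the variable name hit is made up for the purpose, and a single observation with n equal to 0 is assumed to be the result wanted. The first approach sums an indicator, so no -if- qualifier is needed; the second traps error 2000 with -capture- and builds the zero-count result by hand.

```stata
* Approach 1: sum an indicator instead of counting under an -if- qualifier.
* -hit- is a hypothetical name; it is 1 when the condition holds and
* mpg is nonmissing, which mirrors what (count) n = mpg would count.
sysuse auto, clear
generate byte hit = price < 2000 & !missing(mpg)
collapse (sum) n = hit
list

* Approach 2: trap error 2000 and construct the zero-count row by hand.
sysuse auto, clear
capture collapse (count) n = mpg if price < 2000
if _rc == 2000 {
    * no observations satisfied the qualifier; the data are still intact
    clear
    set obs 1
    generate long n = 0
}
else if _rc {
    * any other error is passed on unchanged
    error _rc
}
list
```

With the auto data, either approach should leave a single observation with n equal to 0. The second keeps the default behaviour for every other error, so a genuine typo in a variable name would still stop the do-file.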
Nick
[email protected]
Neil Shephard
I've encountered some unexpected behaviour whilst using -collapse- on
different subsets of data. An example to demonstrate the problem....
sysuse auto, clear
sum price
collapse (count) n = mpg if price < 2000
This results in an error as there are no cars with price < 2000...
. sysuse auto, clear
(1978 Automobile Data)
. sum price
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       price |        74    6165.257    2949.496       3291      15906
. collapse (count) n = mpg if price < 2000
no observations
r(2000);
I would have expected the command to run and return a count of zero
(0) as this is a valid number of counts (to my mind at least).
There's a mention at http://www.stata.com/help.cgi?whatsnew10 that
indicates that trying to open the data editor with an if qualifier
that resulted in zero observations used to cause Stata to crash, but
that has been fixed.
Searching the archives/FAQ I haven't been able to find whether this
has been discussed before, or whether it is a reasonable behaviour (as
I say, I would have expected to get a count of zero returned).