After the flurry of crossing posts on this topic, finally put to bed
by Bill Gould's very clear reply, perhaps it is worth airing the
question of how cut SHOULD behave.
In the original version, the result of tabulating newvar after
egen newvar=cut(oldvar),at(25(5)45)
was
25-
30-
35-
40-
and as I understand it, the complaint was that the numbers
25,30,35,40,45 are described as left hand end-points so that strictly
the output of tabulate should be
25-
30-
35-
40-
45-
in which the last group contains all non-missing values of oldvar that
are 45 or more. I confess that I don't like this, as I would have to
exclude 45- from all following work with newvar. Also, why is there no
<25 group, which you might expect if there is a >=45 group?
Perhaps (as I think Jens suggested) the output
25-
30-
35-
40-45
would satisfy all parties. Only observations in [25,45) are included,
with 45 as the excluded right-hand end. All observations outside
[25,45) are coded as missing on newvar.
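For anyone who wants labels of this kind now, here is a rough sketch of
one way to get them by hand. It assumes that cut() with at(25(5)45)
codes values outside [25,45) as missing and that the icodes option
returns 0,1,2,... for the groups, so check that against your own
version. The data and the names grp and grplbl are made up for
illustration only.

```stata
* Hypothetical toy data, not from the original thread
clear
input oldvar
20
25
29.9
30
44.9
45
50
end

* icodes (assumed available) returns 0,1,2,... rather than
* the left-hand end-points 25,30,35,40
egen byte grp = cut(oldvar), at(25(5)45) icodes

* Label the groups so that the last one shows its right-hand end
label define grplbl 0 "25-" 1 "30-" 2 "35-" 3 "40-45"
label values grp grplbl

* Under the assumption above, 20, 45 and 50 lie outside [25,45)
* and should show up as missing
tabulate grp, missing
```

The labels have to be retyped whenever the cut-points change, which is
one reason it would be nicer if cut produced them itself.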
The output
[25-30)
[30-35)
[35-40)
[40-45)
would be even better, but the mathematician's convention that [25-30)
includes 25 but not 30 is not recognized in medicine and probably
not in economics either.
--
Michael Hills