Nick Cox wrote:
Well, when I look at the documentation for these two,
neither mentions anything about returning sorted values.
Not so.
-egen-
======
See [R] egen p.325:
"The order of the groups is that of the sort order of varlist."
An example follows.
Point conceded, though reluctantly - sticking that into an Example
(not preceding an example as you suggest) is a bit subtle; perhaps
I'm naive to expect functionality to be fully specified in the
Description?
-levels-
========
See help for -levels- (my emphasis):
Sorry, here I was sloppy. I discounted -levels- because I
didn't think it applied to my non-categorical data, not because
it doesn't sort (the documentation is very clear there).
Exactly. The -sort- gets things into the right order
for calculation. Nothing in the following code you quote
undoes the -sort-.
However, -egen- as a whole is -sortpreserve-, i.e. after -egen-
has finished it restores the sort order when it started.
If you want the data -sort-ed, then you must do that explicitly.
This is part of the same Stata principle: only a command which
is designed to -sort- the data will change the sort order of your
data.
(-sort- is the obvious example, but -tsset- is another.)
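
A minimal sketch of the -sortpreserve- point, assuming the auto dataset shipped with Stata; the variable name grp is made up for illustration. If -egen- restores the sort order it started with, the two displayed lines should agree:

```stata
* Illustrative only: compare the recorded sort order before and after -egen-
sysuse auto, clear
sort price
local before : sortedby
egen grp = group(foreign rep78)
local after : sortedby
display "sorted by before -egen-: `before'"
display "sorted by after  -egen-: `after'"
```

To leave the data ordered by group, an explicit -sort grp- would have to follow.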
Yes, I know about the Stata principle, and that -egen- and
-gen- respect it. But recall that my coding question/uncertainty was:
is
by `varlist' [,sort] : gen var1 = exp
"a command which is designed to -sort- the data"? Intuitively the
answer is no, but I'm not embarrassed about being unsure whether a
command which *requires* either pre-sorting or a [sort] option changes
the sort order. For instance, why does -by- insist on the user supplying
the sort? "Surely", one can't but think, because you can't do a -by-
without changing the sort order; otherwise, -by- would just do
a -sortpreserve-, do its job, and restore the original sort order.
It's always seemed a bit of a Stata anomaly.
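
The same kind of check can be applied to the -by ..., sort:- form in question. This is a sketch under stated assumptions (the auto dataset shipped with Stata, and a made-up variable name n_in_group), not a statement of what the documentation promises:

```stata
* Illustrative only: does -by varlist, sort:- leave the data re-sorted?
sysuse auto, clear
sort price
local before : sortedby
by foreign, sort: generate n_in_group = _n
local after : sortedby
display "sorted by before -by, sort-: `before'"
display "sorted by after  -by, sort-: `after'"
```

As far as I understand it, the sort option asks -by- to perform the -sort- itself, so the second line should report foreign, not price; running the check is the way to confirm it.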
It seems to me that all these doubts are answerable by reference
to the documentation or by understanding what the code, which is
open, actually does.
I agree. It also seems to me that the documentation could be more
explicit about both -egen-'s -rank- and -group- functions in their
formal Descriptions. As for understanding what the code does, your
contributions have, as always, been very helpful.
cheers,
Jeph