I don't use -gsort- much, as I usually prefer to work out
my own -sort- order without wanting to re-discover
the precise idiosyncratic syntax of -gsort-. (I've
got a blind spot on -recode- for the same kind of reason.)
(That's not on a par with B*ll G???d, who
can write the equivalent of an -egen- function
several times faster than it takes to find out
whether that function already exists.)
But -- to the point -- while what Brian says is a fair
answer, it seems to me to point to a missing option on -gsort-.
-reallydowantmissingfirst- would not be very Stataish
as a name, but Fred Wolfe's want and need seemed very reasonable
to me.
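
A minimal sketch of what such an option might do, written as a small
wrapper around -sort-. The name -mfsort- is hypothetical (it is not an
existing command), and the sketch assumes a single variable sorted in
ascending order.

```stata
* Hypothetical wrapper: ascending sort with missing values placed first.
capture program drop mfsort
program define mfsort
    version 10
    syntax varname
    * temporary indicator: 0 if missing, 1 otherwise
    tempvar nonmiss
    generate byte `nonmiss' = !missing(`varlist')
    * missings (indicator 0) sort ahead of all nonmissing values
    sort `nonmiss' `varlist'
end

* Usage with the auto data from Brian's example
sysuse auto, clear
mfsort rep78
list rep78 in 1/7, sep(0)
list rep78 in l
```

The temporary indicator is dropped when the program ends, so no extra
variable is left behind in the dataset.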
Nick
Brian P. Poi wrote:
> On Thu Jul 5 06:58:30 2007, Fred Wolfe wrote:
>
> > Is there a problem with gsort (Stata 10 and below) or am I
> > misunderstanding something?
> >
> > I have a variable called -phdif-. I want the greatest value of that
> > variable to appear in the last observation. There are missing values,
> > so I use -gsort- with the -mfirst- option.
>
> ...
>
> > . gsort phdif
> > . l phdif in 1,clean
> >
> > phdif
> > 1. 1
> >
> > . l phdif in l,clean
> >
> > phdif
> > 169914. .
> >
> > The problem appears to be that missings are still last even though I
> > used the -mfirst- option.
> >
> > Any suggestions? Is this a problem or am I thinking about this
> > incorrectly?
>
>
> The "mfirst" option of -gsort- applies only to variables sorted in
> descending order.
>
> Stata stores missing values as extremely large numbers, so if a variable
> is sorted in descending order, missing values should appear first in the
> list since they are greater than all non-missing values.
>
> -gsort-, however, tries to be helpful when sorting in descending order by
> putting the missing values at the end of the list, assuming that the user
> really cares about the large real values of the variable, not the missing
> values.
>
> The "mfirst" option tells -gsort- to put the missing values first in the
> list instead of trying to be helpful by putting them at the end of the
> list.
>
> If you want to get the missing values to appear first when doing an
> ascending sort, one way to proceed is to create a 0/1 variable equal to 0
> if the variable of interest contains missing and 1 otherwise and then sort
> by the indicator variable and the variable of interest:
>
> . sysuse auto
> . generate missrep78 = cond(missing(rep78), 0, 1)
> . gsort missrep78 rep78
> . list rep78 in 1/7, sep(0)
> +-------+
> | rep78 |
> |-------|
> 1. | . |
> 2. | . |
> 3. | . |
> 4. | . |
> 5. | . |
> 6. | 1 |
> 7. | 1 |
> +-------+