The collapse approach differs in three ways. First, if there are other
variables in the dataset, the collapse approach will omit them while the
by approach will keep them. Second, the by approach will select missing
values of var1 as the max, because missing values sort last, so any
missing values should probably be dropped before executing the command
(assuming you don't want missing to mean max). Third, the by approach
will be faster, although the speed difference would probably only be
noticeable in very large datasets.
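
As a rough illustration of the first two points, here is a minimal sketch
run on made-up data. Only year and var1 come from this thread; the
variable "other" and all of the values are invented for the example.

```stata
* Sketch only: toy data invented for illustration
clear
input year var1 other
2001 3 10
2001 7 11
2001 . 12
2002 5 13
2002 2 14
end

* by approach: drop missing var1 first, because missing sorts last
* and would otherwise be kept as the "max" for 2001
preserve
drop if missing(var1)
bysort year (var1): keep if _n == _N
* the remaining observations still carry the variable other
list
restore

* collapse approach: (max) ignores missing var1, and the result
* contains only year and var1
collapse (max) var1, by(year)
list
```

The preserve/restore pair is there only so that both approaches can be run
on the same toy data in one do-file; it is not needed in real use.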
Michael Blasnik
----- Original Message -----
From: "Rodrigo A. Alfaro" <[email protected]>
To: <[email protected]>
Sent: Sunday, September 24, 2006 10:42 PM
Subject: st: Re: Re: how to keep maximum value
> An alternative
>
> collapse (max) var1, by(year)
>
> ----- Original Message -----
> From: "Michael Blasnik" <[email protected]>
> To: <[email protected]>
> Sent: Sunday, September 24, 2006 10:01 PM
> Subject: st: Re: how to keep maximum value
>
>
>
> bysort year (var1): keep if _n==_N