Re: st: Dataset of means from the three largest values of a group
Nick Cox <[email protected]>
[email protected]
Mon, 28 Nov 2011 20:39:57 +0000
Someone might want ideas on how to handle the missing values that I
previously assumed not to exist.
Here's one way:
gen ismissing = missing(trade)
bysort ismissing country year (trade) : gen tag = (_N - _n) < 3
by ismissing country year : egen meanhighest = mean(trade / tag)
bysort country year (meanhighest) : replace meanhighest = meanhighest[1]
drop ismissing
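
To see what the steps above do, here is a minimal sketch on toy data. The first three rows are taken from the example in the original question; the HUN and ITA rows for 1999 (including the missing value) and the layout of the checks are made up for illustration only.

```stata
* Toy data: first three rows from the question, the rest hypothetical
clear
input str3 country int year str3 partner double trade
"AFG" 1999 "USA" 12345
"AFG" 1999 "DEU" 9875
"AFG" 1999 "FRA" 25487
"AFG" 1999 "HUN" 100
"AFG" 1999 "ITA" .
"AFG" 2000 "USA" 5454
"AFG" 2000 "HUN" 5454
end

* Separate missing from nonmissing so that missings cannot be tagged as "largest"
gen ismissing = missing(trade)
bysort ismissing country year (trade) : gen tag = (_N - _n) < 3

* trade / 0 evaluates to missing, so mean() ignores the untagged observations
by ismissing country year : egen meanhighest = mean(trade / tag)

* Nonmissing values sort first, so [1] spreads the group mean to rows with missing trade
bysort country year (meanhighest) : replace meanhighest = meanhighest[1]
drop ismissing

list country year partner trade tag meanhighest, sepby(country year)
```

For AFG 1999 the mean is taken over 25487, 12345 and 9875 only; the value 100 is untagged and the missing value is set aside. For AFG 2000 both available values are used, which is the "up to 3" behaviour.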
On Mon, Nov 28, 2011 at 8:25 PM, Nick Cox <[email protected]> wrote:
> I assume no missing values for -trade-. "3" here evidently means "up to 3".
> bysort country year (trade) : gen tag = (_N - _n) < 3
> by country year : egen meanhighest = mean(trade / tag)
> On why division by zero can be useful, see
> Cox, N. J. 2011. Speaking Stata: Compared with ...
> Stata Journal 11(2): 305-314 (dm0055),
> which reviews techniques for relating values to values in other
> observations.
> The second highest (silver medallist) is more robust-resistant to outliers:
> bysort country year (trade) : gen silver = trade[_N-1]
> Nick
> On Mon, Nov 28, 2011 at 8:03 PM, Iulian Ihnatov <[email protected]> wrote:
>> I have the following dataset for the period of 1999 to 2010:
>> country  year  partner  trade
>> AFG      1999  USA      12345
>> AFG      1999  DEU       9875
>> AFG      1999  FRA      25487
>> ........................
>> AFG      2000  USA       5454
>> AFG      2000  HUN       5454
>> ........................
>> HUN      1999  DEU      58744
>> ........................
>> I need to create a dataset of means of the "trade" variable, grouped by
>> country and year, but only for the three largest observations of each group.
>> I may use - collapse (mean) trade, by(country year) -, but I don't know how
>> to isolate the largest three values from each group (in some years, there
>> are only 1 or 2 observations available, in others more than 10). Any help
>> would be highly appreciated.
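
If the goal is a reduced dataset with one observation per country and year, as the question asks, one possible route is to keep only the (up to) three largest values and then apply the -collapse- mentioned in the question. This is a sketch under assumptions, not part of the replies above: the HUN and ITA rows for 1999 are invented, and the names meanhighest and nused are only illustrative. Note that it destroys the original data in memory, so work on a copy or use -preserve- first.

```stata
* Toy data: first three rows from the question, the rest hypothetical
clear
input str3 country int year str3 partner double trade
"AFG" 1999 "USA" 12345
"AFG" 1999 "DEU" 9875
"AFG" 1999 "FRA" 25487
"AFG" 1999 "HUN" 100
"AFG" 1999 "ITA" .
"AFG" 2000 "USA" 5454
"AFG" 2000 "HUN" 5454
end

* Missing values sort last, so remove them before picking the largest
drop if missing(trade)

* Within each country-year, keep the last (largest) three observations at most
bysort country year (trade) : keep if (_N - _n) < 3

* One observation per country-year: the mean and how many values went into it
collapse (mean) meanhighest = trade (count) nused = trade, by(country year)

list, sepby(country)
```

Keeping the count alongside the mean makes it easy to see which groups had fewer than three values available.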