Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: tab most frequently occurring
From: Richard Goldstein <[email protected]>
To: [email protected]
Date: Wed, 17 Mar 2010 12:09:55 -0400
Thanks to Nick, Maarten and Scott. Using their hints and code, I came up
with the following, which is called with two arguments (the number of
rows wanted and the variable name). Note that the tab command gives
exactly what my client wants, so I have not bothered to generalize it:
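* `1' = number of rows (modes) wanted; `2' = name of a string variable
qui clonevar `2'_2 = `2'
* flag observations whose value is one of the `1' most frequent
qui modes `2'_2, nmodes(`1') gen(flag`1')
* lump every other non-missing value into a single "all other" row
qui replace `2'_2="all other" if flag`1'==0 & `2'!=""
ta `2'_2, sort mi
drop `2'_2 flag`1'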
Rich
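
For readers without -modes- installed, the same idea (keep the n most frequent values, lump the rest into "all other", and let -tabulate- supply the Total row) can be sketched with official commands only. This is a sketch, not Rich's code: the program name -topvals- is hypothetical, it assumes a string variable, and ties in frequency are broken alphabetically, which is an arbitrary choice.

```stata
* Sketch only: -topvals- is a hypothetical name, not an existing command.
* Assumes a string variable. Shows the n most frequent values plus an
* "all other" row; -tabulate- adds the Total row itself.
capture program drop topvals
program define topvals, sortpreserve
    version 11
    args n varname
    confirm string variable `varname'
    tempvar freq first rank lumped
    quietly {
        * frequency of each distinct non-missing value
        bysort `varname': gen long `freq' = _N if `varname' != ""
        * tag one observation per distinct non-missing value
        by `varname': gen byte `first' = (_n == 1) & `varname' != ""
        * rank distinct values by descending frequency (ties: by value)
        gsort -`first' -`freq' `varname'
        gen long `rank' = _n if `first'
        * copy each value's rank to all of its observations
        bysort `varname' (`rank'): replace `rank' = `rank'[1]
        * working copy in which the less frequent values are lumped together
        gen `lumped' = `varname'
        replace `lumped' = "all other" if `rank' > `n' & `varname' != ""
        label variable `lumped' "`varname'"
    }
    tabulate `lumped', sort missing
end

* Example: the five most common manufacturers in the auto data
sysuse auto, clear
gen brand = word(make, 1)
topvals 5 brand
```

As in the code above, the sort option of -tabulate- orders rows by frequency, so the "all other" row will usually appear first rather than last.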
Nick Cox wrote:
> Thanks for the mention. -groups- is on SSC and was discussed in SJ 3-4.
>
> SJ-3-4  pr0011  . . . . .  Speaking Stata: Problems with tables, Part II
>         . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q4/03   SJ 3(4):420--439                           (no commands)
>         reviews three user-written commands (tabcount, makematrix,
>         and groups) as different approaches to tabulation problems
>
> See also -modes-.
>
> SJ-9-4  sg113_2 . . . . . . . . . . . . . . . . . .  Tabulation of modes
>         (help modes if installed) . . . . . . . . . . . . . .  N. J. Cox
>         Q4/09   SJ 9(4):652
>         update to allow the generate() option to record in an
>         indicator variable of which observations contain values
>         matching any of the modes displayed
>
> SJ-3-2  sg113_1 . . . . . . . . . . . . . . .  Software update for modes
>         (help modes if installed) . . . . . . . . . . . . . .  N. J. Cox
>         Q2/03   SJ 3(2):211
>         provides new option for specifying the number of modes to
>         be shown
>
> STB-50  sg113 . . . . . . . . . . . . . . . . . . .  Tabulation of modes
>         (help modes if installed) . . . . . . . . . . . . . .  N. J. Cox
>         7/99    pp.26--27; STB Reprints Vol 9, pp.180--181
>         provides table of most frequent observations (modes)
>
> However, to expand Scott's comment: Neither offers solutions to (b) or
> (c). Maarten has given some code. A hybrid of his code and -modes- would
> give you (b) and (c) as well.
>
> Nick
> [email protected]
>
> Scott Merryman
>
> Nick Cox's -groups- can handle (a):
>
> . sysuse auto, clear
> (1978 Automobile Data)
>
> . groups gear, select(5) order(h)
>
> +------------------------------------+
> | gear_r~o   Freq.   Percent    Cum. |
> |------------------------------------|
> |     2.73       9     12.16   12.16 |
> |     2.93       8     10.81   22.97 |
> |     3.08       7      9.46   32.43 |
> |     2.47       5      6.76   39.19 |
> |     2.41       3      4.05   43.24 |
> +------------------------------------+
>
> . groups gear, select(freq>=3) order(h)
>
> +------------------------------------+
> | gear_r~o   Freq.   Percent    Cum. |
> |------------------------------------|
> |     2.73       9     12.16   12.16 |
> |     2.93       8     10.81   22.97 |
> |     3.08       7      9.46   32.43 |
> |     2.47       5      6.76   39.19 |
> |     2.41       3      4.05   43.24 |
> |------------------------------------|
> |     3.05       3      4.05   47.30 |
> |     3.54       3      4.05   51.35 |
> |     3.78       3      4.05   55.41 |
> +------------------------------------+
>
> On Wed, Mar 17, 2010 at 8:30 AM, Richard Goldstein
> <[email protected]> wrote:
>> I want to -tabulate- a variable with many (hundreds if not thousands
>> of) different values; but, I only want to see (a) the 20 (say) most
>> frequently occurring values and then (b) I want a row for "all others"
>> and then (c) I want a grand total row
>>
>> I have searched in various ways for already existing programs but, of
>> course, I may have missed something (as far as I can see, -fre- will
>> not do what I want but I would be happy to be shown that I was wrong)
>>
>> so, two questions:
>>
>> does anyone know of an already existing program for this?
>>
>> hints, etc. for writing my own would be welcome also if anyone has any