Re: st: suppressing low frequency observations in tabulation
From: Nick Cox <[email protected]>
To: [email protected]
Date: Thu, 25 Oct 2012 00:45:00 +0100

Although I retain a certain moderate affection for this program, there
are other solutions, including

. contract drugname
. sort _freq
. keep in -25/L
. tab drugname [fw=_freq], sort

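Note that -contract- replaces the data in memory with one observation per distinct value. A variation on the commands above that leaves the original dataset intact can be sketched as follows. This is a minimal sketch for a do-file, not a tested recipe for this dataset: the variable name n is hypothetical, and it assumes there are at least 25 distinct values of drugname.

```stata
* Sketch only: assumes drugname is in memory; n is a hypothetical name
preserve                      // keep the original data recoverable
contract drugname, freq(n)    // one observation per distinct drugname, count in n
gsort -n                      // sort so the most frequent values come first
keep in 1/25                  // keep the 25 largest counts; ties at the cutoff are broken arbitrarily
list drugname n, noobs        // show the top 25 as a simple listing
restore                       // bring back the full dataset
```

Sorting in descending order with -gsort- and keeping the first 25 observations is equivalent to sorting ascending and keeping the last 25; which to use is a matter of taste.
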
On Wed, Oct 24, 2012 at 11:49 PM, Nick Cox <[email protected]> wrote:
> This problem is addressed by the user-written program -modes-
> originally published in STB-50 in 1999:
>
> STB-50  sg113  Tabulation of modes
>         (help modes if installed)  N. J. Cox
>         7/99  pp.26--27; STB Reprints Vol 9, pp.180--181
>         provides table of most frequent observations (modes)
>
> The software was updated in Stata Journal 3(2) (2003) and 9(4) (2009)
> so that the most recent version can be installed after typing
>
> . net describe sg113_2, from(http://www.stata-journal.com/software/sj9-4)
>
> Nick
>
> On Wed, Oct 24, 2012 at 11:08 PM, Kevin McConeghy
> <[email protected]> wrote:
>
>> I have a large dataset, roughly 6.5 million observations, which is the
>> FDA adverse event database. The variable drugname is a string naming
>> the drug.
>>
>> . describe drugname
>>
>>               storage   display    value
>> variable name   type    format     label      variable label
>> ---------------------------------------------------------------
>> drugname        str30   %30s
>>
>> I want to create a frequency table of the top 25 drug "offenders" in
>> the database. However, I am having trouble figuring out how to get
>> Stata to run the tab drugname command without including all the
>> low-frequency observations from random drugs (which causes Stata to
>> stop the command because of "too many values"). I can't see an option
>> for this in the syntax. Any advice on how to filter out all the
>> background noise?