From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: suppressing low frequency observations in tabulation
Date: Wed, 24 Oct 2012 23:49:09 +0100

This problem is addressed by the user-written program -modes-, originally published in STB-50 in 1999:

STB-50  sg113 . . . . . . . . . . . . . . . . . . . . . . Tabulation of modes
        (help modes if installed) . . . . . . . . . . . . . . . . . N. J. Cox
        7/99    pp.26--27; STB Reprints Vol 9, pp.180--181
        provides table of most frequent observations (modes)

The software was updated in Stata Journal 3(2) (2003) and 9(4) (2009), so the most recent version can be installed after typing

. net describe sg113_2, from(http://www.stata-journal.com/software/sj9-4)

Nick

On Wed, Oct 24, 2012 at 11:08 PM, Kevin McConeghy <kevinmcconeghy@gmail.com> wrote:
> I have a large dataset, roughly 6.5 million observations, which is the FDA
> adverse event database. Variable drugname is the string describing the drug.
>
> . describe drugname
>
>               storage  display     value
> variable name   type   format      label      variable label
> ---------------------------------------------------------------
> drugname        str30  %30s
>
> I want to create a frequency table of the top 25 drug "offenders" in
> the database, however I am having trouble figuring out how to get
> Stata to perform the tab drugname command without including all the
> low frequency observations from random drugs (which causes Stata to
> stop the command because "too many values"). I can't see an option for
> this in the syntax. Any advice on how to filter out all the background
> noise for this?
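
A minimal sketch of how the suggestion in the reply might be applied to the drugname question. The -net install- line uses the package name and URL given in the reply. The nmodes() option is an assumption about the -modes- syntax that should be checked against -help modes- after installing. The second half is an alternative, not part of the reply, that uses only official commands; the variable name n is made up for illustration.

```stata
* Install the Stata Journal 9(4) version of -modes-
net install sg113_2, from(http://www.stata-journal.com/software/sj9-4)

* Assumption: nmodes(#) restricts the table to the # most frequent
* values; confirm with -help modes- before relying on it
modes drugname, nmodes(25)

* Alternative with official commands only:
* collapse to one observation per distinct drugname with its frequency,
* sort by descending frequency, and list the first 25
preserve
contract drugname, freq(n)
gsort -n
list drugname n in 1/25, noobs
restore
```

The -contract- route avoids the "too many values" limit of -tabulate- because it never builds a table; it replaces the data in memory with one row per distinct value, which is why it is wrapped in -preserve- and -restore-. Ties at the 25th position are cut arbitrarily by -list in 1/25-, and that line fails if there are fewer than 25 distinct values.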