Rodrigo Briceno wrote:
> Hi, I need some information: I'm using tabsort in order to obtain
> the most frequently diagnoses in a hospital. I want that Stata
> only presents me the first 10 diagnoses, and I think that I can
> do that by typing
>
> Tabsort clave1 in 1/10, but Stata showed me another thing:
>
> tabsort clave1 in 1/10
>
>      clave1 |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>        O809 |         10      100.00      100.00
> ------------+-----------------------------------
>       Total |         10      100.00
>
> Can somebody explain me what is the correct procedure to obtain
> what I want?
> It is possible to assign to this ten codes the description of the
> CIE-10 that correspond by making some formula in Stata?
-tabsort- is a user-written command downloadable
as part of the -tab_chi- package from SSC.
Incidentally, -tabsort- really is an awful kludge.
What it does is oblige -tabulate- to produce results
once, quietly; it then works on those results and
finally gets -tabulate- to emit them once more in sorted form.
The approach I discussed yesterday on Statalist is, I believe,
often much better.
First, to explain what Rodrigo got. -tabsort- is here
working in a standard Stata way, namely
    in 1/10
selects the first 10 observations and -tabsort-
then shows a table for those. Seemingly, in Rodrigo's case
they are all the same.
I think he wants -tabsort- to produce a table only
for the 10 most common entries, which is a different
problem, and one not soluble with -tabsort- alone.
That is, Rodrigo wants to select -in 1/10- _within_
his table, but that is not how -in- works.
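A minimal sketch of that point, assuming the auto data shipped with Stata (loaded with -sysuse-; the choice of -foreign- as the variable to tabulate is mine, purely for illustration). As far as I know that dataset is stored sorted by -foreign-, so the first 10 observations should all fall in one category, much as in Rodrigo's output:

```stata
* Assumption: the auto data installed with Stata, used only as an example
sysuse auto, clear

* -in 1/10- restricts the command to observations 1 to 10 of the data
* as currently sorted; it does not pick the 10 most frequent categories
tabulate foreign in 1/10

* the same 10 observations, listed, to see exactly what was tabulated
list make foreign in 1/10
```

Whatever the sort order happens to be, the total in such a table can never exceed 10, which is a quick sign that -in- has been applied to observations rather than to rows of the table.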
I don't understand what CIE-10 means.
Translating to an auto data problem: let's define
. egen manuf = head(make)
and say we want a table of the ten most common manufacturers.
First we compute frequencies directly:
. bysort manuf : gen freq = _N
Each frequency will appear repeatedly for each manuf
represented more than once, whereas we only want to
see each frequency once. One way is to tag just
one observation in each group:
. egen tag = tag(manuf)
-egen- haters would prefer
. by manuf : gen tag = _n == 1
Now we sort first on selected observations and then
on (negated) frequencies:
. gsort - tag - freq
That way, what we want is at the start of the data
set. Now generate a rank order variable
. gen order = _n
and produce our own table directly:
. tabdisp order in 1/10 if tag, c(manuf freq)
----------------------------------
    order |      manuf        freq
----------+-----------------------
        1 |       Olds           7
        2 |      Buick           7
        3 |      Chev.           6
        4 |      Pont.           6
        5 |      Merc.           6
        6 |      Plym.           5
        7 |     Datsun           4
        8 |      Dodge           4
        9 |         VW           4
       10 |        AMC           3
----------------------------------
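The steps above, gathered into one do-file sketch that should run as it stands. Two assumptions are mine rather than part of the commands shown above: the data are loaded with -sysuse auto-, and the manufacturer is taken as the first word of -make- with the string function word(), swapped in for the -egen- head() call in case that function is not installed.

```stata
* Assumption: auto data shipped with Stata
sysuse auto, clear

* first word of make as manufacturer (word() swapped in for egen head())
gen manuf = word(make, 1)

* frequency of each manufacturer, repeated on every observation in the group
bysort manuf : gen freq = _N

* tag exactly one observation per manufacturer
egen tag = tag(manuf)

* tagged observations first, and within those, highest frequency first
gsort -tag -freq

* rank order of the tagged observations
gen order = _n

* table of the ten most common manufacturers
tabdisp order in 1/10 if tag, c(manuf freq)
```

Note that the order within ties (say, two manufacturers each with 7 cars) is not guaranteed, so tied rows may come out in a different order from one run to the next.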
Let's hope Rodrigo's table is more interesting.
His code is similar, if I'm understanding properly:
bysort clave1 : gen freq = _N
egen tag = tag(clave1)
gsort - tag - freq
gen order = _n
tabdisp order in 1/10 if tag, c(clave1 freq)
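For comparison, a sketch of a different technique, not the one used above: -contract- reduces the data to one observation per distinct value together with its frequency, after which sorting and listing are direct. The toy dataset, its counts, and the codes other than O809 are made up purely for illustration; with real data only the lines from -preserve- to -restore- would be needed.

```stata
* Toy data: three diagnosis codes with made-up counts (illustration only)
clear
input str4 clave1 byte n
"O809" 5
"J189" 3
"K359" 2
end
* one observation per (made-up) case
expand n
drop n

* -contract- replaces the data in memory, so keep a copy to go back to
preserve
* one observation per code, with its count in freq
contract clave1, freq(freq)
* most frequent codes first
gsort -freq
* top 2 here; use in 1/10 for the ten commonest codes
list clave1 freq in 1/2, noobs
* back to the original observations
restore
```

With either approach, -in 1/10- will fail with an error if there are fewer than 10 distinct codes, which is why the toy example stops at 2.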
Nick