Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: RE: RE: RE: Sort Two-Way Tab
From
"Martin Weiss" <[email protected]>
To
<[email protected]>
Subject
st: RE: RE: RE: Sort Two-Way Tab
Date
Wed, 26 May 2010 18:54:07 +0200
<>
To have Stata restrict the output to the five most frequently occuring
values of mpg:
***********
sysuse auto, clear
bys mpg: gen int negcount=-_N
egen mygroups=group(negcount mpg)
tabsort mpg rep78 if mygroups<=5, nocsort
***********
How to deal with the fact that "22" and "25" are tied for fifth place in
terms of frequency for "mpg", I am not so sure...
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Mittwoch, 26. Mai 2010 18:43
To: [email protected]
Subject: st: RE: RE: Sort Two-Way Tab
-fre- (SSC) is an excellent program, but it is for one-way tables only.
This need is met by -tabsort- from -tab_chi- on SSC.
. tabsort mpg rep78, nocsort
Mileage | Repair Record 1978
(mpg) | 1 2 3 4 5 |
Total
-----------+-------------------------------------------------------+----
------
18 | 1 2 2 2 2 |
9
19 | 0 0 7 0 0 |
7
14 | 0 1 2 2 0 |
5
21 | 0 0 3 2 0 |
5
25 | 0 0 1 3 1 |
5
16 | 0 1 2 1 0 |
4
17 | 0 1 1 0 2 |
4
22 | 0 1 2 1 0 |
4
24 | 1 2 0 1 0 |
4
20 | 0 0 3 0 0 |
3
23 | 0 0 1 2 0 |
3
28 | 0 0 1 2 0 |
3
12 | 0 0 2 0 0 |
2
15 | 0 0 1 1 0 |
2
30 | 0 0 0 1 1 |
2
35 | 0 0 0 0 2 |
2
26 | 0 0 1 0 0 |
1
29 | 0 0 1 0 0 |
1
31 | 0 0 0 0 1 |
1
34 | 0 0 0 0 1 |
1
41 | 0 0 0 0 1 |
1
-----------+-------------------------------------------------------+----
------
Total | 2 8 30 18 11 |
69
Note that
. findit table sort
would have pointed to -tab_chi-. That is, you missed this because you
didn't search SSC via -findit-. -tabsort- requires Stata 8.
Nick
[email protected]
Martin Weiss
"I have searched specifically through -tab1-, -tab2-, -tabstat-,
-table-,
and -estout-, with no clear resolution."
Suspiciously absent from your list is -ssc d fre- by Ben Jann...
Kyle Longest
I am using Stata SE 11.0. I have searched through the manuals, help,
and statalist and not found a direct solution to my issue, but
apologize in advance if there is a straightforward resolution.
Problem: I would like to create a cross-tabulation of two variables
that is displayed in the sorted (by frequency) of the row variable. I
have searched specifically through -tab1-, -tab2-, -tabstat-, -table-,
and -estout-, with no clear resolution.
Description and Example: using auto.dta.
Say I was interested in the relationship between miles per gallon
(mpg) and a car's repair record (rep78). There are numerous possible
mpg's so I decide to only be concerned with the 5 most common mpg's.
[I'd like to set the analytic validity/utility aside for a moment, as
in the applied problem the desired technique is more warranted, for
various reasons, than it may be in this example.] It is easy to get a
frequency distribution of mpg in this sorted order using tab with the
sort option:
. tab mpg, sort
Mileage |
(mpg) | Freq. Percent Cum.
------------+-----------------------------------
18 | 9 12.16 12.16
19 | 8 10.81 22.97
14 | 6 8.11 31.08
21 | 5 6.76 37.84
22 | 5 6.76 44.59
25 | 5 6.76 51.35
[omitted]
But now I would like to have this same distribution (order)
cross-tabulated by the levels of rep78 (across the columns). As far as
I can tell there is no way to do this with a current command. -tab-
and its derivatives do not allow the sort option. For example, any
call to tab will produce the following, which abandons the frequency
sort order of mpg:
. tab mpg rep78
Mileage | Repair Record 1978
(mpg) | 1 2 3 4 5 | Total
-----------+--------------------------------+----------
12 | 0 0 2 0 0 | 2
14 | 0 1 2 2 0 | 5
15 | 0 0 1 1 0 | 2
16 | 0 1 2 1 0 | 4
17 | 0 1 1 0 2 | 4
18 | 1 2 2 2 2 | 9
19 | 0 0 7 0 0 | 7
20 | 0 0 3 0 0 | 3
21 | 0 0 3 2 0 | 5
[ommitted]
What I am hoping is that there is a command or method to produce the
following (along with the row/column %s which are omitted for
brevity):
. command mpg rep78, option
Mileage | Repair Record 1978
(mpg) | 1 2 3 4 5 | Total
-----------+---------------------------------------+----------
18 | 1 2 2 2 2 | 9
19 | 0 0 7 0 0 | 7
14 | 0 1 2 2 0 | 5
21 | 0 0 3 2 0 | 5
[ommitted]
I have considered some manual options (e.g., using the gen option with
-tab- and then compiling a matrix with the appropriate frequencies
and percentages) but was hoping there may be a more straightforward
solution. [I also realize that this is primarily a convenience issue
as I could feasibly copy and paste the given cross tab into the
desired order but this task can be extremely time consuming if there
sort variable has many categories].
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/