Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: re-sorting display order after -encode-
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: re-sorting display order after -encode-
Date: Thu, 13 Mar 2014 08:21:11 +0000

These examples illustrate the key point: when you sort on frequency,
-tabulate- is by definition not sorting on values. So, any poor choice
of labelling doesn't bite.

. sysuse auto
(1978 Automobile Data)

. tab rep78, sort

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |         30       43.48       43.48
          4 |         18       26.09       69.57
          5 |         11       15.94       85.51
          2 |          8       11.59       97.10
          1 |          2        2.90      100.00
------------+-----------------------------------
      Total |         69      100.00

. tab foreign, sort

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. tab foreign, sort nola

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         52       70.27       70.27
          1 |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. gen foreign2 = 1 - foreign

. tab foreign2, sort

   foreign2 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         52       70.27       70.27
          0 |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

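The same point carries over to string responses with no natural order. What follows is only a minimal sketch on made-up data: the variable names contact and contact_n and the response strings are hypothetical, not taken from the poster's dataset. It relies on -tabulate- accepting a string variable directly, so -sort- gives descending frequency with or without a prior -encode-.

```stata
* Sketch only: contact, contact_n and the strings are hypothetical
clear
input str5 contact
"Phone"
"Email"
"Phone"
"Post"
"Phone"
"Email"
end

* -tabulate- works on the string variable as it stands;
* -sort- orders the rows by descending frequency
tabulate contact, sort

* The same holds after a default -encode-, even though the
* value labels were assigned in alphanumeric order
encode contact, generate(contact_n)
tabulate contact_n, sort
```

Because nothing in the do-file depends on which response is most frequent, the table should reorder itself as new responses arrive.
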
Nick
[email protected]
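
For the earlier question in this thread, where the desired order has to be made explicit, the advice was to define labels in advance before using -encode-. A minimal sketch of that approach, assuming the size ranges from the original post; the variable names size and size_n and the label name sizelbl are hypothetical.

```stata
* Sketch only: size, size_n and sizelbl are hypothetical names
clear
input str9 size
"21-100"
"0-20"
"1,001+"
"101-500"
"501-1,000"
"0-20"
end

* Define the value label in the desired display order first
label define sizelbl 1 "0-20" 2 "21-100" 3 "101-500" 4 "501-1,000" 5 "1,001+"

* -encode- then uses the existing label rather than alphanumeric order;
* -noextend- makes it an error if a string is not already in the label
encode size, generate(size_n) label(sizelbl) noextend

* Rows now follow the numeric codes, i.e. the order defined above
tabulate size_n
```

The -noextend- option is a design choice here: without it, -encode- would silently append any unexpected string to the end of the label.
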
On 13 March 2014 08:09, Nick Cox <[email protected]> wrote:
> Your code fragment makes no sense. If -varA- is string, then all those
> statements will evoke error messages as attempts to put numeric values
> in a string variable.
> Presumably you mean something else.
>
> -tab, sort- places values of a variable in order of descending
> frequency. What its string or numeric values are, or how they are
> labelled, is then irrelevant. The most frequent value is the most
> frequent value regardless of what the value is.
>
> -mrtab- is a user-written program from SJ, as you are asked to
> explain. It's an excellent program, which I have not used in a long
> while, but I imagine that looking at its help would answer your
> question.
>
> These questions don't have much bearing on the previous question. It
> remains true that no Stata command has the ability to infer the
> correct order for "Always" "Never" "Sometimes", so at some point the
> order you desire has to be made explicit to Stata.
>
> Nick
> [email protected]
>
>
> On 13 March 2014 02:23, Michael McCulloch <[email protected]> wrote:
>> Nick, this solution worked very well when the incoming strings are ordinal variables with a fixed sequence like "Always", "Sometimes" and "Never". I simply did this:
>> replace varA=1 if varA=="Always"
>> replace varA=2 if varA=="Sometimes"
>> replace varA=3 if varA=="Never",
>> and then proceeded to -encode-.
>>
>> By contrast, if for another variable the incoming strings do not have a logical sequence, but I wish to have -tab- or -mrtab- sorted by order of descending frequency, is there a way to do that without manually inspecting and updating my do-file?
>> I'm creating periodic reports for my colleagues as responses pile up, and it would be very useful to have this be achievable via algorithm rather than manual inspection.
>>
>> Best wishes,
>> Michael McCulloch
>>
>> --
>> Pine Street Foundation, since 1989
>> 124 Pine Street | San Anselmo | California | 94960-2674
>> P: (415) 407-1357 | F: (206) 338-2391 | http://www.PineStreetFoundation.org
>>
>> On Mar 10, 2014, at 1:26 AM, Nick Cox wrote:
>>
>>> You evidently just used the default produced by -encode- so that
>>> incoming strings were labelled according to their alphanumeric order.
>>> Thus values 1, 2, 3, 4, 5 correspond to strings "0-20" ...
>>> "501-1,000". -encode- has no notion of looking inside the strings to
>>> discern a meaning and thus a natural order, any more than it can
>>> sort "average" "bad" "good" into the correct order. You need to define
>>> labels in advance before you use -encode- or fix the problem using
>>> -recode- or some equivalent.
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 10 March 2014 03:01, Michael McCulloch <[email protected]>
>>>
>>>> I have used -encode- to add value labels from a string variable whose values are a series of numerical ranges stored as text.
>>>>
>>>> -codebook- shows that the frequencies and value labels are correct:
>>>>     Freq.   Numeric   Label
>>>>       121         1   0-20
>>>>        16         2   1,001+
>>>>        36         3   101-500
>>>>        81         4   21-100
>>>>         8         5   501-1,000
>>>>
>>>> However, I wish to display them using -tab- so that the rows are sorted on the value label.
>>>> The -encode- help file does not suggest this is possible. Is there a workaround?
>>>>
>>>> What I want to achieve is -tab- showing this:
>>>> Label       Freq
>>>> 0-20         121
>>>> 21-100        81
>>>> 101-500       36
>>>> 501-1,000      8
>>>> 1,001+        16
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/