Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: statalist-digest V4 #4961: Re: st: Value labels won't show on box plot axis
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: statalist-digest V4 #4961: Re: st: Value labels won't show on box plot axis
Date: Fri, 2 Aug 2013 12:52:24 +0100

Whether you can (validly) take means of ordinal scales (if that is
what we have here) is arguably not to the point. Boxplots show
medians, quartiles, etc. and not means. There's a judgement call on
quite how _useful_ those measures are with ordinal scales but they are
well defined.
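
As a rough illustration of the point that these measures are well defined for an ordinal variable, the sketch below tabulates the median and quartiles of rep78 in the auto data shipped with Stata. The choice of rep78 and of foreign as the grouping variable is only an assumption made for the example; it is not the data discussed in the thread.

```stata
* Sketch only: order statistics for an ordinal-style variable.
* rep78 (repair record, 1-5) stands in for an opinion scale here.
sysuse auto, clear

* Lower quartile, median and upper quartile within each group
tabstat rep78, by(foreign) statistics(p25 p50 p75)
```

With so few distinct values, several of these statistics will often coincide, which is the behaviour the box plots in this thread display.
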
I would usually prefer different plots myself for ordinal scales but
(for once) was holding off on general pontification.
Nick
[email protected]
On 2 August 2013 12:14, Allan Reese (Cefas) <[email protected]> wrote:
> On 1 August 2013 10:46, Walsh, Lee <[email protected]> wrote:
>>> I am box plotting a variable grouped by another.
>>> I am telling Stata to use the value labels for the y axis
>>> but it insists on showing the numerical values.
>>>
>>> graph box response, over(statementNum) ylabel(1(1)7, valuelabel)
>
> On 1 Aug 2013 11:00:00 +0100, Nick Cox <[email protected]> agreed
>> This didn't work either in my experiments. Here's a replicable example
>> and a work-around:
>
>> . sysuse auto
>> . graph box foreign, over(rep78) yla(0 1, valuelabel)
>
> [and built a list of re-labels in a macro named -call-]
> ...
>> forval i = 1/7 {
>>     local call `call' `i' "`: label (response) `i''"
>> }
> [which was then used in the -graph- command]
> . graph box response, over(statementNum) yla(`call')
> -------------------------------------------------------
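
The work-around recapped above can be sketched end to end on the auto data, so that it runs without Lee's variables. This is a minimal sketch, assuming the variable on the y axis (here foreign, coded 0 and 1) has a value label attached; the macro name call follows the thread.

```stata
* Sketch only: build value/label pairs for -ylabel()- from a value label.
sysuse auto, clear

* Start from an empty macro so that reruns do not accumulate pairs
local call

* For each code, append the code and its quoted label text
forvalues i = 0/1 {
    local call `call' `i' "`: label (foreign) `i''"
}

* Pass the pairs to -ylabel()- instead of relying on -valuelabel-
graph box foreign, over(rep78) ylabel(`call')
```

For Lee's data the same loop would run over 1/7 and refer to response, as in the quoted code.
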
>
> That's an elegant and general piece of programming, but it ignores the semantic question of why you would use a boxplot on a non-metric variable. I take it that Lee's responses are on a 7-point opinion scale. There are differing views on whether it is valid to compare means of such scales, given that the numeric codes are arbitrary beyond their ordering. Extracting the five-number summary and making a boxplot seems very doubtful to me. Nick's plot from the cars example shows the effect of applying boxplots to a discrete variable with few values and multiple coincidences.
>
> It may be quicker and less baffling for small numbers of re-labels to write the code explicitly. I've recently had numerous graphs where a particular value represents the "limit of detection". That is certainly worth drawing attention to in the graph rather than just in the caption. I also prefer natural labels when plotting a logged number. Bacterial counts usually follow a lognormal distribution.
>
> graph box logbact, over(groupvar) ylab(1.3 "20=LoD" 2 "100" 3 "1000" ...)
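
A runnable sketch of this kind of explicit labelling follows. The data are simulated purely for illustration: the variable bact, the seed, and the distribution parameters are assumptions, while logbact, groupvar and the limit of detection of 20 follow the command above. Labels are placed at the base-10 log of the natural values, so 20 sits at about 1.3.

```stata
* Sketch only: natural-scale labels on a box plot of a logged variable.
clear
set obs 100
set seed 12345

* Hypothetical grouping variable with three groups
generate groupvar = 1 + mod(_n, 3)

* Hypothetical lognormal counts, floored at a limit of detection of 20
generate bact = max(20, exp(rnormal(5, 1.5)))
generate logbact = log10(bact)

* Position of the limit of detection on the log10 scale
local lod = log10(20)

graph box logbact, over(groupvar) ///
    ylabel(`lod' "20=LoD" 2 "100" 3 "1000" 4 "10000")
```
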
>
> Allan
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/