A way of indicating group sizes in the axis labels of a
box plot.
. tab rep78

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

. * what follows is all one line
. graph box mpg, over(rep78,
        relabel(1 `" "1" "(2)" "'
                2 `" "2" "(8)" "'
                3 `" "3" "(30)" "'
                4 `" "4" "(18)" "'
                5 `" "5" "(11)" "'))

This could be automated.
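
One possible sketch of that automation follows. It is not a tested
program, just the obvious loop: it assumes Stata 9 or later (for
-levelsof-), a numeric grouping variable, and that the counts wanted
are of nonmissing values of the response. The local macro names
(levels, i, relabel) are my own choices for illustration.

```stata
sysuse auto, clear

* distinct nonmissing values of the grouping variable, in sorted order
levelsof rep78, local(levels)

local relabel
local i = 0
foreach l of local levels {
        * relabel() refers to categories by position, not by value
        local ++i
        * group size, counting only observations with mpg present
        quietly count if rep78 == `l' & !missing(mpg)
        * two-line label: the value on top, (n) underneath
        local relabel `relabel' `i' `" "`l'" "(`r(N)')" "'
}

graph box mpg, over(rep78, relabel(`relabel'))
```

With the auto data this should build the same relabel() call as the
one typed out by hand above.
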
Nick
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Nick Cox
> Sent: 23 August 2005 18:53
> To: [email protected]
> Subject: st: RE: vwidth in old boxplot graphics
>
>
> It is still available in Stata as -graph7, box vwidth-.
>
> Programming this yourself in Stata 8 or 9 would be possible,
> I guess, but not quite trivial.
>
> In my view, wanting something like this touches on a key
> limitation of box plots: they often leave out far too much
> detail. Box plots, I suggest, are optimal when the number of
> groups concerned is >~ 10 and severe compression is needed to
> see "the wood for the trees". With fewer groups, more detail
> is often tolerable and even highly desirable.
>
> (Incidentally, an example which I owe to a Howard Wainer paper is
> instructive. How do you interpret a box plot like this?
>
>      +---------------+------------------+
> +----|               |                  |----+
>      +---------------+------------------+
>
> Most people I have asked go for a diagnosis of a short-tailed
> distribution. This forgets that if half the distribution
> is inside the box, then the other half must be outside. In
> this case, the average density in the tails must be much higher than
> in the centre and the best guess has to be a U-shaped distribution.
>
> Boxplots can be much harder to interpret than you think!)
>
> Alternatives such as -dotplot- (official),
> -onewayplot- (user-written), -beamplot- (user-written)
> dedicated to the idea of showing one symbol for every
> data point have the advantage that a clear impression of
> the number of data points is given.
>
> Yet further plots are possible showing all the quantiles.
> -quantile- (official) is here less flexible than -qplot-
> (user-written). The next issue of the Stata Journal will
> carry a long diatribe on quantile plots and will
> be accompanied by an enhanced version of -qplot- (and
> also of -distplot-, also user-written).
>
> A yet further possibility is to hybridise dot and box
> plots. One example is given by Wild and Seber, "Chance
> encounters" p.122.
>
> Fortuitously, just this afternoon a colleague and I
> came up with our own hybrid. This assumes a categorical
> variable coded by successive integers. I wouldn't defend
> this default design to the limit as it was entirely
> optimised for one particular dataset. However, the main
> point is that your own hybrid design is attainable
> with some coding. (Varying width boxes do sound a bit
> harder.)
>
> For this to work, you need -onewayplot- from SSC.
>
> Silly example:
>
> . sysuse auto
> . myboxplot mpg rep78, magic(-0.2) rbar(barw(0.15)) ysc(noreverse) stack h(0.5)
>
> *! 1.0.0 NJC/ISE 23 Aug 2005
> program myboxplot
>         version 9
>         syntax varlist(min=2 max=2 numeric) [if] [in] ///
>         [, magic(real 0.4) rbar(str asis) * ]
>
>         marksample touse
>         qui count if `touse'
>         if r(N) == 0 error 2000
>
>         tokenize `varlist'
>         args y cat
>
>         tempvar median upq loq offset
>         qui {
>                 egen `median' = median(`y') if `touse', by(`cat')
>                 egen `loq' = pctile(`y') if `touse', by(`cat') p(25)
>                 egen `upq' = pctile(`y') if `touse', by(`cat') p(75)
>                 gen `offset' = `cat' + `magic'
>         }
>
>         onewayplot `y' if `touse', by(`cat') msy(+) msize(small)      ///
>         plot(rbar `upq' `median' `offset',                            ///
>         barw(0.25) blcolor(black) bcolor(gs14) hor legend(off) `rbar' ///
>         || rbar `loq' `median' `offset',                              ///
>         barw(0.25) bcolor(gs14) blcolor(black) hor `rbar')            ///
>         xti("`: variable label `y''") ysc(reverse) yla(, noticks)     ///
>         yti("") `options'
> end
>
>
>
>
>
> Nick
> [email protected]
>