st: RE: using percentiles for length of whiskers in box plots
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: using percentiles for length of whiskers in box plots
Date: Sun, 9 May 2010 17:43:21 +0100
This was already answered earlier the same day in a different thread:
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-332.html>
Although that thread's subject doesn't refer to box plots, the same key
reference was repeated, again on the same day, in a thread whose subject
does refer to box plots:
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-344.html>
The key ideas are
1. -graph box- won't do this for you.
2. You need to calculate all the ingredients explicitly and write your
own code. That's not as difficult as might be feared, as this example
should show.
sysuse auto
// percentiles of price within each category of foreign
foreach p in 5 25 50 75 95 {
    egen p`p' = pctile(price), by(foreign) p(`p')
}
// tag one observation per category so each box is drawn once
egen tag = tag(foreign)
// what follows is all one command; /// continues it across lines in a do-file
twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
    rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
    rspike p75 p95 foreign if tag, lcolor(dknavy) || ///
    rspike p25 p5 foreign if tag, lcolor(dknavy) || ///
    scatter price foreign if !inrange(price,p5,p95), legend(off) ///
    ytitle("`: var label price'") xla(0 1, valuelabel)
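The same recipe extends to other whisker percentiles. What follows is
only a sketch, not code from the paper cited below: the local macro
names lo and hi are my own, and 10 and 90 are arbitrary illustrative
choices. Set them to 1 and 99 (or 5 and 95) as wanted.

```stata
sysuse auto, clear

// whisker percentiles (hypothetical choices; edit these two lines)
local lo 10
local hi 90

// percentiles of price within each category of foreign
foreach p in `lo' 25 50 75 `hi' {
    egen p`p' = pctile(price), by(foreign) p(`p')
}

// tag one observation per category so each box is drawn once
egen tag = tag(foreign)

// one command, continued across lines with ///
twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
    rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
    rspike p75 p`hi' foreign if tag, lcolor(dknavy) || ///
    rspike p25 p`lo' foreign if tag, lcolor(dknavy) || ///
    scatter price foreign if !inrange(price, p`lo', p`hi'), legend(off) ///
    ytitle("`: var label price'") xla(0 1, valuelabel)
```

Only the two local definitions need changing; the variable names and
the condition selecting points beyond the whiskers follow from them.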
The key reference was
Cox, N. J. 2009. Speaking Stata: Creating and varying box plots.
Stata Journal 9(3): 478--496. (SJ-9-3, gr0039; Q3/09; no commands)
Explains how to use egen to calculate the statistical ingredients
needed for box plots and variations of box plots; shows the use of
twoway to then create the plots.
Nick
[email protected]
Daniel Koralek
I was wondering if anybody had any advice on using percentiles as
cutoffs for the length of whiskers in box plots. That is, the default
is a distance of 1.5 times the IQR above the 75th percentile and 1.5
times the IQR below the 25th percentile. I'm more interested in using
percentiles, such as the 5th and 95th or the 1st and 99th percentiles
of the data.
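
For comparison, the default rule described in the question can also be
calculated by hand. This is a sketch under my own assumptions: the
variable names iqr, upper and lower are illustrative, and the whisker
ends are taken to be the most extreme data values lying within 1.5 IQR
of the nearer quartile, rather than the 1.5 IQR limits themselves.

```stata
sysuse auto, clear

// quartiles and IQR of price within each category of foreign
egen p25 = pctile(price), by(foreign) p(25)
egen p75 = pctile(price), by(foreign) p(75)
gen iqr = p75 - p25

// whisker ends: most extreme data values within 1.5 IQR of the quartiles
egen upper = max(cond(price <= p75 + 1.5 * iqr, price, .)), by(foreign)
egen lower = min(cond(price >= p25 - 1.5 * iqr, price, .)), by(foreign)

// show the ingredients, one row per category
tabdisp foreign, cellvar(lower p25 p75 upper)
```

Setting these results beside the percentile-based whiskers shows how
many points each rule would leave outside the whiskers.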