True, but the recipe given there does not plot
data points beyond the 10th and 90th percentiles
separately, which I imagine is often desired.
The following has more pretensions to generality.
There are some hooks to use: box() and spike()
pass options to the bars and spikes, and other
options go to the final -scatter- call. But this
does not cover the common cases of horizontal
alignment and box plots for several (similar)
variables.
----------------------------------------- box1090.ado
*! NJC 1.0.0 23 Nov 2006
* examples: box1090 mpg, over(rep78) box(barw(0.3)) ms(oh)
* box1090 length, over(grade) ms(oh) yla(, ang(h)) xla(, noticks)
program box1090
    version 8
    syntax varname(numeric) [if] [in], over(varname) ///
        [box(str asis) spike(str asis) * ]

    local y "`varlist'"
    marksample touse
    markout `touse' `over', strok

    quietly {
        count if `touse'
        if r(N) == 0 error 2000

        tempvar catvar p10 p90 p25 p75 p50 out tag

        * percentiles of y within each category of over()
        foreach p in 10 25 50 75 90 {
            egen `p`p'' = pctile(`y') if `touse', ///
                by(`over') p(`p')
        }

        * values beyond the 10th and 90th percentiles, shown as points
        gen `out' = `y' if `touse' & (`y' < `p10' | `y' > `p90')

        * categories numbered 1, 2, ... for the x axis
        egen `catvar' = group(`over') if `touse', label
        su `catvar', meanonly
        local max = r(max)

        * one observation per category carries the box and spikes
        egen `tag' = tag(`catvar') if `touse'

        * axis titles: variable labels, else variable names
        local yttl : var label `y'
        if `"`yttl'"' == "" local yttl "`y'"
        local xttl : var label `over'
        if `"`xttl'"' == "" local xttl "`over'"
    }

    * box as two hollow bars meeting at the median,
    * spikes out to the 10th and 90th percentiles, then the points
    twoway ///
        rbar `p25' `p50' `catvar' if `tag', ///
            barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
        rbar `p50' `p75' `catvar' if `tag', ///
            barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
        rspike `p10' `p25' `catvar' if `tag', ///
            blcolor(green) `spike' || ///
        rspike `p75' `p90' `catvar' if `tag', ///
            blcolor(green) `spike' || ///
        scatter `out' `catvar' if `touse', ///
            yti("`yttl'") xti("`xttl'") legend(off) ///
            xla(1/`max', valuelabel) `options'
end
--------------------------------------------------
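A minimal usage sketch, assuming box1090.ado has been
saved somewhere on the adopath. The auto data shipped
with Stata and the navy colour are illustrative choices
only, not part of the program.
----------------------------------------- usage sketch
* load the auto data shipped with Stata
sysuse auto, clear
* default look: green boxes, with points beyond the
* 10th and 90th percentiles plotted separately
box1090 mpg, over(rep78) ms(oh)
* the hooks: box() goes to the bars, spike() to the spikes,
* anything else (here ms() and yla()) to the final scatter call
box1090 mpg, over(rep78) box(barw(0.3) blc(navy)) spike(blcolor(navy)) ms(oh) yla(, ang(h))
--------------------------------------------------
Options given in box() and spike() come after the
program's defaults in each call, so they should take
precedence where the same option is repeated.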
Nick
Scott Merryman
> Nick [Cox] presented a way of doing this a couple weeks ago using
> -statsby- and -twoway-. See
>
> http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0611/Author/article-307.html
Ingo Brooks
> > I would like to produce a box plot like figure. However, instead of
> > the adjacent values that are provided by Stata's -graph box- procedure
> > I would like to depict the 10% and 90% percentile. Is there a way to
> > do this in Stata?