This response is a little late, but I too have been interested in creating
customized versions of a box plot. For example, I have often wanted to show
the 10th and 90th percentiles for the whiskers -- it seems a more sensible
choice for presentation of a distribution (as opposed to analysis, or
identification of tails, which were probably the main original focus for box
plots). I realize I could accomplish most of what I want using built-in
commands like rbar and rspike, but I was also curious about how far you can
get inside some graphics commands. I quickly discovered that the
calculations for the box plot are done within a .class file for bargraphs.
Rather than mess with that structure, I decided to look at an alternate
route -- through manipulating the (sort of hidden) graphics data files known
as sersets.
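As a rough illustration of the whisker choice discussed above (this is not the boxredo code, and it is written in Python rather than Stata only so that it stands alone), the sketch below compares whiskers placed at the 10th and 90th percentiles with Tukey-style adjacent values, i.e. the most extreme observations within 1.5 * IQR of the hinges. The function names are hypothetical, and the linear-interpolation percentile rule is an assumption that may not match the definition used by any particular package.

```python
def percentile(values, p):
    """p-th percentile (0-100) by linear interpolation between order statistics.

    Assumption: this interpolation rule is for illustration only and may not
    match the percentile definition used by any particular package.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = (len(xs) - 1) * p / 100.0   # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def percentile_whiskers(values, lower=10, upper=90):
    """Whisker ends placed at fixed percentiles (10th and 90th by default)."""
    return percentile(values, lower), percentile(values, upper)


def tukey_adjacent(values):
    """Most extreme observations lying within 1.5 * IQR of the hinges."""
    q1, q3 = percentile(values, 25), percentile(values, 75)
    step = 1.5 * (q3 - q1)
    inside = [x for x in values if q1 - step <= x <= q3 + step]
    return min(inside), max(inside)


if __name__ == "__main__":
    data = list(range(1, 102)) + [500]   # made-up data with one extreme value
    print("10th/90th percentile whiskers:", percentile_whiskers(data))
    print("Tukey adjacent values:", tukey_adjacent(data))
```

The percentile whiskers describe where the bulk of the data lies, while the adjacent values mainly serve to flag the observations beyond them, which is the presentation-versus-analysis distinction made above.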
I managed to create boxredo.ado -- a program that allows you to respecify
the statistics used to identify the middle, hinges, and whiskers for a box
plot. It also allows you to turn off the outside values (also available
directly in graph box). It seems to work, but it might need more testing to
see whether it consistently picks out the proper sersets and values to
replace to change what the box plot draws.
The program works immediately after a box plot is displayed and must have
access to the Stata file that was used to create it (so it can do
alternative calculations). The box plot must have either an -over- or -by-
variable; a single box is not supported because I was too lazy to program it,
although a minor modification would allow it to work in that situation. It
allows you to use any statistic created by collapse and substitute that for
any of the 5 values used to define the box and whiskers. It doesn't do any
checking to make sure that they are sensible to use; it just does it. It
only displays the results, but you can save or export the graph afterward.
I may add some options in the future.
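To make the substitution idea concrete, here is a minimal sketch, again in Python and again not the boxredo code: it computes five user-chosen statistics per group, in the spirit of running collapse with a by-variable, and does no checking that the choices are sensible. The slot names and the function box_values_by_group are hypothetical; the mean and standard-deviation choices simply echo the "3SD" example from the original question.

```python
from collections import defaultdict
from statistics import mean, stdev


def box_values_by_group(rows, slots):
    """Compute each slot's statistic separately for every group.

    rows  -- iterable of (group, value) pairs
    slots -- dict mapping a slot name to a function of a list of values
    No sanity checking is done on the chosen statistics.
    """
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        group: {name: stat(vals) for name, stat in slots.items()}
        for group, vals in groups.items()
    }


# Hypothetical slot names; stdev() needs at least two values per group.
SD_SLOTS = {
    "lower_whisker": lambda v: mean(v) - 3 * stdev(v),
    "lower_hinge": lambda v: mean(v) - stdev(v),
    "middle": mean,
    "upper_hinge": lambda v: mean(v) + stdev(v),
    "upper_whisker": lambda v: mean(v) + 3 * stdev(v),
}

if __name__ == "__main__":
    rows = [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 20), ("b", 30)]
    for group, values in box_values_by_group(rows, SD_SLOTS).items():
        print(group, values)
```

Passing the statistics in as functions keeps the five slots independent, so any one of them can be swapped without touching the others.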
Since there may still be bugs or problems, before posting it to SSC I would
like to email it directly to anyone interested in testing it out. Please
email requests directly to me.
Michael Blasnik
[email protected]
----- Original Message -----
From: "Wallace, John" <[email protected]>
To: <[email protected]>
Sent: Wednesday, June 02, 2004 2:59 PM
Subject: st: RE: Adjacent values in graph_box
> Having found my wayward graphics manual, I've located the answer to my first
> question: the default values are the calculations proposed by Tukey in his
> 1977 text, Exploratory Data Analysis. The distance to the adjacent values is
> a multiple of the IQR (3/2 * (75th% - 25th%)). It doesn't look like there
> is an explicit way to change this (make it 3SD, for example). Does anyone
> know if this is ado-able, or is the graphics engine that generates the
> boxplot not accessible this way?
>
> -JW
>
> -----Original Message-----
> From: Wallace, John [mailto:[email protected]]
> Sent: Tuesday, June 01, 2004 11:33 AM
> To: '[email protected]'
> Cc: Chung, John
> Subject: st: Adjacent values in graph_box
>
> Hi all
>
> I've misplaced my graphics manual at the moment, and I'm trying to figure
> out two things:
>
> 1) what are the default values for the adjacent values (location of the
> end of the whiskers on a boxplot)
>
> 2) Can/how are they altered?
>
> I thought it was -graph box, cwhiskers(...)-, but that appears to be for the
> appearance of the whiskers (line weight, cap appearance, etc)
>
> Thanks for any help,
>