Thanks for this. I should add that the -separate- option is a relatively recent addition, so people who installed -stripplot- earlier may need to update it using -ssc- or -adoupdate-.
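
For example, either of the following should bring an older installation up to date. This is only a sketch: it assumes a working internet connection and that -stripplot- was originally installed from SSC.

```stata
* reinstall the current version from SSC, overwriting the old files
ssc install stripplot, replace

* or: check the installed package against its source and update if needed
adoupdate stripplot, update
```
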
Nick
[email protected]
[email protected]
The -separate- option of -stripplot- is very nice, just what the
poster wanted. Another advantage of -stripplot- is that, like
-dotplot-, it will show when a boxplot may mislead, as when the data
are bimodal.
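
A rough illustration of the bimodal point, using simulated data that are not from this thread. The variable names, seed, and bin width are arbitrary choices, and -stripplot- must be installed from SSC first.

```stata
* Simulated data, purely for illustration
clear
set seed 2009
set obs 200

* two groups of 100 observations each
gen byte group = 1 + (_n > 100)

* group 1 is unimodal; group 2 is an equal mixture of two well-separated normals
gen y = rnormal(3, 1.5) if group == 1
replace y = cond(runiform() < 0.5, rnormal(0, 0.5), rnormal(6, 0.5)) if group == 2

* box alone: the box for group 2 gives no hint of the gap in the middle
graph box y, over(group) name(boxonly, replace)

* box plus stacked points: the two clumps in group 2 should be visible
stripplot y, over(group) box center stack width(0.25)
```

The -box-, -center-, -stack- and -width()- options are the same ones used in the -price- example quoted below; only the data differ.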
On Thu, Jul 16, 2009 at 9:11 AM, Nick Cox<[email protected]> wrote:
> I almost totally agree with Steve's advice. He uses the word Winsorize a
> little more widely than is standard. (By the way, I can assure anyone
> who reads that FAQ that the misbegotten word "gotten" did not appear in
> my original draft.)
>
> I'd favour making the omission of outliers a little more evident. In
> this and some other respects -stripplot, box- is more flexible than
> -graph box- or -graph hbox-. -stripplot- is downloadable from SSC.
>
> Consider as an example -price- in the auto dataset.
>
> sysuse auto
>
> clonevar price2 = price
>
> replace price2 = 14000 if price2 > 14000
>
> stripplot price2, over(foreign) box center stack width(250) ///
> xla(4000(2000)12000 14000 "outliers")
>
> gen outliers = price > 14000
>
> stripplot price2, over(foreign) box center stack width(250) ///
> xla(4000(2000)12000 14000 "outliers") ///
> separate(outliers) ms(oh S) legend(off)
>
> Nick
> [email protected]
>
> [email protected]
>
> Try the -nooutside- option or switch to another scale and show
> everything. See: Nick Cox's FAQ at
> http://www.stata.com/support/faqs/graphics/boxandlog.html . What he
> demonstrates can apply to scales other than the log.
>
> If you want to show some of the outside points, but not all, you will
> have to Winsorize the points you want to hide. Replace them with a
> value at the upper end of your desired graph range and give them an
> invisible marker symbol. This will leave the rest of the boxplot
> unchanged. You can add text at that value to show the number of
> higher points excluded.
>
> This problem comes up for other commands in which Stata computes the
> plotting points; -stcurve- is an example. Stata has a -range- option
> for axes, but it can only expand, not contract, the plotting range.
>
> On Thu, Jul 16, 2009 at 3:09 AM, Dana Chandler<[email protected]>
> wrote:
>
>> I am preparing some graphs with simple boxplots over various groups.
>> Thus on my x-axis, I have categorical variables for population groups.
>> My y-axis has # of businesses of a certain type within each population
>> group.
>>
>> Unfortunately, I would like to be able to only show the y-axis within
>> a certain range (so as to not have outliers distort the picture). One
>> idea I had was to simply do the graph and add "IF #businesses < 50".
>> This will make the graph visible, but will distort the IQR of the
>> boxplot. The "yscale(r(0 25))" command does not seem to work and seems
>> only to "extend" a range of y-values rather than restrict it. Does
>> anyone have a suggestion for how to construct a graph for the entire
>> range of data but only display it over a specific range?
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
Steven Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
USA
845-246-0774
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/