Re: st: Interpretation of Two-sample t test with equal variances?
From
David Hoaglin <[email protected]>
To
[email protected]
Subject
Re: st: Interpretation of Two-sample t test with equal variances?
Date
Wed, 20 Mar 2013 19:50:32 -0400
Jay,
If the way people teach boxplots is the (main) source of the
difficulty, I would not be inclined to blame the boxplot!
I'm not aware of an assumption that outliers are an issue. If the
data contain outliers, a boxplot will show them as individual points,
beyond the ends of the "whiskers." The aim is to show observations
that are "outside" and may need further scrutiny. People do refer,
incorrectly, to observations that are beyond the "fences" as
"outliers." In data from a normal distribution, however, much more
than 5% of small to moderate-sized samples contain one or more
"outside" observations.
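
A small simulation can make that last point concrete. The sketch below is only illustrative: it assumes the usual 1.5 x IQR fences and takes quartiles from Python's statistics.quantiles with method="inclusive", which is not identical to the fourths (hinges) of the original boxplot definition, so the estimated shares will differ a little from published values. The function names, sample sizes, replication count, and seed are all arbitrary choices made for this example.

```python
import random
import statistics


def has_outside_value(sample):
    """Return True if any observation lies beyond the 1.5 * IQR fences."""
    # Assumption: quartiles via the "inclusive" method, not Tukey's fourths.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    step = 1.5 * (q3 - q1)
    lower_fence, upper_fence = q1 - step, q3 + step
    return any(x < lower_fence or x > upper_fence for x in sample)


def share_of_samples_with_outside(n, reps=2000, seed=12345):
    """Estimate the share of standard-normal samples of size n that
    contain at least one "outside" observation."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        hits += has_outside_value(sample)  # True counts as 1
    return hits / reps


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, round(share_of_samples_with_outside(n), 3))
```

Comparing the printed shares with 0.05 shows how often perfectly normal data produce points beyond the fences, which is why flagging them automatically as "outliers" is misleading.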
I'm not sure what you mean by "the box ends up being too big" if the
data are light-tailed. I would expect the "whiskers" to be unusually
short.
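
One way to check that expectation is to compare whisker length with box length for a light-tailed sample and a normal sample. The sketch below uses the uniform distribution as a stand-in for "light-tailed" (an assumption made for this example), draws the whiskers to the most extreme observations inside the 1.5 x IQR fences, and again takes quartiles from statistics.quantiles rather than fourths. The function names, seed, and sample size are hypothetical.

```python
import random
import statistics


def whisker_to_box_ratio(sample):
    """Average whisker length divided by box length (IQR)."""
    # Assumption: "inclusive" quartiles; Tukey's fourths differ slightly.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [x for x in sample if lower_fence <= x <= upper_fence]
    lower_whisker = q1 - min(inside)
    upper_whisker = max(inside) - q3
    return (lower_whisker + upper_whisker) / (2 * iqr)


def compare_ratios(n=500, seed=7):
    """Return (uniform ratio, normal ratio) for two simulated samples."""
    rng = random.Random(seed)
    uniform_sample = [rng.random() for _ in range(n)]
    normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return (whisker_to_box_ratio(uniform_sample),
            whisker_to_box_ratio(normal_sample))


if __name__ == "__main__":
    u, g = compare_ratios()
    print("uniform:", round(u, 2), "normal:", round(g, 2))
```

For a uniform population the quartiles sit a quarter of the range from each end, so each whisker is about half as long as the box; for normal data of this size the whiskers are typically longer than the box.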
A boxplot can do only so much. The display was not designed to reveal
bimodal or multimodal data. A dotplot would usually show that
structure easily.
David Hoaglin
On Wed, Mar 20, 2013 at 7:19 PM, JVerkuilen (Gmail)
<[email protected]> wrote:
> On Wed, Mar 20, 2013 at 3:22 PM, David Hoaglin <[email protected]> wrote:
>> Jay,
>>
>> I'm not aware that boxplots make any assumptions. They show what they
>> are intended to show. Their "performance" comes from the way people
>> interpret them. Boxplots of skewed data will tend to have certain
>> characteristics, boxplots of light-tailed data will have other
>> characteristics, and so on. Some patterns suggest bimodal data.
>
> Oh definitely they show what they were intended to show, and they are
> incredibly useful, but the way we teach them I think leads many folks
> down the garden path. The assumptions I'm thinking of include ones
> such as the largely unstated background assumption that outliers are
> an issue. I've become adept at recognizing when a boxplot is giving me
> a light-tailed distribution because the box ends up being too big, but
> if you have multiple modes that will get blown away and they provide
> too much reduction.