Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: RE: median equality test for non normal variables
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: RE: RE: median equality test for non normal variables
Date: Wed, 26 May 2010 15:52:57 +0100
Stretching the point a bit wider, it's striking to note how simple
fallacies about descriptive statistics persist.
Thus in the last week I've come across two texts from reputable
publishers including statements of the form "mean, median and mode
coincide in unimodal (*) symmetric distributions but not otherwise".
0, 0, 1, 1, 1, 1, 3 : mean, median, mode all 1.
Binomial(10, 0.1): same story.
(0 .. 10)' , binomial(10, (0 .. 10)', 0.1)
                  1             2
     +-----------------------------+
   1 |            0   .3486784401  |
   2 |            1   .7360989291  |
   3 |            2   .9298091736  |
   4 |            3   .9872048016  |
   5 |            4   .9983650626  |
   6 |            5   .9998530974  |
   7 |            6   .9999908784  |
   8 |            7   .9999996264  |
   9 |            8   .9999999909  |
  10 |            9   .9999999999  |
  11 |           10             1  |
     +-----------------------------+
* Statements omitting "unimodal" are also common.
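
Both counterexamples can be checked numerically. What follows is a minimal Python sketch rather than Stata code: it recomputes the mean, median and mode for the seven-value data set and for Binomial(10, 0.1), and uses the sign of the third central moment as a rough check that the binomial case is not symmetric. The variable names are illustrative only, and the median is taken as the smallest value whose cumulative probability reaches 0.5.

```python
from fractions import Fraction
from math import comb
from statistics import mean, median, multimode

# The small data set: not symmetric, yet mean, median and mode all equal 1.
data = [0, 0, 1, 1, 1, 1, 3]
data_summary = (mean(data), median(data), multimode(data))

# Binomial(10, 0.1), with exact fractions to avoid rounding noise.
n, p = 10, Fraction(1, 10)
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

binom_mean = sum(k * pk for k, pk in enumerate(pmf))
binom_mode = max(range(n + 1), key=lambda k: pmf[k])

# Median: smallest k whose cumulative probability reaches 0.5.
cumulative = Fraction(0)
for k, pk in enumerate(pmf):
    cumulative += pk
    if cumulative >= Fraction(1, 2):
        binom_median = k
        break

# A positive third central moment means the distribution is right-skewed.
third_moment = sum((k - binom_mean) ** 3 * pk for k, pk in enumerate(pmf))

print(data_summary)
print(binom_mean, binom_median, binom_mode, third_moment > 0)
```

In both cases the three summaries agree even though neither distribution is symmetric.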
Nick
[email protected]
Ronan Conroy wrote:
On 25 May 2010, at 17:04, Feiveson, Alan H. (JSC-SK311) wrote:
> Isn't it true that the Wilcoxon rank sum test is designed only for
> possibilities of one distribution being a translation of the other?
I don't think that this consideration was built into the design, but
clearly if the two distributions are of markedly different shapes (as
in the artificial example I gave) then a single statistic will not
capture the difference between the two groups which exists in two
dimensions: location and shape.
I think that the underlying null hypothesis of the Wilcoxon is
actually one of considerable practical interest: that the probability
that a random observation from one group will be greater than or equal
to a random observation from the other group is 0.5.
This hypothesis underlies comparisons of treatment effectiveness, for
example. Note that it does not specify scale units, simply
probabilities. This is a great advantage when we are measuring
outcomes using scales which do not map onto real life measures of
effect size, such as depression scales or pain scales.
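
The probability in that null hypothesis can be estimated directly from two samples by comparing every pair of observations. The Python sketch below counts a tie as half a win, which is a common convention and an assumption here rather than something stated above. The function name and the pain scores are made up for illustration.

```python
from itertools import product


def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y) over all pairs (x_i, y_j)."""
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(x, y)
    )
    return wins / (len(x) * len(y))


# Hypothetical pain scores on a 0-10 scale (invented for illustration).
treated = [2, 3, 3, 4, 6]
control = [3, 5, 6, 7, 8]

print(prob_superiority(treated, control))
```

A value near 0.5 is what the null hypothesis predicts. The estimate depends only on the ordering of the scores, not on their units, so any order-preserving rescaling of the scale leaves it unchanged.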
Of course, if your data are measured on a scale with real life units
(blood pressure, money) then you are better off calculating the
Hodges-Lehmann median difference (the median of all pairwise
differences between the two groups), which gives a more meaningful
measure of effect size.
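
As a rough illustration, the Hodges-Lehmann estimate for two samples is the median of all pairwise differences between them. The Python sketch below computes only the point estimate, with no confidence interval. The function name and the blood pressure readings are hypothetical.

```python
from itertools import product
from statistics import median


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences x_i - y_j."""
    return median(a - b for a, b in product(x, y))


# Hypothetical systolic blood pressures in mmHg (invented for illustration).
treated = [118, 122, 125, 130]
control = [128, 131, 135, 140, 150]

print(hodges_lehmann_shift(treated, control))
```

The result is in the units of the measurement, so here it reads as a typical treated-minus-control difference in mmHg.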