Stata The Stata listserver

Re: st: Mann-whitney U test


From   Roger Newson <[email protected]>
To   [email protected]
Subject   Re: st: Mann-whitney U test
Date   Sun, 23 Jan 2005 14:33:10 +0000

At 01:25 23/01/2005, Ricardo wrote:
> Thanks, Roger. I am familiar with this program and I
> have used it before. So the test really tests both
> hypotheses: that the difference between the medians is
> zero, and that the degree of non-overlap of the two
> populations is zero, i.e. whether the degree of
> overlap between the two populations is significantly
> different than would be expected by chance alone. Is
> this correct?

No and yes. The Wilcoxon rank-sum test does indeed test the hypothesis that Somers' D is zero, where Somers' D is the difference between two probabilities, namely the probability that a randomly-chosen member of Subpopulation A has a higher outcome value than a randomly-chosen member of Subpopulation B and the probability that a randomly-chosen member of Subpopulation B has a higher outcome value than a randomly-chosen member of Subpopulation A. If these two probabilities are equal, then you can argue that (in Ricardo's words) "the degree of non-overlap of the two populations is zero". However, the Hodges-Lehmann median difference is not always the difference between the two subpopulation medians. The Hodges-Lehmann median difference is the median difference between two outcome values, assuming that the first is sampled at random from Subpopulation A and the second is sampled at random from Subpopulation B.
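
To make the two definitions concrete, here is a minimal Python sketch of their sample versions, computed by brute force over all cross-sample pairs. It is only an illustration: the function names and the toy data are made up for this example, the sign convention (A higher than B counts as positive) is an assumption, and it is not the algorithm used by -somersd- or -cendif-, which also produce confidence intervals.

```python
from itertools import product
from statistics import median


def somers_d(a, b):
    """Sample Somers' D: Pr(A > B) - Pr(B > A) over all cross-sample pairs.

    Hypothetical helper for illustration; ties count for neither side.
    """
    pairs = list(product(a, b))
    a_higher = sum(1 for x, y in pairs if x > y)
    b_higher = sum(1 for x, y in pairs if x < y)
    return (a_higher - b_higher) / len(pairs)


def hodges_lehmann_difference(a, b):
    """Median of all pairwise differences x - y, with x from a and y from b."""
    return median(x - y for x, y in product(a, b))


# Toy data, invented for this example.
group_a = [3, 5, 8, 9]
group_b = [1, 4, 6]

print(somers_d(group_a, group_b))                   # 0.5
print(hodges_lehmann_difference(group_a, group_b))  # 2.5
```

Here 9 of the 12 pairs have the Group A value higher and 3 have the Group B value higher, so the sample Somers' D is (9 - 3)/12 = 0.5.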

If the two subpopulation distributions differ only in location, then the Hodges-Lehmann median difference is indeed the difference between the two subpopulation medians. In that case, the difference between two outcome values sampled independently from the two subpopulations is distributed symmetrically around the location difference, so its median is equal to its mean (when the mean exists), which is the difference between the means, which in turn is the difference between the medians. However, the two subpopulations may differ in ways other than location, and then the difference between the two medians may be different from the Hodges-Lehmann median difference. I often get queries from users of my program -cendif- (part of the -somersd- package) asking why, in their data, the Hodges-Lehmann median difference is not the difference between the two medians.
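
A small numerical illustration of both cases, again as a hedged Python sketch with invented data and a hypothetical helper (this is not how -cendif- computes its estimates). In the first comparison the second sample is the first shifted down by 5, so the two quantities agree. In the second comparison the samples differ in shape, and the two quantities even have opposite signs.

```python
from itertools import product
from statistics import median


def hodges_lehmann_difference(a, b):
    """Median of all pairwise differences x - y, with x from a and y from b."""
    return median(x - y for x, y in product(a, b))


# Invented sample with a long right tail.
sample_a = [1, 2, 3, 20, 30]

# Case 1: a pure location shift (every value moved down by 5).
shifted = [x - 5 for x in sample_a]
print(median(sample_a) - median(shifted))            # 5
print(hodges_lehmann_difference(sample_a, shifted))  # 5

# Case 2: a sample with a different shape, not just a different location.
sample_b = [0, 2, 4, 5, 6]
print(median(sample_a) - median(sample_b))            # -1
print(hodges_lehmann_difference(sample_a, sample_b))  # 1
```

In the second case the two large values in sample_a contribute ten large positive pairwise differences, which pull the median of the 25 differences above zero even though the median of sample_a is below the median of sample_b.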

I hope this helps.

Best wishes

Roger



--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: [email protected]
Website: http://phs.kcl.ac.uk/rogernewson/

Opinions expressed are those of the author, not the institution.
