Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Rounding Errors Stata 12
From: Marta García-Granero <[email protected]>
To: [email protected]
Subject: Re: st: Rounding Errors Stata 12
Date: Wed, 13 Feb 2013 17:18:41 +0100
Talking about rounding errors, I have found what I think is a bug in
the way Stata sometimes handles tied differences before ranking them for
the Wilcoxon signed-rank test.
The sample data come from exercise 1, chapter 1 of "Statistics at
Square One" (available as an electronic resource here:
http://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/1-data-display-and-summary
)
I have used this example for many years in my classes, both with hand
calculations and with SPSS (the statistical package we had until
recently at my university). When I use Stata instead to test whether the
population median is 0.6, I get different results:
. signrank cobre = 0.6

Wilcoxon signed-rank test

        sign |      obs   sum ranks    expected
-------------+---------------------------------
    positive |       28       591.5         410
    negative |       12       228.5         410
        zero |        0           0           0
-------------+---------------------------------
         all |       40         820         820

unadjusted variance     5535.00
adjustment for ties       -0.75
adjustment for zeros       0.00
                     ----------
adjusted variance       5534.25

Ho: cobre = 0.6
             z =   2.440
    Prob > |z| =   0.0147
SPSS (and I get the same result by hand) gives:
Ranks
                     N   Mean Rank   Sum of Ranks
Negative Ranks      28       21.00         588.00
Positive Ranks      12       19.33         232.00
Zero                 0
Total               40

Test Statistics
Z                            -2.393
Asymp. Sig. (2-tailed)        0.017
As you can see, the rank sums (and, therefore, the Z statistic) are different.
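
For readers who want to check the arithmetic, here is a minimal sketch in Python (not Stata) of how each rank sum leads to its Z value. It assumes the usual normal approximation with the tie correction that the Stata output above implies (unadjusted variance n(n+1)(2n+1)/24, minus the sum of (t^3 - t)/48 over tie groups of size t) and no zero differences. The function name signed_rank_z and the tie-group sizes are assumptions of this sketch: six tied pairs reproduce Stata's adjustment of -0.75, while seven pairs and two triples are what the listing further down in this message shows when every tie is recognized.

```python
import math


def signed_rank_z(rank_sum, n, tie_sizes):
    """Normal-approximation z for a signed-rank sum (no zero differences)."""
    expected = n * (n + 1) / 4                       # 410 for n = 40
    variance = n * (n + 1) * (2 * n + 1) / 24        # 5535 for n = 40
    variance -= sum(t**3 - t for t in tie_sizes) / 48  # tie adjustment
    return (rank_sum - expected) / math.sqrt(variance)


# Stata: positive rank sum 591.5, six tied pairs (adjustment -0.75).
z_stata = signed_rank_z(591.5, 40, [2] * 6)

# SPSS / hand calculation: rank sum 588, seven tied pairs and two triples.
z_spss = signed_rank_z(588.0, 40, [2] * 7 + [3] * 2)

print(f"{z_stata:.3f} {z_spss:.3f}")  # 2.440 2.393
```

The magnitudes agree with the 2.440 and 2.393 reported above; the sign of the SPSS value differs only because of which rank sum it is based on.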
After a bit of experimenting, I have found that Stata handles tied
differences with opposite signs incorrectly, but not systematically. In
the listing below, the last column (rank~100, i.e. ranked100) has the
correct ranks, while "ranked" contains the same values that Stata uses to
get the positive and negative sums of ranks. Notice the difference for
cases 5/6/7, 18/19, 22/23/24, 29/30 and 32/33. In all these cases, the
wrong ranking involves differences with opposite signs, but this is not
systematic (see cases 1/2, where the tie is recognized, or 11/12,
13/14...). I used "double" for all the generated variables to avoid the
known float problems.
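* Signed and absolute differences from the hypothesized median (0.6)
generate double difs = (cobre-0.6)
generate double absdifs = abs(cobre-0.6)
* Ranks of the raw absolute differences (these match what signrank uses)
egen double ranked = rank(absdifs)
* Same ranks after scaling to hundredths and rounding to whole numbers
generate double absdifs100 = 100*abs(cobre-0.6)
egen double ranked100 = rank(abs(round(absdifs100)))
sort absdifs
list cobre difs ranked ranked100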
     +----------------------------------+
     | cobre   difs   ranked   rank~100 |
     |----------------------------------|
  1. |   .58   -.02      1.5        1.5 |
  2. |   .62    .02      1.5        1.5 |
  3. |   .63    .03        3          3 |
  4. |   .64    .04        4          4 |
  5. |   .55   -.05        5          6 |
     |----------------------------------|
  6. |   .65    .05      6.5          6 |
  7. |   .65    .05      6.5          6 |
  8. |   .66    .06        8          8 |
  9. |   .52   -.08        9          9 |
 10. |   .69    .09       10         10 |
     |----------------------------------|
 11. |    .7     .1     11.5       11.5 |
 12. |    .5    -.1     11.5       11.5 |
 13. |   .48   -.12     13.5       13.5 |
 14. |   .72    .12     13.5       13.5 |
 15. |   .73    .13       15         15 |
     |----------------------------------|
 16. |   .74    .14     16.5       16.5 |
 17. |   .74    .14     16.5       16.5 |
 18. |   .45   -.15       18       18.5 |
 19. |   .75    .15       19       18.5 |
 20. |   .76    .16       20         20 |
     |----------------------------------|
 21. |   .77    .17       21         21 |
 22. |   .42   -.18     22.5         23 |
 23. |   .42   -.18     22.5         23 |
 24. |   .78    .18       24         23 |
 25. |   .81    .21       25         25 |
     |----------------------------------|
 26. |   .83    .23       26         26 |
 27. |   .36   -.24       27         27 |
 28. |   .85    .25       28         28 |
 29. |   .34   -.26       29       29.5 |
 30. |   .86    .26       30       29.5 |
     |----------------------------------|
 31. |   .88    .28       31         31 |
 32. |    .3    -.3       32       32.5 |
 33. |    .9     .3       33       32.5 |
 34. |   .94    .34       34         34 |
 35. |   .98    .38       35         35 |
     |----------------------------------|
 36. |  1.04    .44       36         36 |
 37. |    .1    -.5       37         37 |
 38. |  1.12    .52       38         38 |
 39. |  1.16    .56       39         39 |
 40. |  1.24    .64       40         40 |
     +----------------------------------+
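
One possible explanation, offered here as an assumption and not as something established in this message, is that values such as 0.55, 0.6 and 0.65 have no exact binary representation, so two absolute differences that are equal on paper can differ in the last bit even when stored as double. The following minimal sketch in Python (not Stata) retypes the 40 values from the listing above and ranks the absolute differences twice: once on the raw doubles, and once after scaling to whole hundredths, in the spirit of the ranked100 variable. The helper names midranks and rank_sums are hypothetical, and Python floats are IEEE 754 doubles, which is assumed to match Stata's double type.

```python
# Copper values retyped by hand from the listing in this message.
cobre = [0.58, 0.62, 0.63, 0.64, 0.55, 0.65, 0.65, 0.66, 0.52, 0.69,
         0.70, 0.50, 0.48, 0.72, 0.73, 0.74, 0.74, 0.45, 0.75, 0.76,
         0.77, 0.42, 0.42, 0.78, 0.81, 0.83, 0.36, 0.85, 0.34, 0.86,
         0.88, 0.30, 0.90, 0.94, 0.98, 1.04, 0.10, 1.12, 1.16, 1.24]


def midranks(values):
    """Average ranks; only values that compare exactly equal share a rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(diffs):
    """Return (positive rank sum, negative rank sum) of |diffs|."""
    ranks = midranks([abs(d) for d in diffs])
    pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return pos, neg


raw = [x - 0.6 for x in cobre]                  # differences as doubles
scaled = [round(100 * x) - 60 for x in cobre]   # exact integer hundredths

print(abs(0.55 - 0.6) == abs(0.65 - 0.6))  # False: tie lost (cases 5/6/7)
print(abs(0.58 - 0.6) == abs(0.62 - 0.6))  # True: tie kept (cases 1/2)
print(rank_sums(raw))     # (591.5, 228.5), the sums Stata reports
print(rank_sums(scaled))  # (588.0, 232.0), the sums from SPSS and by hand
```

If this explanation is right, whether a tie survives depends only on how the individual values happen to round in binary, which would account for the pattern not looking systematic.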
I must say that this is the only example where I found differences
between SPSS & Stata's output.
Regards,
Prof. Marta García-Granero, PhD
Department of Biochemistry and Genetics
University of Navarra
SPAIN.