Hi all,
I hope I can ask a fairly basic stats question. I have a variable that
I need to compare across two groups.
The summary stats for the variable NAN across the groups are below.
The negative values are legitimate.
group  |    N      mean    p50       max        min  skewness  kurtosis
group1 | 2537    -77535   5278  19051350  -46844688    -11.23     311.1
group2 | 3031   -211373   4620   4609996  -32617714    -11.18     185.6
Total  | 5568   -150391   4958  19051350  -46844688    -11.33     278.4
If I do a ttest on the log-transformed data, is it appropriate to add
an arbitrary constant to make the negative values positive? Is the
ttest indeed any good for this data, or should I be looking at some
nonparametric tests?
To make the numbers more manageable, I divided by 1,000,000, and the
summary stats look like this:
group      N     mean      p50    max     min  skewness  kurtosis
group1  2537  -.07753  .005278  19.05  -46.84    -11.23     311.1
group2  3031   -.2114   .00462   4.61  -32.62    -11.18     185.6
Total   5568   -.1504  .004958  19.05  -46.84    -11.33     278.4
Is it right to perform a ttest on ln((NAN/1000000)+50)? Changing the
constant I add doesn't seem to make a difference.
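One possible reason the constant matters so little can be sketched as follows. For a value x that is small relative to the constant c, ln(x + c) is close to ln(c) + x/c, so the transform is almost a linear rescaling for the bulk of the data and only bends the extreme values. The sketch below is in Python rather than Stata. The function names and the example values are assumptions made for illustration, and the values are picked only to span roughly the range reported above.

```python
import math


def shifted_log(x, c):
    """Return ln(x + c); only defined when x + c is positive."""
    if x + c <= 0:
        raise ValueError("x + c must be positive")
    return math.log(x + c)


def linear_gap(x, c):
    """Absolute gap between ln(x + c) and its linear approximation ln(c) + x/c."""
    return abs(shifted_log(x, c) - (math.log(c) + x / c))


# Illustrative values in millions (assumed, not the real data).
example_values = [-46.84, -0.2, 0.005, 4.61, 19.05]

for c in (50, 100, 1000):
    gaps = [linear_gap(x, c) for x in example_values]
    print(c, ["%.2e" % g for g in gaps])
```

Near the median the gap is negligible for any of these constants, which would fit with the skewness surviving the transformation. Only values close to -c are compressed strongly.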
The stats on ln((NAN/1000000)+50) are below:
group      N   mean    p50   max    min  skewness  kurtosis
group1  2537  4.604  4.605  4.78  3.973    -17.21     527.4
group2  3031  4.603  4.605  4.65   4.21     12.74     242.9
Total   5568  4.604  4.605  4.78  3.973    -15.94       469
There is still a large negative skewness coefficient. To me this
does not look like a situation for a ttest, and I should be looking at
a nonparametric test instead. Is that right?
The results from the ttest using the unpaired and unequal options,
on the untransformed data and on ln((NAN/1000000)+50), are below:
transformation     t      p   95% CI
None            3.25  .0011   53205.45 - 214470.8
log(50+var)     2.75  .0060   .000367 - .002185
(I understand the log-scale interval has to be back-transformed.)
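For reference, the unpaired, unequal-variance t test (often called Welch's test) can be sketched as below. This is a minimal Python illustration, not Stata's implementation. The function name `welch_t` and the tiny example samples are assumptions for illustration only.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for an unpaired t test with unequal variances.

    df uses the Welch-Satterthwaite approximation.
    """
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Made-up samples, used only to exercise the function.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_stat, 4), round(df, 3))
```

The statistic depends on the data only through the group means and variances, so a handful of extreme observations can dominate it when the distribution is as heavy-tailed as the one described here.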
A ranksum test on the log-transformed NAN shows a z of 3.3999 with a p
of .0007; on the untransformed NAN it is 3.396 with a p of .0007.
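The near-identical rank-sum results are what one would expect, because a strictly increasing transform such as ln(x + c) leaves the ordering of the observations, and therefore their ranks, unchanged. A minimal Python sketch of this follows. The helper names and the ten data points are assumptions for illustration, not the real data.

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum(group_a, group_b):
    """Sum of the pooled ranks that belong to group_a."""
    ranks = midranks(list(group_a) + list(group_b))
    return sum(ranks[: len(group_a)])


# Made-up values in millions, for illustration only.
a = [-46.84, -0.2, 0.005, 0.3, 19.05]
b = [-32.62, -1.5, 0.004, 0.1, 4.61]
log_a = [math.log(x + 50) for x in a]
log_b = [math.log(x + 50) for x in b]

print(rank_sum(a, b), rank_sum(log_a, log_b))
```

With identical ranks the test statistic is identical too, so a small gap such as 3.3999 versus 3.396 would have to come from something else, possibly ties created by rounding when the transformed variable is stored. That is a guess, not something the output above confirms.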
So overall, there doesn't seem to be any change in the conclusions,
whatever test I use. But is the ttest procedure appropriate?
Your help is much appreciated.
--
thanks for your time
rich