Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: sign test output
From: "Seed, Paul" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: sign test output
Date: Fri, 18 Jan 2013 10:48:53 +0000
Dear Statalist,
While the discussion of Nahla Betelmal's query has been interesting and
informative, one point seems to have been missed: the question is ill-defined.
It appears that Nahla Betelmal has a variable that she wants/expects
for good theoretical reasons to have an average of 0; and wants to test
if this is true. We are not told any more.
If (s)he came to me for statistical advice, I would instantly want to know
- what the theoretical reasons were
- which average (the mean or the median) was expected to be 0
- how large a tolerance was acceptable
- what the implications would be if the average was not 0.
Until I had a clear understanding, I would not want to start analysing data.
The second question is crucial. For a seriously non-normal distribution,
the mean and the median can be quite different, and it is possible
to construct examples where the mean is significantly > 0, while
the median is significantly < 0.
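A minimal sketch of such an example, in Python rather than Stata so that it runs with the standard library alone. The data are hypothetical and deliberately extreme: 65 values of -1 and 35 values of +10. The t statistic tests the mean against 0; the exact sign test tests the median against 0.

```python
from math import comb
from statistics import fmean, median, stdev

# Hypothetical data: many small negative values, fewer large positive ones.
x = [-1.0] * 65 + [10.0] * 35
n = len(x)

mean_x = fmean(x)      # pulled above 0 by the large positive values
median_x = median(x)   # below 0, because most observations are negative

# One-sample t statistic for H0: mean = 0.
t_stat = mean_x / (stdev(x) / n ** 0.5)

# Exact two-sided sign test for H0: median = 0 (zeros would be dropped).
n_neg = sum(v < 0 for v in x)
n_pos = sum(v > 0 for v in x)
m = n_neg + n_pos
k = max(n_neg, n_pos)
p_sign = min(1.0, 2 * sum(comb(m, j) for j in range(k, m + 1)) / 2 ** m)

print(f"mean = {mean_x:.2f}, median = {median_x:.2f}")
print(f"t statistic = {t_stat:.2f}, sign test p = {p_sign:.4f}")
```

Here the mean is 2.85 and the median is -1, and both tests reject their null hypothesis, in opposite directions.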
Normality checks would be mainly graphical, for the reasons discussed;
but I might look at measures of skewness, kurtosis and in particular compare
whether the mean and median were sufficiently close for it not to matter which
I used. (Estimates of the mean are usually more precise, so with low skew and
mean close to median, I might prefer to use the mean even if the median were
the main object of interest.)
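The numerical side of these checks could be sketched as follows (Python, standard library only). The function name shape_summary and the choice to express the mean-median gap in standard deviation units are assumptions made for illustration; what counts as "sufficiently close" remains a matter of judgement. Skewness and kurtosis are the simple moment-based versions, for which a Normal distribution has skewness 0 and kurtosis 3.

```python
from statistics import fmean, median, pstdev

def shape_summary(xs):
    """Moment-based skewness and kurtosis, plus the mean-median gap."""
    n = len(xs)
    m = fmean(xs)
    sd = pstdev(xs)  # population SD (divides by n), matching the moments below
    skew = sum((v - m) ** 3 for v in xs) / (n * sd ** 3)
    kurt = sum((v - m) ** 4 for v in xs) / (n * sd ** 4)
    med = median(xs)
    return {
        "mean": m,
        "median": med,
        "skewness": skew,
        "kurtosis": kurt,
        "gap_in_sd": (m - med) / sd,  # hypothetical yardstick for closeness
    }

# Hypothetical right-skewed sample.
sample = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 1.5, 6.0]
for name, value in shape_summary(sample).items():
    print(f"{name}: {value:.3f}")
```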
Assuming interest was in the mean, I would advise one or more of:
- one-sample t-test (quick, simple, and usually sufficient)
- linear regression with robust standard errors (a basic correction for non-Normality)
- bootstrapped linear regression with BCa confidence intervals (a fuller correction
  that can give asymmetrical CIs where appropriate, e.g. in cases of extreme non-Normality).
All methods are well described in the Stata manual, and usually give very similar answers
(except for extreme cases of non-Normality).
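To show what a BCa (bias-corrected and accelerated) interval does, here is a rough sketch in Python for the simplest case, the sample mean, which is what a linear regression with only a constant estimates. It follows the textbook BCa recipe (bias correction from the bootstrap distribution, acceleration from the jackknife) and is not a reproduction of Stata's implementation. The function name bca_mean_ci, the seed, the number of replications and the data are all assumptions.

```python
import random
from statistics import NormalDist

def bca_mean_ci(data, reps=2000, level=0.95, seed=12345):
    """Textbook BCa bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    total = sum(data)
    theta_hat = total / n

    # Bootstrap distribution of the mean (resampling with replacement).
    boot = sorted(sum(rng.choices(data, k=n)) / n for _ in range(reps))

    # Bias correction: where the observed mean sits in that distribution.
    nd = NormalDist()
    z0 = nd.inv_cdf(sum(b < theta_hat for b in boot) / reps)

    # Acceleration from the jackknife (leave-one-out means).
    jack = [(total - v) / (n - 1) for v in data]
    jbar = sum(jack) / n
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = num / den

    def adjusted(p):
        # Map a nominal percentile to its BCa-adjusted percentile.
        z = nd.inv_cdf(p)
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    def pick(p):
        return boot[min(reps - 1, max(0, int(p * reps)))]

    alpha = 1 - level
    return theta_hat, pick(adjusted(alpha / 2)), pick(adjusted(1 - alpha / 2))

# Hypothetical right-skewed sample with true mean 0.
gen = random.Random(2013)
y = [gen.expovariate(1.0) - 1.0 for _ in range(100)]
est, lo, hi = bca_mean_ci(y)
print(f"mean = {est:.3f}, BCa 95% CI = ({lo:.3f}, {hi:.3f})")
```

With skewed data the two limits need not be equidistant from the estimate, which is the asymmetry referred to above.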
If interest was in the median, and I didn't trust the Normal approximation,
I would use the -centile- command with the -cci- option to get a confidence interval
for the median.
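The idea behind a conservative, distribution-free interval for the median can be sketched as below (Python, standard library only): choose order statistics so that the binomial tail probabilities guarantee at least the nominal coverage. This illustrates the principle and is not claimed to match -centile- output exactly; the function name median_ci is an assumption.

```python
from math import comb

def median_ci(data, level=0.95):
    """Conservative order-statistic confidence interval for the median.

    Returns (lower, upper, actual_coverage). With B ~ Binomial(n, 1/2),
    picks the largest r such that P(B <= r - 1) <= alpha / 2 and uses the
    r-th smallest and r-th largest observations as limits.
    """
    xs = sorted(data)
    n = len(xs)
    alpha = 1 - level

    r = 0        # number of order statistics trimmed from each end, plus one
    tail = 0.0   # P(B <= r - 1) for the chosen r
    cum = 0.0
    for k in range(n + 1):
        cum += comb(n, k) / 2 ** n
        if cum > alpha / 2:
            break
        r, tail = k + 1, cum

    if r == 0:
        raise ValueError("sample too small for the requested level")
    return xs[r - 1], xs[n - r], 1 - 2 * tail

# With 10 observations the interval runs from the 2nd to the 9th ordered value.
lower, upper, coverage = median_ci([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(lower, upper, round(coverage, 4))
```

Because the limits are actual observations, the achieved coverage is usually somewhat above the nominal level, which is the sense in which the interval is conservative.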
In each case I would direct attention to the confidence interval, and to the question of whether
the answer was sufficiently close to 0 (as defined by the answer to the third question).
All this assumes that the ultimate interest is in the answer to this question.
If it was just a preliminary to another analysis, or the answer was wanted for
some deduction that could be made from it, I would also look for other
ways of addressing the real question, whatever it might be.
On Jan 17, 2013, at 5:13, Nahla Betelmal <[email protected]> wrote:
> Again, thank you both for your comments.
>
> However, if normality test is proved to be useful only for huge sample
> as Maarten mentioned. How can we determine which test (i.e. parametric
> or non-parametric ) to be used for smaller sample size in hundreds?!
>
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they don't!
>
> I will follow Nick's advice to look deeper in the data, but I still
> believe that there must be another way to give obvious solution to
> this situation.
>
> Thank you both again, I highly appreciate your kind help and time,
>
> Nahla
>
>
Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners
(+44) (0) 20 7188 3642.