Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: sign test output
From: Nahla Betelmal <[email protected]>
To: [email protected]
Subject: Re: st: sign test output
Date: Thu, 17 Jan 2013 10:21:32 +0000
Dear Nick,
Thank you for the comments. The variable I am testing is not binary,
and the literature in my field is concerned with whether the mean
(median) of this variable is different from zero. So, U is the mean if
the variable is normally distributed, or the median if the
distribution is not normal.
From my reading in statistics, I know that in order to decide
whether to use parametric or non-parametric tests, the normality of
the data should be checked first.
Shapiro-Wilk is used to test normality when the number of
observations is less than 30. Otherwise, we should use
Kolmogorov-Smirnov for large samples (as in my sample).
So, when the test accepts the null (normality), we should use the
parametric test (i.e. t-test), which examines the mean. On the other
hand, if the null of normality is rejected, we should use the
non-parametric test (sign test) instead, which examines the median (as
in my case).
Also, on the comment about "robust": I meant exactly what you said (I
used the term loosely).
Thanks for suggesting that I read again; I certainly will.
Many thanks again
Nahla
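
The contrast between the two tests can be reproduced from the summary figures quoted further down in this thread. What follows is a minimal sketch in Python rather than Stata, assuming only the reported n, mean, standard deviation and count of positive values. The normal approximation to the two-sided p-value is a simplification made here (Stata uses the t distribution with 345 degrees of freedom), so the result should be close to, but not identical with, the quoted 0.3542.

```python
import math

# Summary figures copied from the Stata output quoted in this thread.
n = 346
mean = 1.564346
sd = 31.36663
positives = 221

# One-sample t statistic for Ho: mean = 0.
se = sd / math.sqrt(n)
t = mean / se

# Assumption: large-sample normal approximation to the two-sided p-value.
# Stata's ttest uses the t distribution; with 345 df the gap is small.
p_mean_approx = math.erfc(abs(t) / math.sqrt(2))

# Share of positive values, which is what the sign test compares with 0.5.
share_positive = positives / n

print(f"t = {t:.4f}, approximate two-sided p = {p_mean_approx:.4f}")
print(f"share of positive values = {share_positive:.4f}")
```

The mean is small relative to its standard error, while the share of positive values is well above one half, so a test on the mean and a test on the sign of the values need not agree.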
On 17 January 2013 09:49, Nick Cox <[email protected]> wrote:
> Your t-test is testing a quite different hypothesis. If the two states
> 0 and 1 of a binary variable have equal frequencies, then its mean is
> 0.5, not 0.
>
> That aside, the t-test can not be more appropriate for a binary
> variable than what you have done already, and this is predictable in
> advance, as a distribution with two distinct states is not a normal
> distribution. You do not need a Kolmogorov-Smirnov test to tell you
> that.
>
> For the record, what I suggested is best not described as a robust
> test. It was calculating a confidence interval, and I showed that for
> your data the result was robust to the method of calculation, meaning
> merely not sensitive. The word "robust" was used informally.
>
> You never define what you mean by u, so I am not commenting on any
> details about u.
>
> I recommend that you read (or re-read) a good introductory text on
> statistics, as you appear confused on some basic matters.
>
> Nick
>
> On Thu, Jan 17, 2013 at 7:52 AM, Nahla Betelmal <[email protected]> wrote:
>
>> Thank you Maarten and Nick for the great help.
>>
>> So, in this case I would reject the null in favour of the alternative
>> u>0, as the p-value is 0.000. However, a t-test on the same sample
>> gave the opposite result (i.e. accept the null).
>>
>> ttest DA_T_1 == 0
>>
>> One-sample t test
>> ------------------------------------------------------------------------------
>> Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
>> ---------+--------------------------------------------------------------------
>>   DA_T_1 |     346    1.564346      1.68628    31.36663   -1.752338    4.88103
>> ------------------------------------------------------------------------------
>>     mean = mean(DA_T_1)                                           t =   0.9277
>> Ho: mean = 0                                     degrees of freedom =      345
>>
>>     Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
>>  Pr(T < t) = 0.8229         Pr(|T| > |t|) = 0.3542          Pr(T > t) = 0.1771
>>
>>
>> I think this is due to the distribution of the sample, so I performed
>> K-S normality test. It shows that data is not normally distributed,
>> hence I should use the non-parametric sign test instead of t-test. In
>> other words I would reject the null u=0 in favor of u>0 , right?
>>
>>
>> ksmirnov DA_T_1 = normal((DA_T_1-DA_T_1_mu)/ DA_T_1_s)
>>
>> One-sample Kolmogorov-Smirnov test against theoretical distribution
>> normal((DA_T_1-DA_T_1_mu)/ DA_T_1_s)
>>
>> Smaller group       D       P-value  Corrected
>> ----------------------------------------------
>> DA_T_1:             0.4878    0.000
>> Cumulative:        -0.4330    0.000
>> Combined K-S:       0.4878    0.000      0.000
>>
>>
>> N.B. Thank you so much, Nick, for the robust test you mentioned; I will
>> use that as well.
>>
>> Many thanks
>>
>> Nahla
>>
>> On 16 January 2013 09:33, Nick Cox <[email protected]> wrote:
>>> In addition, it could be as or more useful to think in terms of
>>> confidence intervals. With this sample size and average, 0.5 lies well
>>> outside 95% intervals for the probability of being positive, and that
>>> is robust to method of calculation:
>>>
>>> . cii 346 221
>>>
>>>                                                          -- Binomial Exact --
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283     .0258248        .5856497   .6894096
>>>
>>> . cii 346 221, jeffreys
>>>
>>>                                                          ----- Jeffreys -----
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283     .0258248        .5871262   .6880204
>>>
>>> . cii 346 221, wilson
>>>
>>>                                                          ------ Wilson ------
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283     .0258248        .5868449   .6875651
>>>
>>> Nick
>>>
>>> On Wed, Jan 16, 2013 at 9:13 AM, Maarten Buis <[email protected]> wrote:
>>>> On Wed, Jan 16, 2013 at 9:38 AM, Nahla Betelmal wrote:
>>>>> I have generated this output using non-parametric test "one sample
>>>>> sign test" with null: U=0 , & Ua > 0
>>>>>
>>>>> However, I do not understand the output. where is the p-value? is it
>>>>> 0.5 in all cases or the 0.000 ( as in the first and third cases) and
>>>>> 1.000 as in the second case?
>>>>>
>>>>> . signtest DA_T_1 = 0
>>>>>
>>>>> Sign test
>>>>>
>>>>> sign | observed expected
>>>>> -------------+------------------------
>>>>> positive | 221 173
>>>>> negative | 125 173
>>>>> zero | 0 0
>>>>> -------------+------------------------
>>>>> all | 346 346
>>>>>
>>>>> One-sided tests:
>>>>> Ho: median of DA_T_1 = 0 vs.
>>>>> Ha: median of DA_T_1 > 0
>>>>> Pr(#positive >= 221) =
>>>>> Binomial(n = 346, x >= 221, p = 0.5) = 0.0000
>>>>
>>>> The p-value is the last number, so in your case 0.0000. The stuff
>>>> before the p-value tells you how it is computed: it is based on the
>>>> binomial distribution, and in particular it is the chance of observing
>>>> 221 successes or more in 346 trials when the chance of success at each
>>>> trial is .5. For this test this chance is the p-value, and it is very
>>>> small, less than 0.00005. If you type in Stata -di binomialtail(346,
>>>> 221, 0.5)- you will see that this chance is 1.381e-07, i.e.
>>>> 0.0000001381.
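
The binomial tail probability described above can be cross-checked outside Stata. This is a minimal sketch in Python, assuming only the counts from the sign test output (221 positive values out of 346) and a success probability of 0.5 under the null. It is an independent recomputation with exact integer arithmetic, not a description of how Stata's -binomialtail()- is implemented.

```python
from math import comb

# Trials and observed positives, taken from the signtest output.
n, k = 346, 221

# P(X >= k) for X ~ Binomial(n, 0.5): count the equally likely outcomes
# with at least k successes, then divide by the total number, 2**n.
tail_count = sum(comb(n, j) for j in range(k, n + 1))
p_value = tail_count / 2**n

print(f"one-sided p-value = {p_value:.3e}")
```

The printed value should be of the order of 1e-07, in line with the figure reported from -binomialtail()-, and far below the 0.00005 at which Stata's four-decimal display would round up to 0.0001.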
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/