I would use the -somersd- package, downloadable from SSC using the -ssc-
command, to estimate a confidence interval for the difference between
the two probabilities
Pr(Y==1|X==1) and Pr(Y==1|X==0).
If your dataset has one observation per (X,Y)-pair (and therefore
multiple observations per subject), then you can type
  somersd x y, transf(z) tdist cluster(subject) funtype(vonmises)
and this will produce a symmetric confidence interval for the hyperbolic
arctangent (or z-transform) of the difference, and a more interesting
asymmetric confidence interval for the difference itself. (The
difference is denoted "Somers' D" because, for a binary X, Somers' D of
Y with respect to X is the difference between the two probabilities.)
A confidence interval is more informative than a P-value, because a
P-value only tells us how incompatible the data are with a zero
difference, whereas a confidence interval gives a range of possible
differences with which the data ARE compatible.
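If your data are held as counts rather than as one observation per
(X,Y)-pair, something like the following minimal sketch might get them
into the required layout first. It uses the first four subjects from the
table in the original message below, purely for illustration. The
variable names k (times y=1 was observed) and n (times the subject was
in the condition), and the coding of x as 1 for the first condition and
0 for the second, are my assumptions and not part of your dataset. With
only four subjects the interval will be very wide, so treat this as a
demonstration of the data layout, not as an analysis.

```stata
* One-time install of the user-written package from SSC
ssc install somersd

* Hypothetical count data: one row per subject and condition.
* k = times y==1 was observed, n = times the subject was in the condition.
* x = 1 codes the first condition, x = 0 the second (an assumed coding).
clear
input subject x k n
1 1 1 2
1 0 0 1
2 1 1 1
2 0 0 0
3 1 1 1
3 0 1 2
4 1 2 3
4 0 1 2
end

* Expand to one observation per (X,Y)-pair.
* Rows with n == 0 carry no observations, so drop them before expanding.
drop if n == 0
expand n
bysort subject x: generate byte y = (_n <= k)

* Raw proportions with y==1 in each condition, for comparison
tabulate x y, row

* Confidence interval for Pr(Y==1|X==1) - Pr(Y==1|X==0),
* with standard errors clustered by subject
somersd x y, transf(z) tdist cluster(subject) funtype(vonmises)
```

The order of the 0s and 1s within a subject and condition is arbitrary
here, which should not matter, because only the counts enter the
calculation.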
I hope this helps.
Roger
Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Christoph
Vanberg
Sent: 04 July 2007 17:58
To: [email protected]
Subject: st: Chi-square test for differences in a binary outcome
Hello,
I want to test for an effect of a randomly administered treatment on a
binary variable y=0 or 1. My data consists of observations on the same
set of individuals in different conditions, thus the samples are not
independent. Specifically, three treatment conditions are applied to
each individual a random number of times. I want to compare outcomes
between subjects in two of these three conditions. Since the number of
times an individual is in these two conditions is random, it is not
balanced across subjects.
I am looking at data of the following form, where the fractions
represent (times y=1 is observed in condition x) / (times subject is
in condition x).
i:    1    2    3    4    ....  N
x1:   1/2  1/1  1/1  2/3  ....  3/3
x2:   0/1  0/0  1/2  1/2  ....  1/4
As I understand it, McNemar's Chi-square test (mcc in Stata) tests
for treatment effects if you have paired observations, each with one
outcome. That is, it applies to the following type of data, where i
identifies matched pairs, x1 is a dummy indicating y=1 in condition 1
and x2 is a dummy indicating y=1 in condition 2.
i:    1    2    3    4    ....  N
x1:   1    1    1    0    ....  1
x2:   0    0    1    1    ....  0
This is pretty close to what I want to do, but the test does not apply
to my situation, where the same individual can be repeatedly observed
in the same condition.
Does anyone have a suggestion as to what type of nonparametric test
might be appropriate in such a case?
Thank you,
Christoph
--
_______________________________________________________
Christoph Vanberg, Ph.D.
Max Planck Institute of Economics
Strategic Interaction Group
Kahlaische Str. 10, D-07745 Jena, Germany
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/