At 16:37 18/03/04 +1100, Garry Anderson wrote:
Hi,
I am enquiring if a more appropriate method could be used please to
calculate the 95% CI of the difference between two proportions in the
-csi- command?
At the moment it is possible for the upper bound of the confidence
interval of the difference between two proportions to be greater than
1.0. I realize that the approximation that is used is not appropriate for
small sample sizes, however I think that reporting of results that are
impossible should be avoided.
One possibility is to use the -somersd- package, downloadable (complete
with a .pdf manual) from SSC, using its -transform()- option. The
difference between 2 proportions is a special case of Somers' D, and the
-somersd- package offers a choice of transformations appropriate for
Somers' D, notably the hyperbolic arctangent (or z) transformation or the
arcsine transformation. If -diseased- and -exposed- are 2 binary (0,1)
variables indicating disease and exposure, respectively, then Garry might type
    somersd exposed diseased, tr(z)
or, alternatively,
    somersd exposed diseased, tr(asin)
and get a confidence interval for the difference between the proportion of
exposed individuals with the disease and the proportion of unexposed
individuals with the disease, using a normalizing and variance-stabilizing
transformation.
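
To see why a transformation helps, here is a rough arithmetic sketch in Python (not Stata). It assumes the untransformed interval is the usual normal-approximation (Wald) interval, and it applies the z-transformation with a simple delta-method standard error. That standard error is an assumption made for illustration only; it is not the variance formula that -somersd- itself uses, so the numbers will not match -somersd- output. The table is the one from Garry's example, in the order exposed cases, unexposed cases, exposed noncases, unexposed noncases.

```python
import math

# Garry's 2x2 table: exposed cases, unexposed cases,
# exposed noncases, unexposed noncases.
a, b, c, d = 5, 1, 0, 4

p1 = a / (a + c)          # proportion diseased among the exposed
p0 = b / (b + d)          # proportion diseased among the unexposed
diff = p1 - p0            # difference between the two proportions

# Wald standard error of the difference (assumed approximation).
se = math.sqrt(p1 * (1 - p1) / (a + c) + p0 * (1 - p0) / (b + d))
z975 = 1.959964           # 97.5th percentile of the standard normal

# Untransformed interval: nothing stops it leaving [-1, 1].
wald = (diff - z975 * se, diff + z975 * se)

# Hyperbolic arctangent (z) transformation with a delta-method
# standard error: d/dx atanh(x) = 1 / (1 - x^2).
zeta = math.atanh(diff)
se_zeta = se / (1 - diff ** 2)
ztrans = (math.tanh(zeta - z975 * se_zeta),
          math.tanh(zeta + z975 * se_zeta))

print("difference:", round(diff, 3))
print("Wald 95% CI:", tuple(round(x, 3) for x in wald))
print("z-transformed 95% CI:", tuple(round(x, 3) for x in ztrans))
```

Because the limits are computed on the transformed scale and then mapped back with tanh, they cannot fall outside the range from -1 to 1, whereas the untransformed upper limit here exceeds 1.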
However, it should be stressed that Garry's example contains a zero cell
(for exposed noncases), so one of the two proportions is exactly zero or
one. With so small a sample, a normalizing or variance-stabilizing
transformation may itself be inappropriate. In such circumstances, it
might be better to use the -exactcci- package to define a conservative
confidence interval for the odds ratio, which may have an infinite upper
limit or a zero lower limit. If Garry uses -findit- to find and install
the -exactcci- package and types
    exactcci 5 1 0 4, exact
then the so-called "exact" confidence interval is generated. (Note,
however, that this confidence interval is conservative, not exact. It is
called "exact" because it uses the exact hypergeometric distribution to
calculate conservative confidence limits.)
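
The idea behind such a conservative interval can be sketched in Python (not Stata) as follows. This is a minimal illustration that assumes the limits are found by inverting the tails of the conditional (noncentral hypergeometric) distribution of the exposed-case count given the table margins. Whether -exactcci- computes its limits in exactly this way is an assumption, and the function name upper_tail is hypothetical.

```python
from math import comb, inf

def upper_tail(psi, a, n1, n2, m1):
    """P(A >= a) for the number of exposed cases A, conditional on the
    margins, when the true odds ratio is psi (hypothetical helper)."""
    k_min, k_max = max(0, m1 - n2), min(n1, m1)
    weights = {k: comb(n1, k) * comb(n2, m1 - k) * psi ** k
               for k in range(k_min, k_max + 1)}
    total = sum(weights.values())
    return sum(w for k, w in weights.items() if k >= a) / total

# Garry's table: exposed cases, unexposed cases,
# exposed noncases, unexposed noncases.
a, b, c, d = 5, 1, 0, 4
n1, n2, m1 = a + c, b + d, a + b   # exposed, unexposed, total cases
alpha = 0.05

# Lower limit: the odds ratio at which P(A >= a) equals alpha / 2,
# found by bisection on the log scale (the tail increases with psi).
lo, hi = 1e-8, 1e8
for _ in range(200):
    mid = (lo * hi) ** 0.5
    if upper_tail(mid, a, n1, n2, m1) < alpha / 2:
        lo = mid
    else:
        hi = mid
lower = (lo * hi) ** 0.5

# Upper limit: if a is already the largest value the margins allow,
# P(A <= a) = 1 for every odds ratio, so no finite upper limit exists.
upper = inf if a == min(n1, m1) else None  # finite case not sketched here

print("lower limit:", lower)
print("upper limit:", upper)
```

With a zero cell for exposed noncases, the observed count sits at the edge of its possible range, which is why the upper limit for the odds ratio is infinite in this example.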
I hope this helps.
Roger
--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom
Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: [email protected]
Website: http://www.kcl-phs.org.uk/rogernewson
Opinions expressed are those of the author, not the institution.