Thanks to Kit Baum, there is now a new version of the -somersd- package
available for download from SSC. In Stata, use the -ssc- command to do this.
-somersd- is described below (as on my website) and discussed in Newson
(2002). It calculates confidence intervals for rank statistics, allowing
left-censored and/or right-censored data. The new version adds the
following improvements:
1. -somersd- can now be prefixed by -bootstrap:-, -by:-, -jackknife:-,
-statsby:-, and -svy jackknife:-.
2. There is now a -funtype()- (functional type) option, allowing the user
to specify within-cluster, between-cluster, and overall (or Von Mises)
versions of Somers' D and Kendall's tau-a. Previously, only between-cluster
versions were available. The parameters behind the Wilcoxon ranksum and
signrank tests are between-cluster versions of Somers' D, the parameter
behind the sign test is a within-cluster Somers' D, and the Gini
coefficient is a Von Mises Somers' D.
3. Cluster frequency weights can be specified using a
-cfweight(expression)- option. These cluster frequency weights must be the
same for all observations in a cluster, and imply that each cluster
represents a number of duplicate clusters equal to its frequency weight.
(The manual -somersd.pdf- has an example which demonstrates the usefulness
of cluster frequency weights when estimating confidence intervals for Gini
coefficients from a dataset with one observation per income group.)
4. There are now options -wstrata(varlist)- and -bstrata(varlist)-,
allowing the user to specify stratified versions of Somers' D and Kendall's
tau-a. These stratified versions measure correlation within strata
specified by the -wstrata()- variables and/or between strata specified by
the -bstrata()- variables. For instance, we can measure correlation between
an exposure variable and an outcome variable within strata defined by a
categorical confounder variable by typing
If there are multiple confounders, then they can be used to define a
quantitative propensity score for the exposure based on the confounders,
and the propensity score can be grouped. If the propensity score is named
-propscore-, then we might type
and show that the exposure not only predicts the outcome within propensity
groups, but also that, within propensity groups, the exposure predicts the
outcome better than the propensity score predicts the outcome.
5. The online help, and the .pdf manual -somersd.pdf-, have been updated to
describe and demonstrate these new features.
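
As a minimal sketch of the installation step and of the stratified usage
in point 4, the following lines show how such commands might look. The
variable names exposure, outcome, confounder, propscore and propgroup are
hypothetical. The choice of five propensity groups, the -taua- option and
the -lincom- comparison are assumptions made for illustration, not
necessarily the commands intended in the text.

```stata
* Install or update the package from SSC
ssc install somersd, replace

* Somers' D of the outcome with respect to the exposure, restricted to
* pairs of observations in the same confounder stratum
* (exposure, outcome and confounder are hypothetical variable names)
somersd exposure outcome, wstrata(confounder)

* Group a hypothetical propensity score into fifths (assumed grouping)
xtile propgroup = propscore, nquantiles(5)

* Kendall's tau-a of the outcome with the exposure and with the
* propensity score, both measured within propensity groups
somersd outcome exposure propscore, taua wstrata(propgroup)

* Difference between the two within-group tau-a parameters; a positive
* difference would suggest that the exposure is the better predictor
lincom exposure - propscore
```

Kendall's tau-a is used in the second command because it is symmetric in
its two variables, so the two estimates are directly comparable when the
outcome is listed first.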
In the present version of -somersd-, the Mata code is less efficient than
it probably could be, given the power of Mata. I plan to improve this in
the next version.
Best wishes
Roger
References
Newson R. 2002. Parameters behind "nonparametric" statistics: Kendall's
tau, Somers' D and median differences. The Stata Journal 2(1): 45-64. A
pre-publication draft is downloadable from my website at http://phs.kcl.ac.uk/rogernewson.
------------------------------------------------------------------------------------
package somersd from http://www.kcl-phs.org.uk/rogernewson/stata9
------------------------------------------------------------------------------------
TITLE
somersd: Kendall's tau-a, Somers' D and median differences
DESCRIPTION/AUTHOR(S)
The somersd package contains the programs somersd and cendif, which
calculate confidence intervals for the parameters behind rank or
"nonparametric" statistics. The program somersd calculates confidence
intervals for Kendall's tau-a or Somers' D, and stores the estimates and
their covariance matrix as estimation results. The program cendif
calculates confidence intervals for Hodges-Lehmann median differences (or
other percentile differences) between two groups. Kendall's tau-a is a
difference between probabilities of concordance and discordance, and
measures rank order correlation. Somers' D is a parameter equal to zero
under the null hypothesis tested by the Wilcoxon or Mann-Whitney ranksum
test, and can be used to calculate confidence intervals for Harrell's c
index, for areas under receiver operating characteristic (ROC) curves, and
for differences between Harrell's c indices or ROC areas. The
Hodges-Lehmann median difference can be defined in terms of Somers' D, and
is also zero under the null hypothesis tested by the ranksum test. Full
documentation of the two programs (including methods and formulas) can be
found in the ancillary files somersd.pdf and cendif.pdf, which can be
viewed using the Adobe Acrobat Reader.
Author: Roger Newson
Distribution-date: 05 August 2005
Stata-version: 9
INSTALLATION FILES
somersd.ado
lsomersd.mlib
somersd.hlp
somers_p.ado
cendif.ado
cendif.hlp
tidot.mata
tidotby.mata
_u2jackpseud.mata
_v2jackpseud.mata
mu2v2jackestvar.mata
ANCILLARY FILES
somersd.pdf
cendif.pdf
------------------------------------------------------------------------------------
--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
Division of Asthma, Allergy and Lung Biology
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom
Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: [email protected]
Website: http://phs.kcl.ac.uk/rogernewson/
Opinions expressed are those of the author, not the institution.