These results are not unexpected.
In the first example, the denominator of Kendall's tau-a, as defined in Equations (9), (10), (11), (12), (13), (14) and (15) of Newson (2006), will be equal to 4*4=16, because it is the sample mean of 10 values of 16, one for each of the 10 clusters of 4 observations each. The numerator of Kendall's tau-a will be 4*3=12 (because the numerator terms comparing the same observation with itself are zero), and will once again be the sample mean of 10 values of 12, one for each cluster. Therefore, the standard errors, and the covariance, of both the numerator and the denominator of Kendall's tau-a will be zero, and the estimate of Kendall's tau-a will be 12/16=0.75. By the way, Al need not have specified the wstrata(id) option, because funtype(wcluster) ensures that comparisons are limited to within-cluster comparisons.
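The arithmetic for the first example can be checked with a short Python sketch. It is not the somersd implementation; it only assumes that every ordered pair of observations within a cluster is counted, including the pair of an observation with itself. The names within_cluster_terms, clusters and tau_a are illustrative.

```python
import random
from statistics import mean, pvariance


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def within_cluster_terms(values):
    """Return (numerator, denominator) for one cluster, comparing u with u.

    ASSUMPTION: all ordered pairs (i, j) inside the cluster are counted,
    including i == j, whose numerator term is zero.
    """
    numerator = sum(sign(a - b) * sign(a - b) for a in values for b in values)
    denominator = len(values) ** 2
    return numerator, denominator


random.seed(1)
# 40 distinct values split into 10 clusters of 4, mimicking Al's data.
u = random.sample(range(1000), 40)
clusters = [u[k:k + 4] for k in range(0, 40, 4)]

terms = [within_cluster_terms(c) for c in clusters]
numerators = [n for n, _ in terms]
denominators = [d for _, d in terms]

# Ratio of the mean numerator to the mean denominator over clusters.
tau_a = mean(numerators) / mean(denominators)
print(tau_a)  # 0.75
```

Every cluster contributes the same numerator (12) and the same denominator (16), so their variances across clusters are zero, which is consistent with the missing standard errors in the output.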
In the second example, Al has specified no funtype() option, causing the funtype() option to default to funtype(bcluster). As Al has also specified wstrata(id), Kendall's tau-a will be limited to comparisons that are both between clusters (because of funtype(bcluster)) and within clusters (because wstrata(id) and cluster(id) use the same variable, so that each stratum is a single cluster). As no comparisons are both between clusters and within clusters, both the numerator and the denominator of Kendall's tau-a will be zero.
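The empty comparison set in the second example can be illustrated by counting pairs directly. This is a Python sketch, not somersd code, and count_pairs is a hypothetical helper. The contrast with Al's case D rests on an assumption that is not stated above: that, when no cluster() option is given, each observation is treated as its own cluster.

```python
from itertools import product

# 40 observations: 10 ids, 4 observations per id, as in Al's data.
ids = [i for i in range(10) for _ in range(4)]


def count_pairs(cluster, stratum):
    """Count ordered pairs in different clusters but in the same stratum."""
    n = len(cluster)
    return sum(
        1
        for i, j in product(range(n), repeat=2)
        if cluster[i] != cluster[j] and stratum[i] == stratum[j]
    )


# Case B: cluster(id) wstrata(id) with the default funtype(bcluster).
pairs_b = count_pairs(cluster=ids, stratum=ids)

# Case D: wstrata(id) only. ASSUMPTION: with no cluster() option, each
# observation forms its own cluster.
pairs_d = count_pairs(cluster=list(range(40)), stratum=ids)

print(pairs_b)  # 0
print(pairs_d)  # 120
```

With no eligible pairs in case B, there is nothing to estimate. Under the stated assumption, case D has 4*3=12 eligible pairs in each of the 10 strata, all of them concordant when u is compared with itself, which would agree with the reported value of 1.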
I hope this helps.
Best wishes
Roger
References
Newson R. Confidence intervals for rank statistics: Somers' D and extensions. The Stata Journal 2006; 6(3): 309-334. Download pre-publication draft from
http://www.imperial.ac.uk/nhli/r.newson/papers.htm
Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Feiveson, Alan H. (JSC-SK311)
Sent: 19 March 2009 14:15
To: [email protected]
Subject: st: funny results with -cluster- in -somersd-
Hi - I've been using -somersd- to make within-stratum comparisons, and I also use the -cluster- option to account for repeated observations for each value of id. In this example, there are 10 values of id, each replicated 4 times. The data also contain another variable u, which is a random uniform number, so all the values of u are distinct. Therefore, I would expect that if I make comparisons within strata, I should get tau_uu = 1. I thought that to account for the clusters in within-stratum comparisons I had to use -funtype(wcluster)-. But the results are strange. Here is what I tried:
A)
. somersd u u , funtype(wcluster) cluster(id) wstrata(id) taua
Within-cluster Kendall's tau-a with variable: u
Transformation: Untransformed
Within strata defined by: id
Valid observations: 40
Number of clusters: 10
Symmetric 95% CI
(Std. Err. adjusted for 10 clusters in id)
------------------------------------------------------------------------------
             |              Jackknife
           u |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           u |        .75          .       .       .            .           .
           u |        .75          .       .       .            .           .
------------------------------------------------------------------------------
B)
. somersd u u , cluster(id) wstrata(id) taua
Kendall's tau-a with variable: u
Transformation: Untransformed
Within strata defined by: id
Valid observations: 40
Number of clusters: 10
Symmetric 95% CI
(Std. Err. adjusted for 10 clusters in id)
------------------------------------------------------------------------------
             |              Jackknife
           u |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           u |  (dropped)
           u |  (dropped)
------------------------------------------------------------------------------
C)
. somersd u u , funtype(wcluster) wstrata(id) taua
Within-cluster Kendall's tau-a with variable: u
Transformation: Untransformed
Within strata defined by: id
Valid observations: 40
Symmetric 95% CI
------------------------------------------------------------------------------
             |              Jackknife
           u |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           u |  (dropped)
           u |  (dropped)
------------------------------------------------------------------------------
D)
. somersd u u , wstrata(id) taua
Kendall's tau-a with variable: u
Transformation: Untransformed
Within strata defined by: id
Valid observations: 40
Symmetric 95% CI
------------------------------------------------------------------------------
             |              Jackknife
           u |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           u |          1          .       .       .            .           .
           u |          1          .       .       .            .           .
------------------------------------------------------------------------------
In case A, I get tau_uu = 0.75 (not 1).
In cases B and C (if either funtype or cluster is missing), I get zilch.
In case D, without any reference to clusters, I get the "correct" value. But I want to account for the clusters. What am I missing?
Thanks
Al Feiveson
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/