Re: st: Can Spearman's rho be used to measure of the degree of association between two binary variables ?
From
Dirk Enzmann <[email protected]>
To
[email protected]
Subject
Re: st: Can Spearman's rho be used to measure of the degree of association between two binary variables ?
Date
Mon, 21 May 2012 15:47:44 +0200
Marcos Vinicius asked whether Spearman's rho can be used to measure the
degree of association between two binary variables; see:
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-836.html
to which he received very helpful answers (perhaps more than he
initially wanted to know):
- from Maarten Buis and Richard Williams
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-855.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-859.html
who discuss whether multicollinearity should be regarded as a problem at all
- and from Cameron McIntosh, Richard Stoll, and Roger Newson
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-837.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-838.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-858.html
who point to alternative measures of association for two binary
variables.
Let me add two comments on the use of tetrachoric (or polychoric)
correlation coefficients vs. Pearson (or Spearman) correlation
coefficients in the case of two binary variables:
1) Tetrachoric and Pearson correlations answer different questions: The
first estimates the correlation of two *latent* (quasi-continuous)
variables "behind" the observed dichotomous variables, thus assuming
that both variables are artificially dichotomous, whereas the Pearson
coefficient shows how the *observed* values correlate, thus assuming
naturally dichotomous variables.
2) Ledesma et al. (2011) (see:
http://openjournal.konradlorenz.edu.co/index.php/rlpsi/article/viewFile/459/463
), cited by Cameron, write: "Stata gives their users a function based on a
work by Edwards and Edwards (1984), that is basically 'a very rough
approximation' and is consequently unsuitable for many applications ..."
(p. 182). This is not quite correct: Stata computes the tetrachoric
correlation "... by using the Edwards and Edwards (1984) noniterative
estimator as the *initial* value" (see: [R] Base Reference Manual,
p. 2196; emphasis mine). The rough approximation thus serves only as a
starting value, and the coefficient calculated by Stata is at least as
precise as the coefficient calculated by Ledesma et al.'s
Vista-Tetrachor program:
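* --- Example from Ledesma et al., p. 183 ------------------------
* start with an empty dataset so that -input- does not fail
clear
* cell counts of the 2x2 table
input y x ncases
1 1 203
1 0 186
0 1 167
0 0 374
end
* one observation per case
expand ncases
tetrachoric x y
* --- End of Stata example ---------------------------------------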
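The first point can be illustrated on the same 2x2 counts. What follows is only a minimal sketch, not part of Ledesma et al.'s example. It assumes that comparing the three coefficients side by side is of interest. For two 0/1 variables, Pearson's r (the phi coefficient) and Spearman's rho should coincide, because ranking a binary variable is just a linear rescaling of its values. The tetrachoric coefficient estimates the correlation of the assumed latent variables and will typically be larger in absolute value.

```stata
* --- Sketch: observed vs. latent correlation (same counts as above) ---
clear
input y x ncases
1 1 203
1 0 186
0 1 167
0 0 374
end
expand ncases
* Pearson r of the observed 0/1 values (phi coefficient)
correlate x y
* Spearman's rho; expected to equal Pearson's r for two binary variables
spearman x y
* correlation of the assumed latent bivariate normal variables
tetrachoric x y
* --- End of sketch -----------------------------------------------------
```

Which coefficient is appropriate depends on whether the variables are regarded as naturally or as artificially dichotomous.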
Dirk
========================================
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Rothenbaumchaussee 33
D-20148 Hamburg
Germany
phone: +49-(0)40-42838.7498 (office)
+49-(0)40-42838.4591 (Mrs Billon)
fax: +49-(0)40-42838.2344
email: [email protected]
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
========================================