Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Can Spearman's rho be used to measure of the degree of association between two binary variables ?
From: Richard Williams <[email protected]>
To: [email protected], [email protected]
Date: Mon, 21 May 2012 07:55:05 -0500
At 01:56 AM 5/21/2012, Maarten Buis wrote:
> On Mon, May 21, 2012 at 12:19 AM, Marcos Vinicius wrote:
>> I was conducting a multicollinearity diagnostic analysis for a
>> logistic regression using Spearman correlation and VIF. Important
>> detail: All the covariates are binary variables.
>
> Multicollinearity is never a problem, see e.g.:
> <http://www.stata.com/statalist/archive/2010-07/msg00675.html>, so
> there is nothing to diagnose. If you want to inspect the association
> between binary covariates I would look at a table of odds ratios.
>
> -- Maarten
I would have to disagree with that a bit. Sometimes multicollinearity
might reflect a mistake on the researcher's part. For example, your
model includes education and income, and then you decide to include an
SES measure you find at the end of the codebook. If SES was computed
using income and education, you may have extreme or even perfect
multicollinearity.
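To make the SES example concrete, here is a minimal Python sketch (not
Stata). The data and the rule "SES = education + income" are made up
for illustration. It uses the standard two-predictor formula for R^2
to show that regressing such an SES score on its own components gives
R^2 = 1, which means its VIF, 1 / (1 - R^2), is unbounded.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def r2_two_predictors(y, x1, x2):
    """R^2 from regressing y on x1 and x2, computed from correlations."""
    ry1, ry2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)


# Hypothetical data: years of education and income (in thousands).
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [20, 35, 30, 40, 38, 60, 55, 80]

# Assumed construction of the composite: SES is an exact linear
# function of the two variables already in the model.
ses = [e + i for e, i in zip(education, income)]

r2_ses = r2_two_predictors(ses, education, income)
print("corr(education, income):", round(pearson(education, income), 3))
print("R^2 of SES on education and income:", r2_ses)
```

Because the R^2 is 1 up to rounding error, a model containing all
three variables cannot separate their effects; in practice software
will typically drop one of them or report enormous standard errors.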
Or, suppose you have a categorical variable, and you create dummies
out of it. If some categories have extremely small Ns (e.g. 2 cases)
you will get near-perfect collinearity. You may have to combine
categories or else drop some cases.
Suppose, too, that you have several items that basically measure the
same concept. You may be better off creating a scale from the items
or constraining them all to have the same effects.
I don't think I have ever seen it happen with Stata, but there might
be situations where multicollinearity makes it difficult for the model
to converge. If so, doing things like centering a variable before you
square it might help.
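As a small illustration of the centering point, the Python sketch
below (not Stata; the values of x are made up) compares the
correlation between a variable and its square before and after
subtracting the mean. For x = 1, ..., 10 the raw correlation is about
0.97, while for the centered variable, which is symmetric around
zero here, it is essentially zero.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical predictor taking the values 1 through 10.
x = [float(v) for v in range(1, 11)]
x_sq = [v ** 2 for v in x]

# Center first, then square the centered variable.
mean_x = sum(x) / len(x)
xc = [v - mean_x for v in x]
xc_sq = [v ** 2 for v in xc]

r_raw = pearson(x, x_sq)
r_centered = pearson(xc, xc_sq)
print("corr(x, x^2) without centering:", round(r_raw, 3))
print("corr(x, x^2) with centering:   ", round(r_centered, 3))
```

With skewed data the centered correlation will not be exactly zero,
but it is usually much smaller than the uncentered one.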
If you are still at the design stage of the study and are worried
about multicollinearity, you may wish to collect a larger sample, as
larger samples reduce the standard errors.
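The trade-off between collinearity and sample size can be sketched
with the textbook OLS expression for the variance of a slope,
Var(b_j) = sigma^2 / ((n - 1) * s_j^2) * VIF_j, where
VIF_j = 1 / (1 - R_j^2). This is the linear-regression formula, used
here only as an approximation to the logic for a logistic model, and
every number below is hypothetical.

```python
import math


def vif(r2_j):
    """Variance inflation factor for a predictor whose R^2 on the
    other predictors is r2_j."""
    return 1.0 / (1.0 - r2_j)


def slope_se(sigma, n, sd_x, r2_j):
    """OLS standard error of a slope: residual SD sigma, sample size n,
    predictor SD sd_x, and R^2 of the predictor on the others r2_j."""
    return math.sqrt(sigma ** 2 / ((n - 1) * sd_x ** 2) * vif(r2_j))


# Hypothetical setting: a highly collinear predictor (R_j^2 = 0.9).
se_small = slope_se(sigma=1.0, n=101, sd_x=1.0, r2_j=0.9)
se_large = slope_se(sigma=1.0, n=401, sd_x=1.0, r2_j=0.9)

print("VIF:", round(vif(0.9), 2))
print("SE with n = 101:", round(se_small, 4))
print("SE with n = 401:", round(se_large, 4))
print("ratio:", round(se_small / se_large, 4))
```

Quadrupling n - 1 halves the standard error, so a larger sample can
offset part of the inflation caused by correlated predictors.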
I do think the problem is exaggerated. But, the researcher should be
aware that they may have done something stupid, that there may be
better ways to set the problem up, and that they may be able to avoid
the problem in the first place when they design their study.
Also, I would discourage simply dropping variables that seem to be
causing you problems, as that could lead to specification error,
which may be an even worse problem.
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam