Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: collinearity in categorical variables
Date: Fri, 26 Apr 2013 14:29:57 +0100

I think you're mixing quotations from two or three debates that barely overlap. Whether polychoric or Spearman correlations are better suited to categorical data doesn't seem related to collinearity in regression-type models. Even if (say) polychoric correlations appealed more, how would that affect your choice of predictors in the latter kind of model?

I tend to look directly at correlation and scatter plot matrices and to think substantively about relationships. That doesn't rule out specific tools being helpful.

Nick
njcoxstata@gmail.com

On 26 April 2013 13:58, Mitchell F. Berman <mfb1@columbia.edu> wrote:
> Thank you for the reply. Yes, I see that for a single categorical variable
> broken into dummy variables, collinearity between the dummy variables would
> be zero.
> But my question concerns correlation between related, similar, categorical
> variables.
>
> If I have multiple similar categorical variables, for example: homebound,
> uses a walker, home-health aide, lives in nursing home, these categorical
> variables will move together through the data: they won't be identical for
> all patients, but correlated.
>
> People mention standard VIF (which I know how to do), but the more thorough
> answers imply this is not correct.
>
> This link suggests perturb (a module available for Stata, R, and SPSS) or
> polychoric correlation
> http://stats.stackexchange.com/questions/35233/how-to-test-for-and-remedy-multicollinearity-in-optimal-scaling-ordinal-regressi
>
> This link from talkstats suggests that polychoric correlations (available in
> R) are preferable, because correlations calculated using Pearson
> product-moment are invalid for categorical data.
> http://www.talkstats.com/showthread.php/22996-Collinearity-Among-Categorical-Variables-in-Regression
>
> Someone else suggested the Spearman correlation coefficient
> http://www.statisticsforums.com/showthread.php?t=802
>
> factor analysis
> http://www.talkstats.com/showthread.php/13264-Collinearity-in-Logistic-Regression
>
> This is beyond my level of theoretical understanding. I was trying to get a
> sense of what the experts on the Stata List server use.
>
> Thank you for any additional input.
>
> Mitchell
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
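
Editorial illustration, not part of the original exchange: a minimal sketch of "looking directly at correlations" for a set of related 0/1 indicators, alongside the standard VIF mentioned in the question. It is written in plain Python rather than Stata, the data are simulated, and the variable names (homebound, walker, aide, nursing_home) and the shared "frailty" score driving them are assumptions made only for the example. The VIFs are taken as the diagonal of the inverse of the predictors' correlation matrix, which is the usual definition for a linear model.

```python
import random


def pearson(x, y):
    """Pearson correlation; for two 0/1 variables this is the phi coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corr_matrix(cols):
    """Pairwise correlation matrix for a list of equal-length columns."""
    return [[pearson(a, b) for b in cols] for a in cols]


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def vifs(cols):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(corr_matrix(cols))
    return [inv[j][j] for j in range(len(cols))]


def simulate(n=500, seed=2013):
    """Hypothetical data: four indicators driven by one shared frailty score."""
    rng = random.Random(seed)
    cuts = {"homebound": 0.5, "walker": 0.4, "aide": 0.6, "nursing_home": 0.7}
    data = {name: [] for name in cuts}
    for _ in range(n):
        frailty = rng.random()
        for name, cut in cuts.items():
            # Each indicator gets its own noise, so they overlap but differ.
            data[name].append(1 if frailty + rng.gauss(0, 0.3) > cut else 0)
    return data


if __name__ == "__main__":
    data = simulate()
    names, cols = list(data), list(data.values())
    for name, row in zip(names, corr_matrix(cols)):
        print(f"{name:>13}", " ".join(f"{r:6.2f}" for r in row))
    for name, v in zip(names, vifs(cols)):
        print(f"VIF {name}: {v:.2f}")
```

In Stata itself the corresponding checks are correlate and graph matrix on the predictors, and estat vif after regress. How large a correlation or VIF is worth worrying about remains a matter of substantive judgment, as the reply above suggests.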