Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Steven Samuels <sjsamuels@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: multicollinearity with survey data |
Date | Wed, 23 Feb 2011 17:20:55 -0500 |
Rachel,

Your advice about collinearity is incorrect.

1. A test for zero correlation among predictors has no place in a study of collinearity. Natural correlation among predictors is expected. Perfectly collinear variables are those with a multiple correlation R-square of 1.0 when regressed on the others; these are the ones that regression programs toss out. Rather than "test" for multicollinearity (and I shouldn't have used that phrase), the proper approach is to evaluate how bad it is. The measures for doing so are the variance inflation factor (VIF) for each predictor, or equivalently, the multiple R for predicting that variable from the others.

2. Contrary to your belief, adding collinear variables can improve a model. Indeed, if the goal is simply to get the best possible prediction of Y, then collinearity might be more or less irrelevant. The real problem caused by high multicollinearity is that it makes it difficult to interpret individual regression coefficients. For a treatment, see any text on multiple regression.

It is impossible to give blanket advice about what to do if high collinearity is found. Certainly dropping the most collinear variable is one option; but what if that is a predictor of interest? There is a large literature on this topic.

Steve

Steven J. Samuels
Consulting Statistician
18 Cantine's Island
Saugerties, NY 12477 USA
Voice: 845-246-0774
Fax: 206-202-4783
sjsamuels@gmail.com

On Feb 23, 2011, at 5:30 AM, rachel grant wrote:

I am not an expert on this, so correct me if I am wrong, Stata Listers! In my models (negative binomial regression) Stata automatically checks for multicollinearity, omits collinear variables, and then tells you it has done so. Multicollinearity just means that variables are highly correlated with each other, so if you want to test for it, run a simple correlation test. Including collinear variables adds no new info to the model.
If you have several variables that are highly correlated with each other, you only need to use one of these in the model.

Rachel

Rachel Grant
Dept. Life Sciences
Open University
UK

On 23 February 2011 05:03, Christine Gourin <cgourin1@jhmi.edu> wrote:
> thank you!
> how do you test for collinearity with survey data, however?
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steven Samuels [sjsamuels@gmail.com]
> Sent: Tuesday, February 22, 2011 1:27 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: multicollinearity with survey data
>
>> On Feb 22, 2011, at 11:55 AM, Christine Gourin wrote:
>>
>> i have a question about how to check for multicollinearity with survey data. the only information I can find about this is at the site
>> http://www.stata.com/support/faqs/res/statalist.html#toask
>>
>> I am using survey data to investigate variables associated with hospital volume (HVH) as the dependent variable.
>> I suspect that teaching status (HOSP_TEACH) is collinear with HVH, as all HVH hospitals are teaching hospitals.
>>
>> I am not sure how to check for multicollinearity in the full model, which is
>>
>> xi: svy: logistic HVH elective i.agecat flap neckdissection i.procedure i.payor radiation HOSP_TEACH i.RACE i.comorbidity
>>
>> when I run this model, stata drops HOSP_TEACH saying it predicts failure perfectly.
>
> This message has nothing to do with multicollinearity. Multicollinearity concerns the correlations of predictors with each other. This message refers to the association of the outcome and one predictor. Tabulating HVH against HOSP_TEACH should show you the problem.
>
> Steve
>
> Steven J. Samuels
> Consulting Statistician
> 18 Cantine's Island
> Saugerties, NY 12477 USA
> Voice: 845-246-0774
> Fax: 206-202-4783
> sjsamuels@gmail.com

--
regards,
Rachel

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
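
Steve's point in the thread is that collinearity is gauged, not tested: regress each predictor on all the others and look at the resulting R-square, or equivalently the VIF, which is 1 / (1 - R-square). The following is a minimal sketch of that calculation in Python with NumPy rather than Stata. The function name `vif` and the simulated variables `x1`, `x2`, `x3` are made up for illustration, and the sketch ignores survey weights and design entirely, so it is not a recipe for the `svy:` model discussed above.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.

    X is an (n, p) array of predictors WITHOUT a constant column.
    Assumes no column is itself constant.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        # R-square of (numerically) 1.0 is perfect collinearity: report inf.
        out[j] = np.inf if np.isclose(r2, 1.0) else 1.0 / (1.0 - r2)
    return out

# Simulated example data (hypothetical, not from the thread).
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)  # nearly collinear with x1
x3 = rng.normal(size=500)             # unrelated to the others

vifs = vif(np.column_stack([x1, x2, x3]))
perfect = vif(np.column_stack([x1, 2.0 * x1]))  # exact linear dependence
print(vifs)
print(perfect)
```

In this sketch `x1` and `x2` get large VIFs while `x3` stays near 1, and the exactly dependent pair is reported as infinite, which corresponds to the case where a regression program drops a variable. Note that the calculation never touches the outcome, which is why the "predicts failure perfectly" message about HVH and HOSP_TEACH is a different issue from collinearity.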