Re: st: Collinearity in svy
At 08:21 AM 5/2/2008, Simon, Alan (CDC/CCHIS/NCHS) wrote:
> The website essentially suggests using each variable as a dependent
> variable in a separate regression, using all other variables as
> independent variables, and then using the following command:
> display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))
> to calculate the variance inflation factor.  However, this only seems to
> work if the dependent variable is continuous and the regression is OLS.
> Is there a way to measure the variance inflation factor for categorical
> variables in a complex survey design?  Or is there a better way to
> approach this problem?
Personally, I see no problem with that approach.  Multicollinearity is a 
problem with the right hand side of the model, i.e. the Xs.  It 
doesn't matter whether Y itself will be analyzed via OLS regression, 
logistic regression, or whatever.  For example, in a non-svy setting, 
if y were a dichotomy that you planned to analyze via logistic 
regression, it is nonetheless fine to do something like
regress y x1 x2 x3
vif
You are not interested in the coefficients from the regression; you 
are just interested in the collinearity diagnostics from vif.
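The separate-regressions approach described in the question can also be 
written as a loop.  What follows is only a rough sketch for the non-svy 
case: the auto dataset and the three predictors (weight, length, 
displacement) are placeholders I picked for illustration, not anything 
from the original question, and price is an arbitrary outcome used only 
for the cross-check at the end.

```stata
* Illustrative only: auto.dta and these predictors stand in for your own data.
sysuse auto, clear
local xvars weight length displacement

* Regress each predictor on the remaining ones; report tolerance and VIF.
foreach v of local xvars {
    local others : list xvars - v
    quietly regress `v' `others'
    display "`v': tolerance = " 1 - e(r2) "  VIF = " 1/(1 - e(r2))
}

* Cross-check against the built-in diagnostic. The outcome (price) is
* arbitrary here, because the VIFs depend only on the right-hand-side Xs.
quietly regress price `xvars'
estat vif
```

The VIFs from the loop should agree with those from estat vif, as long 
as both use the same observations (i.e. no cases are dropped for 
missing data in one but not the other).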
One caveat: I am not sure whether the use of svy somehow invalidates or 
complicates the usual collinearity diagnostics.  But at the same 
time, it is not like these diagnostics have to be accurate down to 12 
decimal places.  You usually just want to get a ballpark estimate of 
whether or not collinearity is a problem in your data.
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam