Re: st: Collinearity in svy
At 08:21 AM 5/2/2008, Simon, Alan (CDC/CCHIS/NCHS) wrote:
The website essentially suggests using each variable as a dependent
variable in a separate regression using all other variables as
independent variables, and then using the following command:
display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))
to calculate the Variance inflation factor. However, this only seems to
work if the dependent variable is continuous and the regression is OLS.
Is there a way to measure the variance inflation factor for categorical
variables in a complex survey design? Or is there a better way to
approach this problem?
Personally, I see no problem with that. Multicollinearity is a
problem with the right hand side of the model, i.e. the Xs. It
doesn't matter whether Y itself will be analyzed via ols regression,
logistic regression, or whatever. For example, in a non-svy setting,
if y was a dichotomy that you will be analyzing via logistic
regression, it is nonetheless fine to do something like
regress y x1 x2 x3
vif
You are not interested in the coefficients from the regression; you
are just interested in the collinearity diagnostics from vif.
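As a rough illustration of the point above, here is a minimal sketch using Stata's bundled auto dataset. The dataset and the variables (foreign as the 0/1 outcome; weight, length, and mpg as the Xs) are stand-ins chosen for the example, not anything from the original question. It uses estat vif, the current name for the vif command, and then reproduces one VIF by hand with an auxiliary regression.

```stata
* Illustrative sketch only: dataset and variable choices are assumptions.
sysuse auto, clear

* foreign is a 0/1 outcome that would eventually be modeled with logit.
* OLS is fit here only to get collinearity diagnostics for the Xs.
regress foreign weight length mpg
estat vif

* The same quantity by hand for one predictor (weight):
* regress it on the other predictors and use that R-squared.
regress weight length mpg
display "tolerance = " 1 - e(r2) "  VIF = " 1/(1 - e(r2))
```

The hand-computed VIF for weight should agree with the weight row reported by estat vif, since both depend only on the right-hand-side variables and not on y.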
One caveat: I am not sure if the use of svy somehow invalidates or
complicates the usual collinearity diagnostics. But at the same
time, it is not like these diagnostics have to be accurate down to 12
decimal places. You usually just want to get a ballpark estimate of
whether or not collinearity is a problem in your data.
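For the survey case, one possible ballpark approach is to run the auxiliary regressions with the svy prefix and read off e(r2) for each predictor. The sketch below assumes the nhanes2 example dataset, assumes it loads with its survey design already declared (otherwise run svyset first), and picks age, height, and weight purely for illustration. Whether the design-weighted R-squared is the right basis for a VIF is exactly the open question in the caveat above, so treat the numbers as rough guides only.

```stata
* Illustrative sketch only: dataset, design settings, and variables are assumptions.
webuse nhanes2, clear
svyset                      // display the declared design; set it here if none

local xvars age height weight

* One auxiliary regression per predictor, each on the remaining predictors.
foreach v of local xvars {
    local others : list xvars - v
    quietly svy: regress `v' `others'
    display "`v': tolerance = " 1 - e(r2) "  VIF = " 1/(1 - e(r2))
}
```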
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam