I've made the mistake of trying to update my course notes on
multicollinearity, and I find myself bogged down when it comes to
eigenvalues and condition indices. Specifically, I've been comparing the
results produced by the -collin- option on SPSS's -regression- command
with the results produced by -collin- in Stata (an add-on module;
type -findit collin-). What I am finding:
* If I first center the variables in SPSS (i.e., subtract the mean from each
variable), both Stata and SPSS produce identical results.
* If I don't center, SPSS gives me very different results. (Stata's results
don't change whether I center or not.)
* If I am reading the ado file right, Stata's -collin- computes eigenvalues,
etc. from the correlation matrix of the X's. SPSS, on the other hand,
presents "the eigenvalues of the scaled and uncentered cross-products matrix"
(see the sketch after this list).
* In SPSS's favor, it computes the same condition number as William Greene
does in his analysis of the Longley data (Econometric Analysis, 4th
Edition, p. 258).
* But in Stata -collin-'s favor, it seems terribly counterintuitive to me
that adding or subtracting a constant from a variable should change my
conclusions about multicollinearity.
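
In case it helps pin down what I think the two packages are doing, here is a
rough Python/numpy sketch of the two eigen-analyses as I understand them. The
toy data and function names are my own illustration, not code from Stata's
-collin- or from SPSS:

import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(10, 2, n)
x2 = 0.9 * x1 + rng.normal(0, 1, n)   # deliberately collinear with x1
x3 = rng.normal(5, 1, n)
X = np.column_stack([x1, x2, x3])

def corr_condition_indices(X):
    # Eigenvalues of the correlation matrix of the X's (what I believe
    # Stata's -collin- does). Centering is implicit in the correlations,
    # so adding a constant to any column cannot change the result.
    R = np.corrcoef(X, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(R))[::-1]
    return np.sqrt(eig[0] / eig)      # condition indices

def scaled_uncentered_condition_indices(X):
    # Eigenvalues of the scaled, uncentered cross-products matrix with the
    # constant included (what the SPSS documentation seems to describe).
    # Each column of [1 X] is scaled to unit length; nothing is centered.
    Z = np.column_stack([np.ones(len(X)), X])
    Z = Z / np.sqrt((Z ** 2).sum(axis=0))
    eig = np.sort(np.linalg.eigvalsh(Z.T @ Z))[::-1]
    return np.sqrt(eig[0] / eig)

print(corr_condition_indices(X))                    # unchanged by centering
print(corr_condition_indices(X - X.mean(axis=0)))
print(scaled_uncentered_condition_indices(X))       # changes when I center
print(scaled_uncentered_condition_indices(X - X.mean(axis=0)))

If I have the algebra right, centering makes the constant column orthogonal
to the X's, so the scaled cross-products matrix turns into the correlation
matrix plus an extra eigenvalue of 1 for the constant, which would explain
why the two packages agree once I center.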
Now, I've read that there are different ways of computing the eigenvalues
and condition indices; can anybody offer any insight into which approach
should generally be preferred, or when one should be preferred over the
other? Another text I have says that centered predictor variables are
typically used, but that isn't what SPSS is doing, and it isn't what Greene
seems to be doing if I understand him correctly. Thanks for any info.