Thanks, Maarten. The two situations you explained make sense to me. There are two other situations: (3) X1 and X2 are inter-related, but the direction of the relationship is unclear, i.e. there is no clear theory identifying which factor causes which. (4) X1 and X2 are statistically correlated but not theoretically related.
My questions are: (1) How should we deal with the third situation?
(2) The fourth situation seems to be a multicollinearity problem; what should we do if the findings from the correlation and VIF tests are inconsistent?
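(As an aside on the inconsistency: with only two predictors the VIF is a direct function of the pairwise correlation, VIF = 1/(1 - r^2), so the two diagnostics can easily disagree. An r just above 0.5 implies a VIF of only about 1.33, well below the usual cutoffs. A minimal illustration in Python, with a hypothetical helper name:)

```python
# Hypothetical helper: VIF implied by a pairwise correlation r
# in a two-predictor model, VIF = 1 / (1 - r^2).
def vif_from_r(r: float) -> float:
    return 1.0 / (1.0 - r ** 2)

print(vif_from_r(0.5))   # about 1.33 -- far below common cutoffs of 5 or 10
print(vif_from_r(0.9))   # about 5.26 -- now flagged
```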
Suggestions from others on this listserv are welcome too. Thanks,
Rong
----- Original Message ----
From: Maarten buis <[email protected]>
To: [email protected]
Sent: Fri, February 5, 2010 1:34:52 PM
Subject: Re: st: Multicollinearity test
--- On Fri, 5/2/10, Rosie Chen wrote:
> I have two models: one is to run the model
> with x1, x2, and x3 predictors, and the other is to take
> x3 out and run the model with the x1 and x2 only at the third
> level. In the first model, only x2 is statistically
> significant, but in the second model both x1 and x2 are
> significant after x3 is taken away. The second model's results
> make more sense than the results in the first model. I ran a
> correlation test and found that x3 is highly correlated with x1
> (r > 0.5, p < 0.01). But the VIF test on the first model (a
> linear one-level model) does not indicate a multicollinearity
> problem for the x3 variable (VIF < 2).
>
> My question is: should I use the VIF test or the correlation
> test to identify a possible multicollinearity problem when
> the two tests' results are inconsistent, as indicated above?
Neither. The real issue is whether you believe that x3 is an
intervening or a confounding variable. Consider a simplified
version of your model: a dependent variable y is influenced
by two variables, x1 and x2, and you are mainly interested
in x1.
x2 is a confounding variable when it also causes x1.
A classic example: if one tries to explain the birth rate
in various areas by the number of storks in those areas,
one finds a positive effect unless one controls for the
degree of urbanization (rural areas have both more storks
and a higher birth rate than urban areas). The degree of
urbanization is thus a confounding variable, and you will
have to control for it regardless of whether doing so
results in a multicollinearity problem.
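The stork example can be sketched in a small simulation (Python rather than Stata, and purely illustrative; all coefficients are made up). Here y depends only on the confounder x2, yet a naive regression of y on x1 shows a spurious positive effect that disappears once x2 is controlled for:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x2 = rng.normal(size=n)                # confounder, e.g. urbanization
x1 = 0.8 * x2 + rng.normal(size=n)     # x1 (storks) is also caused by x2
y = 1.0 * x2 + rng.normal(size=n)      # y (birth rate) depends on x2 only

# Naive regression of y on x1 alone picks up a spurious positive slope.
naive_slope = np.polyfit(x1, y, 1)[0]

# Controlling for the confounder x2 (OLS with an intercept).
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"naive slope on x1:    {naive_slope:.3f}")  # clearly positive
print(f"slope on x1 given x2: {coefs[1]:.3f}")     # near zero
```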
x2 is an intervening variable if it is caused by x1. A classic
example (in sociology) is that parental status influences the
education of the offspring, which in turn influences the
offspring's status. If we want to know the effect of parental
status on the status of the offspring, then we would want to
include this indirect effect through education. This is the
part of the effect of parental status that we can explain, so
we are doubly sure that that effect really exists. If we
controlled for education we would take away the part of the
effect that we understand and be left with only the unexplained
effect of parental status. So in the case of intervening
variables we do not want to control for the intervening
variable (unless you want to decompose the total effect into
direct and indirect effects). Again, this holds regardless of
whether there is a multicollinearity problem.
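The status example can likewise be sketched in a simulation (again Python, with made-up coefficients). Regressing y on x1 alone recovers the total effect (direct plus indirect); adding the intervening variable x2 strips out the indirect part and leaves only the direct effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x1 = rng.normal(size=n)                        # parental status
x2 = 0.7 * x1 + rng.normal(size=n)             # education, caused by x1
y = 0.3 * x1 + 0.5 * x2 + rng.normal(size=n)   # offspring status

# Total effect of x1: direct (0.3) plus indirect via x2 (0.7 * 0.5 = 0.35).
total = np.polyfit(x1, y, 1)[0]

# Controlling for the intervening variable leaves only the direct effect.
X = np.column_stack([np.ones(n), x1, x2])
direct = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"total effect of x1: {total:.3f}")   # near 0.3 + 0.35 = 0.65
print(f"direct effect only: {direct:.3f}")  # near 0.30
```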
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*