Re: st: logistic regression with orthogonal predictors
At 02:35 PM 5/15/2006, [email protected] wrote:
> A colleague asked me about some results with logistic regression. He had
> two predictors of a binary outcome, call them A and B. When used alone,
> predictor A was significantly related to the outcome and predictor B was
> not. Moreover, the correlation between A and B was zero. When the
> outcome was regressed on the two predictors simultaneously using logistic
> regression both were significantly related to the outcome. In effect, the
> coefficient for predictor B became larger. However, when OLS regression
> was used instead, the coefficients for each predictor were the same as
> when entered alone, which is what one would expect.

To elaborate a bit on my last answer - in OLS, the variance of y is
a fixed property of the observed data, i.e. it doesn't matter whether
y is regressed on X1, or X1 and X2, or X1 and X2 and X3 - the variance
of y will be the same in every case.
BUT, in logistic regression (also probit and others) the variance of
the underlying latent variable y* changes as you go from one model to
the next, i.e. the variance of y* will be different when y is
regressed on X1 than when it is regressed on X1 and X2. This is
because, in a logistic regression, the latent variable is normalized
by fixing its residual variance at pi^2/3, or about 3.29 (in probit it
is fixed at 1). Since the residual variance is fixed, as more vars are
added the explained variance increases, and so the total variance of
y* increases. In short, with logit and probit, your dv is a moving
target, i.e. its variance changes from one model to the next. Hence,
even when the Xs are uncorrelated, you see behavior such as was
described in the original message.
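
To make the moving-target point concrete, here is a small simulation sketch in Python (not Stata, and not taken from the handouts). Everything in it is an assumption chosen for illustration: the sample size, the true coefficients of 1 and 2, the seed, and the hand-rolled Newton-Raphson fitter named fit_logit. It draws two independent predictors, builds a latent y* with a standard logistic error, and compares the coefficient on x1 with and without x2 in the model, for both logit and OLS on the 0-1 outcome.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Maximum-likelihood logit via Newton-Raphson; X must include a constant."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                         # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)           # drawn independently of x1
e = rng.logistic(size=n)              # standard logistic error, variance pi^2/3
ystar = 1.0 * x1 + 2.0 * x2 + e       # latent variable (never observed in practice)
y = (ystar > 0).astype(float)         # the 0-1 outcome we actually see

const = np.ones(n)
X_alone = np.column_stack([const, x1])
X_both = np.column_stack([const, x1, x2])

logit_alone = fit_logit(X_alone, y)
logit_both = fit_logit(X_both, y)
ols_alone = fit_ols(X_alone, y)
ols_both = fit_ols(X_both, y)

print("logit coef on x1, x1 only:   ", logit_alone[1])
print("logit coef on x1, x1 and x2: ", logit_both[1])
print("OLS coef on x1, x1 only:     ", ols_alone[1])
print("OLS coef on x1, x1 and x2:   ", ols_both[1])
```

In this setup the logit coefficient on x1 should be noticeably smaller when x2 is left out, even though x1 and x2 are uncorrelated, while the two OLS coefficients on x1 should be nearly identical.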
The handouts I cited earlier also show that, if you use RWLS (Rich
Williams's Least Squares - a little known method and deservedly so)
you can get the same sort of behavior in OLS, i.e. if you fix the
residual variance at a specific value (e.g. 3.29) then the
coefficient estimates behave in the same odd ways.
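
The handouts' actual procedure is not reproduced in this message, so the following Python sketch is only one possible reading of "fix the residual variance": fit ordinary OLS, then rescale the coefficients by the factor that would make the residual variance equal pi^2/3. The function names ols and rescaled_ols, the simulated data, and the true coefficients are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def rescaled_ols(X, y, target_var=np.pi**2 / 3):
    """OLS coefficients after rescaling y so the residual variance equals target_var.

    Multiplying y by a constant multiplies the coefficients and the residuals
    by the same constant, so the rescaling can be applied to beta directly.
    """
    beta = ols(X, y)
    resid = y - X @ beta
    scale = np.sqrt(target_var / resid.var())
    return beta * scale

rng = np.random.default_rng(1)
n = 100_000
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)                      # drawn independently of x1
y = 1.0 * x1 + 2.0 * x2 + rng.standard_normal(n)

const = np.ones(n)
X_alone = np.column_stack([const, x1])
X_both = np.column_stack([const, x1, x2])

plain_alone = ols(X_alone, y)[1]
plain_both = ols(X_both, y)[1]
fixed_alone = rescaled_ols(X_alone, y)[1]
fixed_both = rescaled_ols(X_both, y)[1]

print("plain OLS coef on x1:    ", plain_alone, plain_both)
print("rescaled OLS coef on x1: ", fixed_alone, fixed_both)
```

Under these assumptions the plain OLS coefficient on x1 barely moves when x2 is added, whereas the rescaled coefficient grows, because adding x2 shrinks the residual variance and so changes the rescaling factor.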
In short, you have to realize that a lot of the things we are used to
in OLS do not work the same way in logit and probit. In OLS, our DV
is an observed variable; in logit and probit, our DV is actually a
latent unobserved variable (all we see is the 0-1 dichotomy that is
caused by the underlying latent variable).
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: [email protected]
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc