Richard,
Thanks for your full reply to my thread. It's difficult to disagree with
most of what you say, but what I was attempting to demonstrate was what
happens to R^2 when the correlation between two or more statistically
significant X-variables of interest is most certainly *not* zero (say, a
correlation of 0.6). When this happens, R^2 is inflated, because the
variation in Y is explained not only by the unique contributions made to
it by X1 and X2, but also partly by the *overlap* (for want of a more
precise expression!) between them.
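To make the point concrete, here is a quick simulation of my own (a
sketch in Python with NumPy, not anything from the thread or from
Richard's handout): two predictors with identical true coefficients,
fitted once with zero correlation between them and once with a
correlation of 0.6. The function name and all parameter values are my
own choices for illustration.

```python
import numpy as np

def r_squared(rho, b1=1.0, b2=1.0, n=100_000, seed=42):
    """R^2 from regressing Y on X1, X2, where corr(X1, X2) = rho
    and Y = b1*X1 + b2*X2 + e, with unit-variance X's and error."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = b1 * x[:, 0] + b2 * x[:, 1] + rng.standard_normal(n)
    # OLS fit with an intercept
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Same coefficients (b1 = b2 = 1) in both designs:
# rho = 0.0: explained variance 2 out of 3, so R^2 is about 2/3
# rho = 0.6: explained variance 3.2 out of 4.2, so R^2 is about 0.76
print(r_squared(0.0))
print(r_squared(0.6))
```

The extra explained variance in the second case (2*b1*b2*rho = 1.2) is
precisely the "overlap" term: it belongs to neither X uniquely, which is
why the unique (semipartial) contributions no longer sum to R^2.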
As I said towards the end of my last post, one of the desired aims is to
build a model of explanatory variables which demonstrate *total
independence* of each other. But, since we as social scientists attempt
to model the determinants of human behaviour, that's little more than a
pious hope, since there will inevitably be some inter-correlation between
explanatory variables. The example I put forward demonstrates this, and
also undermines the numerous futile attempts made by social scientists to
claim that X1 on its own contributed a certain proportion of the R^2 out
of all the significant X's.
C.
> At 05:09 AM 10/29/2003 +0000, Clive Nicholas wrote:
>>unlikely to vote Labour and vice versa. Because this overlap is carried
>>forward to the computation of R^2, R^2 has been upwardly biased.
>
> Thanks, but I'm afraid I still don't follow. If the beta coefficients
> were
> all zero, R^2 would be zero. Further, while the intercorrelations of the
> Xs may affect how large R^2 is, I don't see how that causes R^2 to be
> "upwardly biased", i.e. just because something causes R^2 to be bigger
> doesn't mean that it becomes biased towards a larger value. I'm aware of
> various consequences of multicollinearity, e.g. large standard errors,
> large confidence intervals, increased likelihood of saying a coefficient
> does not differ from zero when it really does. But, I don't remember ever
> hearing "upwardly biased R^2" as a problem. But that doesn't mean I
> couldn't have missed it! But multicollinearity does not cause regression
> coefficients to be biased (wildly variable from one sample to the next,
> maybe, but not biased) so I am not sure why it would cause R^2 to be
> biased.
>
> What I might say instead is, suppose you have two populations. In both
> populations, the effects of the Xs on Y are identical. But, in one
> population, the Xs are much more highly correlated with each other than
> they are in the other population. This will likely cause the R^2 to
> differ
> between the 2 populations. If you just compared R^2 between the two
> populations and not the actual coefficients, you could get a very
> misleading idea of the differences between the two populations. These
> kinds of ideas are discussed in my "Evils of R^2" handout at
> http://www.nd.edu/~rwilliam/xsoc593/lectures/l16.pdf.
>
> -------------------------------------------
> Richard Williams, Associate Professor
> OFFICE: (574)631-6668, (574)631-6463
> FAX: (574)288-4373
> HOME: (574)289-5227
> EMAIL: [email protected]
> WWW (personal): http://www.nd.edu/~rwilliam
> WWW (department): http://www.nd.edu/~soc
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
Yours,
CLIVE NICHOLAS,
Politics Building,
School of Geography, Politics and Sociology,
University of Newcastle-upon-Tyne,
Newcastle-upon-Tyne,
NE1 7RU,
United Kingdom.