At 01:46 AM 10/29/2003 +0000, Clive Nicholas wrote:
> (a) Whatever is judged to be the 'best' measure of R^2, one *must* keep in
> mind that (i) high levels of intercorrelation between X-variables inflate
> R^2 to artificially high levels; and (ii) models deploying aggregate-level
> data with large spatial units of analysis inevitably have knock-on
> (upward) effects on R^2, regardless of its measurement;

I'm not sure I understand (a)(i) -- two Xs could be perfectly correlated
with each other, and yet both could have zero correlation with Y. Can you
elaborate or give an example?
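
The point about correlated Xs can be checked with a small simulation. The
sketch below is an illustration only: the sample size, the seed, and the
noise level are arbitrary assumptions, and the two Xs are made *nearly*
(not exactly) collinear so that the regression is still identified. R^2 is
computed from the usual two-predictor identity in terms of the pairwise
correlations.

```python
import random
from math import sqrt

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def r2_two_predictors(y, x1, x2):
    """R^2 from regressing y on x1 and x2 (with intercept), using the
    standard identity based on the three pairwise correlations."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

rng = random.Random(2003)  # arbitrary seed, for reproducibility only
n = 5000                   # illustrative sample size

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [v + rng.gauss(0, 0.1) for v in x1]  # nearly a copy of x1
y = [rng.gauss(0, 1) for _ in range(n)]   # generated independently of both Xs

r12 = pearson(x1, x2)
r2 = r2_two_predictors(y, x1, x2)
print(f"corr(X1, X2)       = {r12:.3f}")
print(f"R^2 of Y on X1, X2 = {r2:.4f}")
```

Under these assumptions the correlation between the Xs should come out
very close to 1 while R^2 stays close to 0, which is the situation
described above: intercorrelation among the Xs does not, by itself, push
R^2 up.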

> (b) Why should *anybody* attempt to build a regression model that hopes to
> produce an R^2 of 100%? Anybody with half a brain on these matters will
> tell you that if your model has yielded a 'perfect' R^2, something is
> wrong (probably multicollinearity among two or three X-variables). When
> will people learn to love *low* levels of R^2? Low levels mean there is
> more to explain, and thus stretch our academic imaginations by providing
> us with more challenges as to what the missing key factors might be.

I agree that I would certainly be suspicious of a perfect R^2. But there
may not be anything else to explain -- it could just be that some
percentage of what happens in the world is due to random, chance
factors.

Also, while you are correct in saying that in practice there will always
be more to explain, an implication of that is that our models inevitably
suffer from omitted variable bias -- which probably means that, not only
have we failed to consider the effects of variables not included in the
model, we have likely mis-estimated the effects of the variables we do
have. So I think an ideal goal is to make R^2 as high as it should be,
but no higher, i.e. get a perfectly specified model, and if that produces
an R^2 of .10 then so be it. If by some wild chance I ever did explain all
the variability in a variable, I'm sure I could find some new variable to
move on to, so I wouldn't be too worried about running out of challenges!
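
Both halves of that argument can be illustrated with a short simulation.
This is a minimal sketch under assumed values, none of which come from the
discussion itself: true coefficients of 1 on both X1 and X2, a correlation
of 0.7 between them, a noise standard deviation of 5.5, and an arbitrary
seed. The correctly specified model recovers the coefficient on X1 but has
an R^2 of only about .10, while the model that omits X2 mis-estimates the
effect of X1.

```python
import random
from math import sqrt

def cross(a, b):
    """Sum of cross-products of a and b after centering each at its mean."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))

rng = random.Random(1029)  # arbitrary seed, for reproducibility only
n = 20000                  # illustrative sample size

# X1 and X2 both have variance 1 and correlation 0.7 (assumed values).
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.7 * v + rng.gauss(0, sqrt(1 - 0.7**2)) for v in x1]

# True model: Y = 1*X1 + 1*X2 + noise, with a lot of pure chance variation.
y = [a + b + rng.gauss(0, 5.5) for a, b in zip(x1, x2)]

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)

# Misspecified model: Y on X1 only (X2 omitted).
b1_short = s1y / s11

# Correctly specified model: Y on X1 and X2, via the 2x2 normal equations.
det = s11 * s22 - s12**2
b1_full = (s22 * s1y - s12 * s2y) / det
b2_full = (s11 * s2y - s12 * s1y) / det
r2_full = (b1_full * s1y + b2_full * s2y) / syy  # explained SS / total SS

print(f"b1, X2 omitted      = {b1_short:.2f}")
print(f"b1, full model      = {b1_full:.2f}")
print(f"R^2, full model     = {r2_full:.3f}")
```

With these assumed values, the coefficient on X1 in the short regression
should land near 1 + 0.7 = 1.7 rather than 1, since X1 picks up part of
the effect of the omitted X2, while the full model's R^2 stays low even
though nothing is missing from it.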

> If only social scientists, psychologists and economists alike simply
> focused on the theoretical and empirical validity and reliability of their
> variables and modelled social reality as accurately as possible in order
> to test theories about human behaviour, this would tell us more than
> what R^2 tells us about *anything!* :-)

I agree with that. The goal is correct model specification, and R^2 may
tell you little about how well you have met that goal. But if you do all
these things, you may find that a nice large R^2 comes along as an added
bonus.
-------------------------------------------
Richard Williams, Associate Professor
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: [email protected]
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc