(I posted this to sci.stat.math since it's more of a statistical than
a Stata question, but all help is appreciated.)
I'm interested in comments and advice on how to interpret the
parameters of a two-stage least squares (2SLS) model with a negative
R2. The problem is discussed on the website of Stata at
http://www.stata.com/support/faqs/stat/2sls.html
To summarize my understanding: 2SLS uses instrumental variables to
model the effects of right-hand-side endogenous variables. These
instruments are the values of the endogenous variables as predicted by
the exogenous variables in the model. The coefficients are estimated
with these predicted values, but the residuals are computed with the
endogenous variables themselves. The resulting predictions of the
dependent variable can in some cases be way off, so much so that the
residual SS is greater than the total SS. The model SS is then
negative, and hence so is the R2 of the model. This can happen even
though the model contains strong and significant effects.
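
A minimal simulation sketch of the situation described above, using only
the Python standard library. Everything in it is an assumption made for
illustration: the data-generating process, the coefficient values (true
slope 1.0, first-stage slope 0.5, error loading -0.8), the seed, and the
variable names. With one endogenous regressor and one instrument, 2SLS
reduces to the simple IV estimator, which is what the code computes.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


def fit_stats(y, x, a, b):
    """Return (R2, RSS) with residuals computed from the ACTUAL x."""
    rss = sum((yi - a - b * xi) ** 2 for yi, xi in zip(y, x))
    return 1.0 - rss / cross(y, y), rss


random.seed(12345)  # arbitrary seed, for reproducibility only
n = 5000
beta = 1.0  # assumed true structural slope

z = [random.gauss(0, 1) for _ in range(n)]    # exogenous instrument
u = [random.gauss(0, 1) for _ in range(n)]    # structural error
e = [random.gauss(0, 0.3) for _ in range(n)]  # first-stage noise

# x is endogenous: it is negatively correlated with the error u.
x = [0.5 * zi - 0.8 * ui + ei for zi, ui, ei in zip(z, u, e)]
y = [beta * xi + ui for xi, ui in zip(x, u)]

# Simple IV estimator (2SLS with one instrument and one regressor).
b_iv = cross(z, y) / cross(z, x)
a_iv = mean(y) - b_iv * mean(x)
r2_iv, rss_iv = fit_stats(y, x, a_iv, b_iv)
# Conventional IV standard error of the slope.
se_iv = math.sqrt(rss_iv / (n - 2) * cross(z, z)) / abs(cross(z, x))

# OLS on the same data, for comparison.
b_ols = cross(x, y) / cross(x, x)
a_ols = mean(y) - b_ols * mean(x)
r2_ols, _ = fit_stats(y, x, a_ols, b_ols)

print(f"IV : slope={b_iv:.3f}  se={se_iv:.3f}  R2={r2_iv:.3f}")
print(f"OLS: slope={b_ols:.3f}  R2={r2_ols:.3f}")
```

In this setup the IV slope should land close to the assumed true value
with a small standard error while its R2 is negative, whereas OLS has a
nonnegative R2 but a slope far from the true value. This only shows that
the combination is possible under these assumptions; it says nothing
about whether any particular real model is correctly specified.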
The FAQ referred to above states that a negative R2 need not be a
problem and that parameters can be safely interpreted if they are
significant with reasonably small standard errors: "What does it
mean when RSS is greater than TSS? Does this mean our parameter
estimates are no good? Not really. You can easily develop
simulations where the parameter estimates from two-stage are quite
good while the MSS is negative. Remember why we estimate two-stage
models. We are interested in the parameters of the structural
equation -- the elasticity of demand, the marginal propensity to
consume, etc. If our two-stage model produces estimates of these
parameters with acceptable standard errors, we should be happy --
regardless of MSS or R2. If we were strictly interested in
projections of the dependent variable, then we should probably
consider the reduced form of the model."
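
The FAQ's last point, that the reduced form is the better tool for
projecting the dependent variable, can be sketched the same way. The
reduced form here is an ordinary regression of y on the instrument z,
so its R2 cannot be negative. The data-generating process, coefficient
values, seed, and names below are assumptions chosen for illustration,
not anything taken from the FAQ.

```python
import random

random.seed(2024)  # arbitrary seed, for reproducibility only
n = 5000

z = [random.gauss(0, 1) for _ in range(n)]    # exogenous instrument
u = [random.gauss(0, 1) for _ in range(n)]    # structural error
e = [random.gauss(0, 0.3) for _ in range(n)]  # first-stage noise

# Assumed structural model: x is endogenous, true slope of y on x is 1.0.
x = [0.5 * zi - 0.8 * ui + ei for zi, ui, ei in zip(z, u, e)]
y = [1.0 * xi + ui for xi, ui in zip(x, u)]

# Reduced form: OLS of y on the exogenous variable z only.
zbar = sum(z) / n
ybar = sum(y) / n
szz = sum((zi - zbar) ** 2 for zi in z)
szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
slope = szy / szz
intercept = ybar - slope * zbar

rss = sum((yi - intercept - slope * zi) ** 2 for yi, zi in zip(y, z))
tss = sum((yi - ybar) ** 2 for yi in y)
r2_reduced = 1.0 - rss / tss

print(f"reduced form: slope={slope:.3f}  R2={r2_reduced:.3f}")
```

Under these assumptions the reduced form predicts y reasonably well even
though the structural equation, evaluated with the actual x, can have a
negative R2.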
My take would be that the model fits the data very poorly and that
the estimates should be regarded with extreme suspicion. This is
generally the advice for maximum likelihood models: only interpret
the parameters of a model that fits the data well. A negative R2 would
mean that the model is misspecified and should not be interpreted.
Comments? And could anything be inferred about the nature of the
misspecification?
John Hendrickx