(Thanks for replying at great length yet at very short notice.)
Michael S. Hanson replied:
> I may not be understanding correctly what procedures you have
> undertaken. However, as I read your message, this seems to be exactly
> what one would expect, provided all the variables being correlated with
> E were included as regressors in the estimation that produced E. (Or
> they are linear combinations of the regressors.) By construction, the
> OLS residuals are uncorrelated with all the regressors -- this includes
> the lagged dependent variable (LDV). While an OLS regression may be
> statistically invalid because of endogenous regressors (what you are
> trying to test, IIUC), mathematically the OLS residuals will be
> uncorrelated with any and all X variables included as regressors in
> estimation. In other words, while the LDV theoretically may be
> correlated with the _error_, econometrically it will be uncorrelated
> with the _residual_.
Thanks for setting me straight on this. It just goes to show that you
never stop learning about OLS.
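
The error-versus-residual point can be checked numerically. What follows is
a minimal Python sketch on simulated data (not the election data discussed
in this thread); the `ols` helper, the variable names and the coefficients
are all made up for illustration. The regressor x is deliberately built to
be correlated with the true error u, yet the OLS residual e still comes out
orthogonal to x, because the normal equations force it to be.

```python
import random

def ols(y, X):
    """OLS coefficients of y on the columns of X, via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[k] for row in M]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

random.seed(1)
n = 500
u = [random.gauss(0, 1) for _ in range(n)]        # true (unobserved) error
x = [random.gauss(0, 1) + 0.8 * ui for ui in u]   # regressor correlated with u
y = [2.0 + 0.6 * xi + ui for xi, ui in zip(x, u)]

X = [[1.0, xi] for xi in x]                       # constant plus x
b = ols(y, X)
e = [yi - (b[0] + b[1] * xi) for yi, xi in zip(y, x)]   # OLS residual

x_dot_error = dot(x, u)   # clearly non-zero: x is endogenous
x_dot_resid = dot(x, e)   # zero up to rounding error, by construction
print("x'u =", round(x_dot_error, 3))
print("x'e =", round(x_dot_resid, 10))
print("OLS slope (true value 0.6):", round(b[1], 3))
```

So a near-zero correlation between a regressor and the residual says
nothing either way about endogeneity: in this simulation the slope estimate
is biased, yet x'e is still zero.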
> I don't have my copy of Wooldridge handy, so I'm not certain what is
> meant by a "reduced test." Perhaps a "reduced form test"?
You're probably right, but given what you say below, I'll jettison this
approach, whatever it's called!
> The procedure you describe doesn't seem likely to produce a different
> result than above: assuming the LDV is regressed on the same X's as
> were used to construct E, the residual of this regression is just the
> part of the LDV that is not correlated with (or "explained by") the
> other X's. However, since E is already orthogonal to the LDV (by the
> above regression procedure), it is still orthogonal to the portion of
> the LDV that is not explained by the other regressors.
>
> What I think you may be missing is a set of Z's to serve as
> instruments for the LDV. With these Z's -- which are correlated with
> the LDV and which are not regressors in the original specification --
> you could undertake 2SLS estimation of the dependent variable.
I have these: they're in the form of (some of) my regional dummy variables
that I mentioned initially. Depending on which party's votes are being
modelled, two or three of these are correlated with the LDV and not with
the residual.
> (Actually, off-hand I'm not entirely certain how having a LDV as
> opposed to some other potentially endogenous regressor changes the
> appropriateness of this approach. Presumably your data are
> stationary.... Do any of your other regressors vary over time?)
My depvars are reasonably stationary for the most part, though some are
less so. I originally estimated these models with -areg- (now a no-no with
the LDVs, although I'm tempted to go back to it). I assumed that since
-areg- demeans the DV within each absorbed group, this would make the DV
mean-stationary. I could be wrong there (I often am).
All of my other regressors vary: I have 10 time dummies; 8 continuous and
time-varying ones; and 1 ordinal and time-varying regressor. I won't bore
you with the content of these variables unless you're interested!
> If the candidate endogenous regressor is truly exogenous (or, in the case
> of a LDV, predetermined with respect to the error term), then 2SLS is
> inefficient (higher variance of your estimates) -- but if this variable
> is endogenous, then OLS is inconsistent. Hence you'll need some
> instruments -- some exogenous variables that are not regressors -- to
> investigate this question, for example via a Hausman test. There may
> be an alternative "reduced form" test, but it almost certainly has to
> involve some instruments -- likely in the second regression for the
> LDV. It comes down to whether the "other variables" in this regression
> are in any way distinct from "every X-var" in the first regression or
> not. If not, then I would not be surprised by your results.
As above, I have these instruments. However, which Hausman test do you
refer to? (Not the Durbin-Wu-Hausman test: that's for -ivreg-.) The only
other one I know of, which I tried in Stata, is to save the residual (E)
and include it as a regressor in the next regression (the idea being that
if E is significant, you have endogeneity). Here's what happened in one of
them:
. reg mbldmpc lmbldmpc mb2-mb14 mbpollch lagconch laglabch lagldmch
cdmargin ldmargin ldmplace mbenp class e

      Source |       SS       df       MS              Number of obs =    1144
-------------+------------------------------           F( 21,  1122) =       .
       Model |   118556.08    21  5645.52761           Prob > F      =       .
    Residual |           0  1122           0           R-squared     =  1.0000
-------------+------------------------------           Adj R-squared =  1.0000
       Total |   118556.08  1143  103.723604           Root MSE      =       0

------------------------------------------------------------------------------
     mbldmpc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lmbldmpc |    .613628          .        .       .            .           .
         mb2 |  (dropped)
         mb3 |  (dropped)
         mb4 |  (dropped)
         mb5 |   6.206573          .        .       .            .           .
         mb6 |   15.55366          .        .       .            .           .
         mb7 |   2.151632          .        .       .            .           .
         mb8 |   3.257221          .        .       .            .           .
         mb9 |   7.533742          .        .       .            .           .
        mb10 |   4.293655          .        .       .            .           .
        mb11 |  -3.628148          .        .       .            .           .
        mb12 |   3.192982          .        .       .            .           .
        mb13 |   6.681205          .        .       .            .           .
        mb14 |   3.364225          .        .       .            .           .
    mbpollch |   .1160439          .        .       .            .           .
    lagconch |  -.1557356          .        .       .            .           .
    laglabch |  -.0261995          .        .       .            .           .
    lagldmch |  -.0651551          .        .       .            .           .
    cdmargin |  -.1094444          .        .       .            .           .
    ldmargin |  -.0599428          .        .       .            .           .
    ldmplace |  -.3796943          .        .       .            .           .
       mbenp |   3.263684          .        .       .            .           .
       class |   .0070015          .        .       .            .           .
           e |          1          .        .       .            .           .
       _cons |   2.738171          .        .       .            .           .
------------------------------------------------------------------------------
Still, the dots look nice, don't they?
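
The dots have a mechanical explanation. If E is the residual from this same
regression (run without E), then the DV equals the fitted values plus E
exactly. Putting E back in therefore reproduces the DV perfectly: a
coefficient of 1 on E, a residual sum of squares of zero, and no standard
errors to report. As I understand the regression-based version of the
endogeneity test, the residual to add comes from a first-stage regression
of the suspect regressor on the instruments plus the other exogenous
variables, not from the original regression. A minimal Python sketch on
simulated data, with hypothetical names throughout (z is the instrument, x
the suspect regressor, and `ols` and `resid` are illustrative helpers); it
computes coefficients only, not the t-test on the added residual:

```python
import random

def ols(y, X):
    """OLS coefficients of y on the columns of X, via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[k] for row in M]

def resid(y, X, b):
    """Residuals y - Xb."""
    return [yi - sum(bj * xj for bj, xj in zip(b, r)) for yi, r in zip(y, X)]

random.seed(2)
n = 500
z = [random.gauss(0, 1) for _ in range(n)]   # instrument, not in the model for y
u = [random.gauss(0, 1) for _ in range(n)]   # true error
x = [zi + 0.8 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]  # suspect regressor
y = [2.0 + 0.6 * xi + ui for xi, ui in zip(x, u)]

# (a) Add the residual from the SAME regression: a perfect fit, as in the output
X0 = [[1.0, xi] for xi in x]
b0 = ols(y, X0)
e = resid(y, X0, b0)
Xa = [[1.0, xi, ei] for xi, ei in zip(x, e)]
ba = ols(y, Xa)
ssr_own = sum(r * r for r in resid(y, Xa, ba))

# (b) Add the FIRST-STAGE residual of x on the instrument instead
Z1 = [[1.0, zi] for zi in z]
g = ols(x, Z1)
v = resid(x, Z1, g)
Xb = [[1.0, xi, vi] for xi, vi in zip(x, v)]
bb = ols(y, Xb)
ssr_fs = sum(r * r for r in resid(y, Xb, bb))

print("(a) coef on e:", round(ba[2], 6), " SSR:", round(ssr_own, 6))
print("(b) coef on v:", round(bb[2], 3), " SSR:", round(ssr_fs, 3))
```

In (a) the other coefficients are unchanged from the original regression,
which matches the output above. In (b) the fit is no longer perfect, so the
coefficient on the first-stage residual gets a standard error and can be
tested.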
> Hope that helps. I could be way off base; it might help to see the
> specification of the regressions (at least the two you mention).
Your post certainly has helped to clarify some estimation issues and I
thank you for that.
The model above was one of them (but without the E). Another one dropped
LDMPLACE (party's finishing position at last poll). Of course, a simpler
solution to this problem would be to simply drop the LDV and enjoy an
easier life. But there's a lot of interesting, sexy stuff about that LDV
that's worth talking about in electoral terms (e.g., rates of decay over
time).
I hope all that helps you. :)
CLIVE NICHOLAS |t: 0(044)191 222 5969
Politics |e: [email protected]
Newcastle University |http://www.ncl.ac.uk/geps