Re: st: hireg help
At 11:16 AM 5/18/2007, Austin Nichols wrote:
Richard--
Thanks for pointing out the genealogy of -hireg- and -nestreg-. As for
the second point, are you saying you don't have qualms about -hireg-
and -nestreg-? True enough that -hireg- is not -stepwise- per se, but
it is a tool for model selection based on change in R2 or significance
tests on variables or groups of variables, which is like stepwise
regression, though "more respectable" as you say... -nestreg- adds a
likelihood ratio test option to the Wald stat from -hireg- in addition
to supporting many other estimation commands (including svy commands),
but -stepwise- offers similar choices, and even has a -hierarchical-
option. The question, to my mind, is whether the whole enterprise is
statistically legitimate. Are the calculated standard errors in your
chosen model adjusted for the fact that you threw out the 17 other
variables that did not pass muster in the 7 other regressions you
estimated? Maybe _mtest should be built in... but I have a feeling
that a suitable Monte Carlo would reject these methods, even with some
marginal corrections.
I think there is a HUGE difference between the mindless empirical
atheoretical selection of variables and the specification and testing
of a theoretically derived logical sequence of models. Given enough
time, I imagine I could come up with a couple thousand citations of decent
articles that used nestreg or its equivalent. Further, I bet that
estout and the like make a good chunk of their money from presenting
side by side comparisons of the results of nested models. Perhaps
Ben Jann can check his business records on this. :)
I think there are some false premises in your argument. Sure, people
shouldn't just cherry-pick their results, doing dozens of runs and
only presenting the ones that came out significant. But heck, people
were doing that long before anybody ever thought of nestreg. There
is nothing about nestreg that makes it more likely or less likely
that you're only going to get selective presentations of results.
Second, I don't see any complaints about nestreg that couldn't also
be made about test or lrtest or even just looking at a bunch of
individual t-values for coefficients. If you are running a bunch of
tests, you may want to use more stringent significance levels, e.g.
.01 or a Bonferroni adjustment or whatever. nestreg is hardly unique
in that respect.
Third, you say that nestreg "is a tool for model selection based on
change in R2..." I think that is often a secondary
consideration. People who run sequences of models are often more
interested in how coefficients change as you go from one model to the
next. For example, if race is highly significant in block 1, but
insignificant in block 3, then that may suggest that the effects of
race are indirect, e.g. race affects education which in turn affects
income. Or, if X significantly affects Y in Block 1 but the effect
of X becomes insignificant in Block 2 after Income is added, then
that may suggest that the relationship between X and Y is spurious
and produced by the common cause of income.
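The coefficient-fading pattern described above is easy to illustrate outside Stata. Here is a minimal pure-Python sketch with hypothetical toy data (variable names x, m, y are invented for illustration): x affects y only through a mediator m, so x's slope is large in block 1 (y on x) and collapses once m enters in block 2.

```python
# Hypothetical numeric sketch of full mediation: x's effect on y runs
# entirely through m, so the partial slope of x vanishes when m is added.

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Covariance divided by n; the factor cancels in the slope ratios."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Toy data: m tracks x with a little noise, and y is produced entirely by m.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
m = [0.2, 1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9]
y = m[:]  # full mediation: y depends on x only through m

# Block 1: slope of x in the simple regression of y on x.
b_block1 = cov(x, y) / cov(x, x)

# Block 2: partial slope of x in the regression of y on x and m
# (closed form for two centered predictors).
d = cov(x, x) * cov(m, m) - cov(x, m) ** 2
b_block2 = (cov(x, y) * cov(m, m) - cov(m, y) * cov(x, m)) / d

print(b_block1)  # large: x looks important on its own
print(b_block2)  # numerically zero once m enters the model
```

With exact full mediation the partial slope is zero up to floating-point error; with noisy real data it would merely shrink toward zero, which is the pattern one watches for across blocks.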
To the extent that nestreg is used for model selection, you are
usually doing something like moving from a simple model to an
increasingly complex model, looking for the most parsimonious model
you can justify. Sometimes the blocks are ordered temporally (e.g.
characteristics determined at birth like sex and race, followed by
vars determined later in life, such as education, followed by more
immediate vars). By going through a sequence of models, you may get a
feel for how much your life's fate was determined at birth and how
much it was affected by later developments. Or, vars might be
ordered by content, e.g. demographic vars in one block, attitudinal
vars in another. Do attitudinal vars really gain us that much over
what we can get just from demographic vars alone?
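For readers who want to see the mechanics behind comparing blocks, here is a pure-Python sketch of the F test on the change in R-squared when a block is added, the statistic this discussion revolves around. The data are hypothetical, and OLS uses the closed forms for one and two centered predictors rather than any Stata internals.

```python
# Sketch of the incremental F test on the change in R^2 when a block of
# variables is added. Hypothetical data: block 1 enters x1, block 2 adds x2.

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y  = [2.1, 2.4, 5.0, 5.6, 7.9, 8.6, 10.9, 11.6]

# R^2 for block 1 (y on x1 alone) is the squared correlation.
r2_1 = cov(x1, y) ** 2 / (cov(x1, x1) * cov(y, y))

# R^2 for block 2 (y on x1 and x2), via the two-predictor closed form.
d  = cov(x1, x1) * cov(x2, x2) - cov(x1, x2) ** 2
b1 = (cov(x1, y) * cov(x2, x2) - cov(x2, y) * cov(x1, x2)) / d
b2 = (cov(x2, y) * cov(x1, x1) - cov(x1, y) * cov(x1, x2)) / d
r2_2 = (b1 * cov(x1, y) + b2 * cov(x2, y)) / cov(y, y)

# F on the R^2 change: q = 1 variable added, k = 2 predictors in the full
# model, so the denominator degrees of freedom are n - k - 1.
n, q, k = len(y), 1, 2
F = ((r2_2 - r2_1) / q) / ((1 - r2_2) / (n - k - 1))
print(r2_1, r2_2, F)
```

Whether the demographic-only block suffices then comes down to whether this F (or the equivalent likelihood-ratio test that -nestreg- offers) is significant when the attitudinal block is added.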
In sum, I think specifying and testing a logical hierarchy of models
can be extremely informative and useful. It isn't just data
dredging; it is theory testing. And it can give us a lot of insights
that just testing one final model could lead us to overlook. Sure,
there can be abuses, but anything you could do wrong with nestreg you
could just as easily do wrong some other way. If anything, problems
may be less likely, in that using nestreg forces you to logically
think through what the sequence of models and tests should be, as
opposed to doing things on a more haphazard basis.
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/