Tam Phan wrote:
> Given a model:
>
> Y = a + x(b) + z(d) + e
>
> Then, one takes the residuals e from this regression and regress it on
> a new set of explanatory variables, that is:
>
> e + mean(Y) = a1 + k(t) + v
>
> (note mean(Y) only affects the intercept a1)
>
> Any idea why this method is favored over:
>
> Y = a + x(b) + z(d) + k(t) + e? (which is essentially a one-stage
> regression rather than the two-stage procedure above)
I would regard these two modelling approaches as complementary rather
than competing. Regressing the residuals on a new set of explanatory
variables can be useful if you have a large number of interactions to
test and you want to see which ones are worth keeping in the full
model.
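To make the two setups concrete, here is a minimal sketch in Stata of
both routes, using the auto dataset shipped with Stata; price, mpg,
weight, and length are just hypothetical stand-ins for Y, x, z, and t:

    // Stage 1: fit the initial model and keep the residuals
    sysuse auto, clear
    regress price mpg weight
    predict e, residuals
    // Adding back mean(Y) shifts only the second-stage intercept
    summarize price, meanonly
    generate e_adj = e + r(mean)
    // Stage 2: regress the adjusted residuals on the new regressor
    regress e_adj length
    // The one-stage alternative fits everything at once
    regress price mpg weight length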
After running your regression for the entire sample, you then run it
again for subsections of your sample. If any of the effects are
significantly different for the selected group than for the entire
sample (i.e., if significant interaction effects (SIEs) exist), this
will show up as a significant effect of one or more independent
variables on the residuals. Conversely, if no significant effects are
found, there is no SIE for the group in question, and consequently no
need to specify such an effect.
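A minimal sketch of that residual check on the same auto data, with
foreign standing in (purely for illustration) for the subgroup of
interest:

    // Fit the full-sample model and keep its residuals
    sysuse auto, clear
    regress price mpg weight
    predict e, residuals
    // If the regressors significantly predict the residuals within
    // the subgroup, an interaction is probably worth modelling
    regress e mpg weight if foreign
    // The equivalent one-stage check with explicit interactions
    regress price c.mpg##i.foreign c.weight##i.foreign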
The last paper that I read employing this technique tested for 152
such effects, and only two were found to be significantly different
from zero. With 152 tests, even at a conventional 5% level one would
expect around seven or eight significant results by chance alone, so
finding two is unremarkable, and they rightly concluded that there was
no compelling evidence for heterogeneity.
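A quick back-of-envelope check of that multiple-testing point,
treating the tests as independent and assuming the conventional 5%
level (my assumption, not a figure from the paper):

    // Expected number of false positives among 152 tests at 5%
    display 152 * 0.05
    // Probability of at least two rejections arising by chance alone,
    // treating the tests as independent Bernoulli trials (close to 1)
    display binomialtail(152, 2, 0.05)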
Doubtless there are other applications, but this would be the most
useful one for my purposes.
--
Clive Nicholas
[Please DO NOT mail me personally here, but at
<[email protected]>. Thanks!]
"Courage is going from failure to failure without losing enthusiasm."
-- Winston Churchill
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/