
Re: st: Stratifying after regression (with 2 interactions)


From   austin nichols <[email protected]>
To   [email protected]
Subject   Re: st: Stratifying after regression (with 2 interactions)
Date   Thu, 1 Dec 2005 16:26:04 -0500

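I don't think you can conclude in general that standard errors will be
smaller using the pooled data and -lincom- than when estimating separate
regressions by group.  Here is an example regressing diastolic blood
pressure on age groups and race groups, in which the separate regression
gives the smaller standard error (.918 versus .967):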

. webuse nhanes2
. qui xi: regress bpd i.agegrp*i.race sex
. lincom _Irace_2+_IageXrac_2_2

 ( 1)  _Irace_2 + _IageXrac_2_2 = 0

------------------------------------------------------------------------------
     bpdiast |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   2.886063   .9669886     2.98   0.003     .9905782    4.781548
------------------------------------------------------------------------------

. xi: regress bpd i.race sex if agegrp==2
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

      Source |       SS       df       MS              Number of obs =    1622
-------------+------------------------------           F(  3,  1618) =   30.67
       Model |  12325.8838     3  4108.62793           Prob > F      =  0.0000
    Residual |  216728.047  1618  133.948113           R-squared     =  0.0538
-------------+------------------------------           Adj R-squared =  0.0521
       Total |  229053.931  1621  141.304091           Root MSE      =  11.574

------------------------------------------------------------------------------
     bpdiast |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Irace_2 |   2.892348   .9182858     3.15   0.002     1.091193    4.693502
    _Irace_3 |   -1.08313   2.070185    -0.52   0.601    -5.143656    2.977396
         sex |  -5.216742   .5758084    -9.06   0.000     -6.34615   -4.087333
       _cons |   87.29417   .9318357    93.68   0.000     85.46644     89.1219
------------------------------------------------------------------------------

In any case, choosing a model because it has smaller standard errors
is a good way to introduce bias (sort of publication bias writ small).

The question you should ask yourself is: should the covariance matrix
be estimated separately for each group?  The answer must come from
your theory of the data generating process, not from your observations
of what gives you more statistically significant results.
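
To make the covariance question concrete, here is a rough sketch of the
same comparison.  It assumes Stata 11 or later and uses factor-variable
notation in place of -xi-, which is not what the example above used.  The
fully interacted model is an addition for illustration: it should give the
same point estimate as the separate regression for age group 2, so any
remaining difference in the standard error should come from the residual
variance being estimated from all age groups rather than from one.

```stata
* Sketch only: assumes Stata 11+ factor-variable syntax instead of -xi-
webuse nhanes2, clear

* Pooled model as in the example above (sex is not interacted with agegrp)
quietly regress bpdiast i.agegrp##i.race sex
lincom 2.race + 2.agegrp#2.race

* Fully interacted pooled model: every coefficient varies by age group,
* but one residual variance is estimated from all observations
quietly regress bpdiast i.agegrp##(i.race c.sex)
lincom 2.race + 2.agegrp#2.race
display "pooled Root MSE   = " e(rmse)

* Separate regression for age group 2: its own residual variance
quietly regress bpdiast i.race sex if agegrp == 2
lincom 2.race
display "agegrp 2 Root MSE = " e(rmse)
```

Comparing the two Root MSE values shows how much the assumption of a
common error variance is driving the difference, but it does not tell you
which assumption is right.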

On 12/1/05, Raoul C Reulen <[email protected]> wrote:
> Oke, let me clarify this:
> This is the model that I am using:
> > > [omitted]
