st: -xtreg, re- vs -regress, cluster ()-

From   Enrica Croda
To   [email protected]
Subject   st: -xtreg, re- vs -regress, cluster ()-
Date   Thu, 5 Dec 2002

Hello Stata-listers:

I am a bit puzzled by some regression results I obtained using -xtreg, re-
and -regress, cluster()- on the same sample.

I would appreciate it if anybody out there could give me feedback on whether
it is possible to obtain the same coefficient estimates by using -regress,
cluster(ID)- and -xtreg, re i(ID)- on the same specification and
the same sample, and if there are common circumstances in which this may
happen.

As far as the specifics of my case, I am studying labor force
participation of married women.
I am using a balanced panel data set in "long form" (iis: ID, tis: year)
containing yearly data for the period 1990-1997.
I have a total of 8696 observations on 1087 married women.

The dependent variable is a binary variable with values 1 or 0.

I run
1) pooled OLS regressions with the cluster option (-regress, cluster(ID)-),
2) -xtreg, re i(ID)-
on the same specification.

If I use a static specification and do not include any lagged variable
among the explanatory variables, applying the 2 different estimation methods
produces different coefficient estimates and different standard errors.
And this is what I was expecting.

What is puzzling me is the following.

If I use a dynamic specification, i.e. basically I include the lagged
value of the dependent variable among the explanatory variables, applying
the two different estimation methods produces exactly the same
coefficient estimates and different standard errors. (Estimation results
are reported below.)
I was not expecting the coefficient estimates to be exactly the same with
the two methods.

I tried other panel regressions.
-xtreg, mle- provides different estimates and standard errors from -xtreg,
re-.

I also tried to construct the random effects estimates by running a pooled
regression on the quasi-differenced specification (4) in Volume 4 of
the Stata 7 Manual, p. 437, with theta estimated as described on p. 452,
and I got yet another set of results.
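
For reference, the quasi-differencing step can be sketched roughly as follows.
This is only an illustration, not the manual's exact procedure: it assumes the
balanced panel with T = 8 described above, takes the variance components from
the e(sigma_u) and e(sigma_e) results left behind by -xtreg, re-, and uses the
usual formula theta = 1 - sqrt(sigma_e^2/(T*sigma_u^2 + sigma_e^2)). The q_*
and m_* variable names are hypothetical.

```stata
* Sketch only: q_* and m_* are made-up names; T = 8 assumes a balanced panel.
local xvars perminc transinc sak02 sak35 sak6g lagemplo age agesq east /*
    */ schoolmax yr91 yr92 yr93 yr94 yr95 yr96 yr97
local T 8

* Fit the random-effects model once to obtain the variance components.
quietly xtreg curremplo `xvars', i(persnr) re
scalar theta = 1 - sqrt(e(sigma_e)^2 / (`T'*e(sigma_u)^2 + e(sigma_e)^2))
display "theta = " scalar(theta)

* Quasi-demean the dependent variable: v - theta * (person mean of v).
egen double m_curremplo = mean(curremplo), by(persnr)
generate double q_curremplo = curremplo - scalar(theta)*m_curremplo
drop m_curremplo

* Quasi-demean each regressor and collect the new names in local qx.
local qx
foreach v of varlist `xvars' {
    egen double m_`v' = mean(`v'), by(persnr)
    generate double q_`v' = `v' - scalar(theta)*m_`v'
    drop m_`v'
    local qx `qx' q_`v'
}

* The constant is transformed too, so it enters as (1 - theta).
generate double q_cons = 1 - scalar(theta)

* Pooled OLS on the transformed data, without an additional intercept.
regress q_curremplo `qx' q_cons, noconstant
```

Under this formula theta is 0 whenever the estimated sigma_u is 0, in which
case the transformed variables are identical to the original ones.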

I am reporting below the estimates obtained with
I.   -regress, cluster(ID)-
II.  -xtreg, re i(ID)-
III. -xtreg, mle i(ID)-

Variable definition:
curremplo: current employment status
lagemplo : lagged employment status
perminc  : husband's permanent income
transinc : husband's transitory income
age      : age/10
agesq    : (age/10) squared
sak02    : number of kids aged 0-2
sak35    : number of kids aged 3-5
sak6g    : number of kids aged 6+
east     : dummy variable =1 if respondent is East German (the data
           are for East and West Germany)
schoolmax: maximum years of schooling
yr##     : year dummy, equal to 1 if year is ## (## = 91,...,97).


. regress curremplo perminc transinc sak02 sak35 sak6g lagemplo age agesq east
> schoolmax yr91 yr92 yr93 yr94 yr95 yr96 yr97, cluster(persnr);

Regression with robust standard errors                 Number of obs =    8696
                                                       F( 17,  1086) =  411.72
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5388
Number of clusters (persnr) = 1087                     Root MSE      =  .32573

             |               Robust
   curremplo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     perminc |   -.003359   .0016812    -2.00   0.046    -.0066579   -.0000602
    transinc |  -.0029873   .0017223    -1.73   0.083    -.0063667    .0003921
       sak02 |  -.1735915   .0155283   -11.18   0.000    -.2040605   -.1431226
       sak35 |  -.0343057   .0091977    -3.73   0.000    -.0523531   -.0162584
       sak6g |  -.0222673   .0047493    -4.69   0.000    -.0315862   -.0129483
    lagemplo |   .6713014    .012667    53.00   0.000     .6464469    .6961559
         age |    .010654   .0038414     2.77   0.006     .0031165    .0181915
       agesq |   -.000187    .000048    -3.89   0.000    -.0002813   -.0000927
        east |   .0453875   .0097331     4.66   0.000     .0262897    .0644853
   schoolmax |   .0051449   .0018325     2.81   0.005     .0015493    .0087405
        yr91 |   -.031073   .0159144    -1.95   0.051    -.0622995    .0001534
        yr92 |  -.0133491   .0143174    -0.93   0.351     -.041442    .0147438
        yr93 |    -.02965     .01378    -2.15   0.032    -.0566885   -.0026115
        yr94 |  -.0042043   .0134346    -0.31   0.754     -.030565    .0221563
        yr95 |   -.010533    .013451    -0.78   0.434    -.0369259    .0158599
        yr96 |  -.0319808   .0135433    -2.36   0.018    -.0585548   -.0054069
        yr97 |  -.0140815   .0134361    -1.05   0.295    -.0404453    .0122822
       _cons |     .09109    .073401     1.24   0.215    -.0529337    .2351137


. xtreg curremplo perminc transinc sak02 sak35 sak6g lagemplo age agesq east
>  schoolmax yr91 yr92 yr93 yr94 yr95 yr96 yr97, i(persnr) re;

Random-effects GLS regression                   Number of obs      =      8696
Group variable (i) : persnr                     Number of groups   =      1087

R-sq:  within  = 0.0984                         Obs per group: min =         8
       between = 0.9408                                        avg =       8.0
       overall = 0.5388                                        max =         8

Random effects u_i ~ Gaussian                   Wald chi2(17)      =  10137.10
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

   curremplo |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     perminc |   -.003359   .0013716    -2.45   0.014    -.0060473   -.0006708
    transinc |  -.0029873    .002286    -1.31   0.191    -.0074678    .0014932
       sak02 |  -.1735915   .0125305   -13.85   0.000    -.1981509   -.1490322
       sak35 |  -.0343057   .0091113    -3.77   0.000    -.0521635   -.0164479
       sak6g |  -.0222673   .0044685    -4.98   0.000    -.0310254   -.0135091
    lagemplo |   .6713014   .0080104    83.80   0.000     .6556013    .6870015
         age |    .010654   .0038709     2.75   0.006     .0030672    .0182408
       agesq |   -.000187   .0000471    -3.97   0.000    -.0002792   -.0000947
        east |   .0453875   .0087905     5.16   0.000     .0281584    .0626166
   schoolmax |   .0051449   .0016204     3.18   0.001     .0019691    .0083208
        yr91 |   -.031073   .0139985    -2.22   0.026    -.0585096   -.0036365
        yr92 |  -.0133491   .0140428    -0.95   0.342    -.0408724    .0141743
        yr93 |    -.02965   .0140972    -2.10   0.035    -.0572799   -.0020201
        yr94 |  -.0042043   .0141534    -0.30   0.766    -.0319445    .0235358
        yr95 |   -.010533   .0142409    -0.74   0.460    -.0384447    .0173787
        yr96 |  -.0319808   .0143176    -2.23   0.026    -.0600429   -.0039188
        yr97 |  -.0140815   .0144083    -0.98   0.328    -.0423214    .0141583
       _cons |     .09109   .0777215     1.17   0.241    -.0612413    .2434213
     sigma_u |          0
     sigma_e |  .28993302
         rho |          0   (fraction of variance due to u_i)

. xtreg curremplo perminc transinc sak02 sak35 sak6g lagemplo age agesq east
> schoolmax yr91 yr92 yr93 yr94 yr95 yr96 yr97, i(persnr) mle;

Fitting constant-only model:
Iteration 0:   log likelihood = -6568.6464
Iteration 1:   log likelihood = -5790.8646
Iteration 2:   log likelihood = -5653.5493
Iteration 3:   log likelihood = -5646.3662
Iteration 4:   log likelihood = -5646.3369

Fitting full model:
Iteration 0:   log likelihood = -2559.0813
Iteration 1:   log likelihood = -2490.0659
Iteration 2:   log likelihood = -2461.6401
Iteration 3:   log likelihood = -2461.2976
Iteration 4:   log likelihood = -2461.2973

Random-effects ML regression                    Number of obs      =      8696
Group variable (i) : persnr                     Number of groups   =      1087

Random effects u_i ~ Gaussian                   Obs per group: min =         8
                                                               avg =       8.0
                                                               max =         8

                                                LR chi2(17)        =   6370.08
Log likelihood  = -2461.2973                    Prob > chi2        =    0.0000

   curremplo |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     perminc |  -.0056741   .0023379    -2.43   0.015    -.0102562   -.0010919
    transinc |  -.0040303   .0020947    -1.92   0.054    -.0081358    .0000752
       sak02 |  -.2245123   .0133989   -16.76   0.000    -.2507737    -.198251
       sak35 |  -.0701418   .0101739    -6.89   0.000    -.0900823   -.0502013
       sak6g |  -.0407319   .0061695    -6.60   0.000    -.0528238     -.02864
    lagemplo |   .4443965   .0139782    31.79   0.000     .4169997    .4717933
         age |   .0100016   .0052861     1.89   0.058    -.0003589    .0203621
       agesq |  -.0002096   .0000642    -3.26   0.001    -.0003356   -.0000837
        east |   .0910558   .0149718     6.08   0.000     .0617116       .1204
   schoolmax |   .0081604   .0027614     2.96   0.003     .0027482    .0135726
        yr91 |  -.0255522   .0127857    -2.00   0.046    -.0506118   -.0004927
        yr92 |   -.011285   .0128852    -0.88   0.381    -.0365396    .0139695
        yr93 |  -.0259762     .01303    -1.99   0.046    -.0515146   -.0004379
        yr94 |  -.0032213   .0132016    -0.24   0.807    -.0290961    .0226534
        yr95 |  -.0055009   .0134236    -0.41   0.682    -.0318108    .0208089
        yr96 |  -.0257715   .0136532    -1.89   0.059    -.0525313    .0009883
        yr97 |  -.0123659   .0139074    -0.89   0.374    -.0396239    .0148922
       _cons |   .2832431   .1087164     2.61   0.009     .0701629    .4963232
    /sigma_u |   .1662792   .0073449    22.64   0.000     .1518834     .180675
    /sigma_e |   .2968988   .0025839   114.90   0.000     .2918345    .3019632
         rho |    .238768   .0173716                      .2060788    .2741066
Likelihood ratio test of sigma_u=0: chibar2(01)=  229.39 Prob>=chibar2 = 0.000


Thank you very much in advance for any ideas,


