st: IVREG2 vs. REG/CLUSTER2 – Difference in number of observations reported in regressions

From	"Hofbaur, Ulrich" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: IVREG2 vs. REG/CLUSTER2 – Difference in number of observations reported in regressions
Date	Mon, 10 Sep 2012 12:50:49 +0000

Hi everybody,

I am using the "ivreg2" command to estimate an OLS regression with standard errors clustered in two dimensions. The command is simply "ivreg2 y x, cluster(cs_id ts_id)". It turns out that ivreg2 drops about 30 percent of the sample (it keeps 1806 of 2618 observations), even though none of the variables involved has missing values. The same drop occurs when I run ivreg2 without the cluster option. I have also tried the related "cluster2" command and the ordinary "reg" command, and both use the full set of observations. Does anyone know why the number of observations reported differs between these commands?

Any help is highly appreciated!

Best,
Ulrich

The commands and their output are given below.

---------------------------------------------
. reg car_m1_1 dv_chng

      Source |       SS       df       MS              Number of obs =    2618
-------------+------------------------------           F(  1,  2616) =   69.36
       Model |  .359573378     1  .359573378           Prob > F      =  0.0000
    Residual |  13.5615124  2616  .005184064           R-squared     =  0.0258
-------------+------------------------------           Adj R-squared =  0.0255
       Total |  13.9210858  2617  .005319483           Root MSE      =    .072

------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0602397   .0072331     8.33   0.000     .0460565    .0744228
       _cons |  -.0057141    .003311    -1.73   0.084    -.0122064    .0007783
------------------------------------------------------------------------------

. ivreg2 car_m1_1 dv_chng

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =     1806
                                                      F(  1,  1804) =    35.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =  10.15831896                Centered R2   =   0.0195
Total (uncentered) SS   =  12.13900806                Uncentered R2 =   0.1795
Residual SS             =  9.960554106                Root MSE      =   .07426

------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0536313   .0089563     5.99   0.000     .0360774    .0711853
       _cons |  -.0098285   .0042637    -2.31   0.021    -.0181852   -.0014719
------------------------------------------------------------------------------
Included instruments: dv_chng
------------------------------------------------------------------------------

. ivreg2 car_m1_1 dv_chng, cluster(cs_id ts_id)

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on cs_id and ts_id

Number of clusters (cs_id) =      1149                Number of obs =     1806
Number of clusters (ts_id) =        44                F(  1,    43) =    24.80
                                                      Prob > F      =   0.0000
Total (centered) SS     =  10.15831896                Centered R2   =   0.0195
Total (uncentered) SS   =  12.13900806                Uncentered R2 =   0.1795
Residual SS             =  9.960554106                Root MSE      =   .07426

------------------------------------------------------------------------------
             |               Robust
    car_m1_1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0536313   .0106425     5.04   0.000     .0327725    .0744902
       _cons |  -.0098285   .0052442    -1.87   0.061     -.020107    .0004499
------------------------------------------------------------------------------
Included instruments: dv_chng
------------------------------------------------------------------------------

. cluster2 car_m1_1 dv_chng, fcluster(cs_id) tcluster(ts_id)
 
Linear regression with 2D clustered SEs                Number of obs =    2618
                                                       F(  1,  2499) =   64.84
                                                       Prob > F      =  0.0000
Number of clusters (cs_id) =   1568                    R-squared     =  0.0258
Number of clusters (ts_id) =     51                    Root MSE      =  0.0720
------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0602397   .0091375     6.59   0.000     .0423219    .0781575
       _cons |  -.0057141   .0040381    -1.42   0.157    -.0136324    .0022042
------------------------------------------------------------------------------
 
     SE clustered by cs_id and ts_id (multiple obs per cs_id-ts_id)
 

. count if car_m1_1!=. & dv_chng!=. & cs_id!=. & ts_id!=.
 2618

. distinct cs_id

              |        Observations
     Variable |      total   distinct
--------------+----------------------
        cs_id |       2618       1568

. distinct  ts_id

              |        Observations
     Variable |      total   distinct
--------------+----------------------
        ts_id |       2618         51
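
One way to narrow this down would be to flag the estimation sample of each command and compare the two. The following is only a minimal sketch, assuming that ivreg2 posts e(sample) as official estimation commands do; the indicator names in_reg and in_ivreg2 are made up for illustration. It does not explain why the observations are dropped, it only shows which ones are.

```stata
* Flag the observations used by -regress-
quietly regress car_m1_1 dv_chng
generate byte in_reg = e(sample)

* Flag the observations used by -ivreg2- (assumes e(sample) is posted)
quietly ivreg2 car_m1_1 dv_chng
generate byte in_ivreg2 = e(sample)

* Cross-tabulate the two indicators; the cell in_reg==1, in_ivreg2==0
* holds the observations that only -ivreg2- leaves out
tabulate in_reg in_ivreg2

* Summarize the left-out observations to look for a pattern
summarize car_m1_1 dv_chng cs_id ts_id if in_reg & !in_ivreg2
```

If the left-out observations cluster in particular values of cs_id or ts_id, that would at least point to where to look next.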

Follow-Ups:
- st: RE: IVREG2 vs. REG/CLUSTER2 - Difference in number of observations reported in regressions
  - From: "Schaffer, Mark E" <[email protected]>
