Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: IVREG2 vs. REG/CLUSTER2 – Difference in number of observations reported in regressions
From: "Hofbaur, Ulrich" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: IVREG2 vs. REG/CLUSTER2 – Difference in number of observations reported in regressions
Date: Mon, 10 Sep 2012 12:50:49 +0000
Hi everybody,
I am using the ivreg2 command to run an OLS regression with two-way clustered standard errors. The command is simply "ivreg2 y x, cluster(cs_id ts_id)". It turns out that Stata drops about 31 percent of the sample when running this regression (it keeps 1806 of 2618 observations), even though none of the variables involved has missing values. As the output below shows, ivreg2 reports 1806 observations even without the cluster() option. I have also tried the related cluster2 command and the ordinary reg command, and both use the full set of 2618 observations. Does anyone know why the number of observations differs across these commands?
Any help is highly appreciated!
Best,
Ulrich
The commands and their output are given below.
---------------------------------------------
. reg car_m1_1 dv_chng
      Source |       SS       df       MS              Number of obs =    2618
-------------+------------------------------           F(  1,  2616) =   69.36
       Model |  .359573378     1  .359573378           Prob > F      =  0.0000
    Residual |  13.5615124  2616  .005184064           R-squared     =  0.0258
-------------+------------------------------           Adj R-squared =  0.0255
       Total |  13.9210858  2617  .005319483           Root MSE      =    .072
------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0602397   .0072331     8.33   0.000     .0460565    .0744228
       _cons |  -.0057141    .003311    -1.73   0.084    -.0122064    .0007783
------------------------------------------------------------------------------
. ivreg2 car_m1_1 dv_chng
OLS estimation
--------------
Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only
                                                      Number of obs =     1806
                                                      F(  1,  1804) =    35.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =  10.15831896                Centered R2   =   0.0195
Total (uncentered) SS   =  12.13900806                Uncentered R2 =   0.1795
Residual SS             =  9.960554106                Root MSE      =   .07426
------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0536313   .0089563     5.99   0.000     .0360774    .0711853
       _cons |  -.0098285   .0042637    -2.31   0.021    -.0181852   -.0014719
------------------------------------------------------------------------------
Included instruments: dv_chng
------------------------------------------------------------------------------
. ivreg2 car_m1_1 dv_chng, cluster(cs_id ts_id)
OLS estimation
--------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on cs_id and ts_id
Number of clusters (cs_id) = 1149                     Number of obs =     1806
Number of clusters (ts_id) = 44                       F(  1,    43) =    24.80
                                                      Prob > F      =   0.0000
Total (centered) SS     =  10.15831896                Centered R2   =   0.0195
Total (uncentered) SS   =  12.13900806                Uncentered R2 =   0.1795
Residual SS             =  9.960554106                Root MSE      =   .07426
------------------------------------------------------------------------------
             |               Robust
    car_m1_1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0536313   .0106425     5.04   0.000     .0327725    .0744902
       _cons |  -.0098285   .0052442    -1.87   0.061     -.020107    .0004499
------------------------------------------------------------------------------
Included instruments: dv_chng
------------------------------------------------------------------------------
. cluster2 car_m1_1 dv_chng, fcluster(cs_id) tcluster(ts_id)
Linear regression with 2D clustered SEs                Number of obs =    2618
                                                       F(  1,  2499) =   64.84
                                                       Prob > F      =  0.0000
Number of clusters (cs_id) = 1568                      R-squared     =  0.0258
Number of clusters (ts_id) = 51                        Root MSE      =  0.0720
------------------------------------------------------------------------------
    car_m1_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dv_chng |   .0602397   .0091375     6.59   0.000     .0423219    .0781575
       _cons |  -.0057141   .0040381    -1.42   0.157    -.0136324    .0022042
------------------------------------------------------------------------------
SE clustered by cs_id and ts_id (multiple obs per cs_id-ts_id)
. count if car_m1_1!=. & dv_chng!=. & cs_id!=. & ts_id!=.
2618
. distinct cs_id
              |     Observations
     Variable |     total   distinct
--------------+----------------------
        cs_id |      2618       1568
. distinct ts_id
              |     Observations
     Variable |     total   distinct
--------------+----------------------
        ts_id |      2618         51
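
One way to narrow this down might be to flag the observations that ivreg2 actually uses and compare them with the ones it drops. The following is only a diagnostic sketch, not an explanation of the difference: it assumes ivreg2 stores its estimation sample in e(sample), as Stata estimation commands normally do, and the variable name in_ivreg2 is made up for illustration.

```stata
* Report which version of ivreg2 is installed
which ivreg2

* Re-run the regression quietly and flag the estimation sample
quietly ivreg2 car_m1_1 dv_chng, cluster(cs_id ts_id)
generate byte in_ivreg2 = e(sample)   // hypothetical flag: 1 = used, 0 = dropped

* How many observations were used versus dropped
tabulate in_ivreg2

* Inspect the dropped observations and check whether whole periods are missing
summarize car_m1_1 dv_chng cs_id ts_id if in_ivreg2 == 0
tabulate ts_id in_ivreg2
```

If the dropped observations cluster in particular values of ts_id or cs_id, that pattern may point to what ivreg2 treats differently from reg and cluster2.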