Garrett wrote:
I'm estimating a regression on how changing political party platforms
affect vote shares. I included country-specific dummy variables, and
I'm also using robust clustered standard errors (clustering on
countries) as there's likely to be (negative) correlation between
parties in vote share.
I first estimated the model without clustering, first with areg, and
then with regress and a set of dummy variables. As expected, the
results were identical.
However, when I add the cluster option it looks like Stata is making
different corrections to the degrees of freedom in the t-test for
statistical significance in these models, as well as doing some other
things differently.
He compared
Regression with robust standard errors          Number of obs =     158
                                                F(  5,     7) =       .
                                                Prob > F      =       .
                                                R-squared     =  0.1477
Number of clusters (ctrynum) = 8                Root MSE      =  4.5572

Regression with robust standard errors          Number of obs =     158
                                                F(  5,   144) =   12.84
                                                Prob > F      =  0.0000
                                                R-squared     =  0.1477
                                                Adj R-squared =  0.0707
                                                Root MSE      =  4.5572
                   (standard errors adjusted for clustering on ctrynum)
------------------------------------------------------------------------------
             |               Robust
       vgain |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    vgainone |  -.1540838   .0951223    -1.62   0.107    -.3421002    .0339326
and wondered why the tail probs of the t-values were different.
. di 2*ttail(7,1.62)
.14926394
. di 2*ttail(144,1.62)
.10742019
In the first case, the t's are calculated with 7 d.f. (the number of
clusters, 8, less one).
In the second case, they are calculated with 144 d.f., which appears to
be the 158 obs. less 14, where 14 = 7 + (8-1).
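As a rough check, the effect of the two d.f. choices can be reproduced from the coefficient and standard error in Garrett's output alone. This is only a sketch: the scalar names b, se, and t are made up for illustration, and it uses nothing beyond the built-in ttail() and invttail() functions. Because t is recomputed from b/se rather than rounded to 1.62, the tail probabilities will differ slightly from the ones displayed above.

```stata
* coefficient and robust s.e. for vgainone, copied from the output above
scalar b  = -.1540838
scalar se = .0951223
scalar t  = b/se

* two-sided tail probabilities under the two candidate d.f.
di 2*ttail(7, abs(t))
di 2*ttail(144, abs(t))

* 95% confidence limits with 7 d.f. (lower, then upper)
di b - invttail(7, .025)*se
di b + invttail(7, .025)*se

* 95% confidence limits with 144 d.f. (lower, then upper)
di b - invttail(144, .025)*se
di b + invttail(144, .025)*se
```

The 144 d.f. limits should agree with the interval printed in the coefficient table, since that row reports P>|t| = 0.107; the 7 d.f. limits will be wider.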
I imagine the issue is that areg with absorb and cluster set to the
same variable recognizes what is going on; regress, with the country
dummies, does not figure out that this is the same as 'absorbing' the
country indicator.
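One way to see directly which d.f. each command uses is to look at e(df_r) after each fit, since that is the residual d.f. both commands store and use for the t-tests. A sketch, assuming Garrett's data are in memory: only vgain, vgainone, and ctrynum are taken from his output, and the local macro xvars is a hypothetical stand-in that would need his other regressors added.

```stata
* regressors: only vgainone is known from the output; add the others here
local xvars "vgainone"

* absorb the country effects and cluster on country
areg vgain `xvars', absorb(ctrynum) cluster(ctrynum)
di e(df_r)

* explicit country dummies, again clustering on country
xi: regress vgain `xvars' i.ctrynum, cluster(ctrynum)
di e(df_r)
```

Comparing the two displayed values shows which command is using clusters less one and which is using the observation-based count.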