Re: st: Clustered standard errors on the region * year level (-xtreg-)
From: "Tobias Pfaff" <[email protected]>
To: <[email protected]>
Subject: Re: st: Clustered standard errors on the region * year level (-xtreg-)
Date: Fri, 16 Sep 2011 16:43:09 +0200
Dear Austin,
I can't collect more regions (Western Germany, our focus, only has 10
states), and the German Federal Statistical Office doesn't provide GDP per
capita on a finer grid for the last 26 years.
My coefficient for GDP per capita is not significant, even without
clustering. If I cluster on region with too few regions, I can assume
that the standard errors are biased downward. Without that bias the
standard errors would be larger still, and the coefficient would be
even further from significance.
I guess that I would cluster on region in this case and would argue as above
concerning the coefficient of GDP per capita.
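The argument above can be written out as a short inequality. This is only a sketch: G denotes the number of clusters, and the t(G-1) reference distribution is an assumption about how the cluster-robust test is carried out, not something stated in this thread. It also assumes the downward bias actually holds in this particular sample, which is true on average but not guaranteed.

```latex
t_{\text{reported}} = \frac{\hat\beta}{\widehat{SE}}, \qquad
\widehat{SE} \le SE_{\text{true}}
\;\Longrightarrow\;
|t_{\text{true}}| = \frac{|\hat\beta|}{SE_{\text{true}}} \le |t_{\text{reported}}|

% With G = 10 clusters and a t(G-1) reference distribution:
t_{0.975,\,9} \approx 2.262 \;>\; 1.960 \approx z_{0.975}
```

So a coefficient that already fails the 1.96 threshold with the downward-biased SE also fails the stricter t(9) threshold, and would fail it by more with an unbiased SE.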
Cheers,
Tobias
-----Original Message-----
> Date: Fri, 16 Sep 2011 10:01:27 -0400
> Subject: Re: st: Clustered standard errors on the region * year level (-xtreg-)
> From: Austin Nichols <[email protected]>
> To: [email protected]
Tobias Pfaff <[email protected]>:
Yes, you need to cluster on region, not region_year, to allow for
arbitrary correlation within region over time, but you have too few
regions to expect the downward bias in the cluster-robust SE to be
negligible.
Collect more regions. Or "GDP per capita" on a finer grid. Or try
to model serial correlation within region, rather than adopt a robust
method which requires more clusters. A sensible strategy is to try a
few plausible models and pick the one that gives the largest SEs,
since we (researchers) invariably underestimate variability of
estimates.
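One way to act on the advice to compare a few plausible variance estimators is sketched below in Stata. The data are simulated and the names id, y, x and gdp_pc are hypothetical; only region, year and region_year follow the original post. The sketch shows the syntax for comparing standard errors on the region-level regressor and is not meant to reproduce the results discussed in this thread.

```stata
* Simulated stand-in data (assumption): 10 regions, 20 individuals each, 26 years
clear
set seed 20110916
set obs 10
generate region = _n
expand 20
generate id = _n
expand 26
bysort id: generate year = 1983 + _n

* Region-by-year regressor standing in for GDP per capita (hypothetical)
generate u = rnormal()
bysort region year (id): generate gdp_pc = u[1]
drop u

* Individual-level regressor and outcome; gdp_pc has no true effect here
generate x = rnormal()
generate y = 0.5*x + rnormal()
xtset id year

* (a) conventional standard errors
xtreg y x gdp_pc i.year, fe
scalar se_conv = _se[gdp_pc]

* (b) region*year clusters, as in the original post
egen region_year = group(region year)
xtreg y x gdp_pc i.year, fe vce(cluster region_year) nonest dfadj
scalar se_ry = _se[gdp_pc]

* (c) region clusters (only 10 clusters, so expect some downward bias)
xtreg y x gdp_pc i.year, fe vce(cluster region)
scalar se_r = _se[gdp_pc]

* Compare the three and report the most conservative one
display "SE(gdp_pc), conventional:         " se_conv
display "SE(gdp_pc), region*year clusters: " se_ry
display "SE(gdp_pc), region clusters:      " se_r
display "Largest SE:                       " max(se_conv, se_ry, se_r)
```

In the simulated data every individual stays in one region, so panels are nested within region and (c) runs without -nonest-. In real data with individuals who move between regions, -nonest- would presumably be needed for (c) as well.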
On Fri, Sep 16, 2011 at 9:53 AM, Tobias Pfaff
<[email protected]> wrote:
> Hi,
>
> I do a fixed effects regression and wonder how I should cluster the
> standard errors.
>
> Dependent variable: individual level
> Independent variables: individual level, GDP per capita on the regional
> level
> No. of regions: 10
> No. of individuals: 30,000
> No. of years: 26
> No. of obs.: 304,000
> Year dummies: yes
> Region dummies: yes
>
> Since one of my independent variables is aggregated at a higher level than
> the dependent variable, I cluster on the region*year level (260 clusters):
>
> -xtreg depvar indepvars, fe vce(cluster region_year) nonest dfadj-
>
> ["region_year" was created with -egen region_year = group(region year)-]
>
> This works fine, but I'm not sure if the combination of region*year as
> definition of a cluster is OK with the fixed effects model, especially
> when I include region and year dummies as well?
>
> Clustering only on the regional level would result in 10 clusters, which
> is too few when the number of clusters has to go to infinity for the
> vce(cluster) estimation to work. Right?
>
> Any help is greatly appreciated!
>
> Thanks,
> Tobias