Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: Problem with IV regression and two-way clustering

From	"Tobias Pfaff" <[email protected]>
To	<[email protected]>
Subject	RE: st: Problem with IV regression and two-way clustering
Date	Fri, 28 Sep 2012 16:11:25 +0200

Thanks Mark.
But what do you mean by "parametric approach"?

Regards,
Tobias


> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] "Schaffer, Mark E" <[email protected]>
> Sent: Fri, 28 Sep 2012 12:23:38 +0100
> To: [email protected]
> Subject: Re: st: Problem with IV regression and two-way clustering

> Tobias,

> My reaction is that 14 clusters is too small.  Consistency of the
> cluster-robust VCE requires the number of clusters to go to infinity,
> and 14 is just not very far on the way to infinity.  You note that with
> a small number of clusters, the SEs are biased downwards, but the
> problem isn't just bias - you are going to get noisy estimates of the
> SEs, i.e., in repeated samples with 14 clusters they can be all over the
> place.

> You might instead want to investigate a parametric approach to the
> problem...?

> HTH,
> Mark
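
Mark's point about noisy standard errors can be read off the form of the cluster-robust estimator. As a sketch for the linear, non-IV case, with X_g and u-hat_g denoting the regressors and residuals of cluster g (this notation is assumed here and does not come from the thread):

```latex
\hat{V}_{\text{cluster}}
  = (X'X)^{-1}
    \Bigl( \sum_{g=1}^{G} X_g' \hat{u}_g \hat{u}_g' X_g \Bigr)
    (X'X)^{-1}
```

The middle term is a sum of G outer products, one per cluster. With G = 14 regions it is built from only 14 pieces, however many individual observations each region contains. Any finite-sample scaling factor that a package applies is omitted from this sketch.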

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Tobias Pfaff
> Sent: Thursday, September 27, 2012 9:30 PM
> To: [email protected]
> Subject: Re: st: Problem with IV regression and two-way clustering
> 
> Dear Austin,
> 
> Yes, some individuals move across regions.
> If I do the IV regression with two-way clustering, I just find it strange
> that the tests point to an invalid instrument, given the rather high
> correlation of the focus variable and the instrument.
> 
> Regards,
> Tobias
> 
> ________________________________________
> From Austin Nichols <[email protected]>
> To [email protected]
> Subject Re: st: Problem with IV regression and two-way clustering
> Date Thu, 27 Sep 2012 16:03:38 -0400
> ________________________________________
> 
> Are individuals moving across regions? If not, the pid clustering is
> subsumed in region, and you need only cluster at the region level.
> You might consider 2-d clustering by region and year as well.
> Clustering by pid is not enough; you have strong correlation of errors
> and predictors within region across people.
> 
> On Thu, Sep 27, 2012 at 3:29 PM, Tobias Pfaff
> <[email protected]> wrote:
> > Dear Statalisters,
> >
> > I would kindly ask you for comments on an instrumental-variables regression
> > with (two-way) clustered standard errors, which is a challenge for me.
> > I'm afraid that the whole problem cannot be written in just a few lines.
> > Below is the whole story (which is hopefully interesting to some of you).
> >
> > Any help is greatly appreciated!
> >
> > Now the setting:
> >
> > Unbalanced individual panel data set, single country
> > Obs.: 170,000
> > Individuals: 28,000
> > Regions: 14
> > Years: 9
> > Dependent variable measured on the individual level
> > Independent variable of interest (focusvar) measured on the regional level
> > Further control variables: 10, all at the individual level, plus region and
> > year dummies (20 dummies)
> >
> > I use individual fixed effects and I cluster on the individual level to
> > control for correlation of the errors over time and get the result that my
> > focus variable is significant:
> > -xtivreg2 depvar focusvar controlvars, fe cluster(pid)-
> >
> > My focus variable is aggregated at a higher level (region) than the
> > dependent variable (individual), and I know from Moulton (1990) that my
> > standard errors can be biased downwards dramatically if I do not cluster at
> > the regional level. Additionally, Donald and Lang (2007) say that without
> > clustering on the regional level, I dramatically overstate the significance
> > of the coefficients. Therefore, I use two-way clustering on the individual
> > and on the regional level:
> > -xtivreg2 depvar focusvar controlvars, fe cluster(pid region)-
> >
> > Now my focus variable is insignificant. However, the number of clusters is
> > small (14), which again leads to biased results (Donald and Lang 2007).
> > Cameron et al. (2008) tell me that "With a small number of clusters the
> > cluster-robust standard errors are downwards biased" (p. 414). Since my
> > focus variable is already insignificant, I would expect the coefficient to
> > be even less significant if I corrected for the bias induced by the
> > small number of clusters, and I conclude that I find no evidence for
> > significance.
> >
> > Now comes the challenge (as if that were not enough):
> > I want to do an IV regression to make sure that my results are not
> > influenced by endogeneity bias. I found a variable on the regional level
> > which is theoretically a fine instrument for my regional focus variable. The
> > correlation between the focus variable and the instrument is .60.
> >
> > I now estimate the IV model with two-way clustered standard errors:
> > -xtivreg2 depvar (focusvar = instrumentvar) controlvars, fe cluster(pid region) first-
> >
> > The size of the coefficient of my focus variable has decreased. The standard
> > errors have increased drastically, and the coefficient is far from
> > significant. In the first-stage regression, the instrument is not
> > significant. The tests say that the instrument is weak and I cannot reject
> > the null of underidentification. I interpret this as evidence that I have a
> > bad instrument or that my focus variable is not endogenous.
> >
> > However, a different picture appears when I only cluster at the individual
> > level:
> > -xtivreg2 depvar (focusvar = instrumentvar) controlvars, fe cluster(pid) first-
> >
> > The standard errors of my focus variable are still much larger than the
> > non-IV estimates, but smaller compared to IV with two-way clustering. The
> > focus variable is again not significant. The instrument is highly
> > significant in the first-stage regression. The tests indicate that the
> > hypotheses of a weak instrument and of underidentification can be rejected.
> > I would interpret this as evidence that my instrument is valid and that my
> > focus variable is endogenous.
> >
> > Conclusion:
> > My interpretation is that the results generally suggest that my focus
> > variable is not significant.
> >
> > Open questions:
> > Is my interpretation wrong?
> > Is my instrument good or bad - should I trust the results from the one-way
> > or two-way clustering for the IV approach?
> > In case I want to cluster on the regional level and correct for the bias due
> > to a small number of clusters, I could use wild-bootstrapping as proposed by
> > Cameron et al. (2008), but does that work for IV as well?
> >
> > Thanks very much for any clarification,
> > Tobias
> >
> > Cited literature:
> > Cameron, Gelbach, Miller (2008), Bootstrap-Based Improvements for Inference
> > with Clustered Errors. The Review of Economics and Statistics, 90 (3),
> > 414-427.
> > Donald, Lang (2007), Inference with Difference-in-Differences and Other
> > Panel Data. The Review of Economics and Statistics, 89 (2), 221-233.
> > Moulton (1990), An Illustration of a Pitfall in Estimating the Effects of
> > Aggregate Variables on Micro Units. The Review of Economics and Statistics,
> > 72 (2), 334-338.
> 
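
The Moulton (1990) concern raised in the original message can be made concrete with the textbook variance inflation factor for a regressor that is constant within groups. This is only a rough sketch: it assumes equal group sizes m and a common within-group error correlation rho, and it ignores the fixed effects and dummies in the actual model. The value rho = 0.001 is purely illustrative and is not estimated from these data.

```latex
% Ratio of the true variance to the variance that ignores grouping
\frac{\operatorname{Var}_{\text{true}}(\hat\beta)}
     {\operatorname{Var}_{\text{conventional}}(\hat\beta)}
  = 1 + (m - 1)\,\rho

% Illustration: 170{,}000 observations over 14 regions, so m \approx 12{,}143
1 + (12{,}143 - 1)(0.001) \approx 13.1,
\qquad \sqrt{13.1} \approx 3.6
```

Under these assumptions, even a very small within-region correlation would leave conventional standard errors too small by a factor of roughly 3.6.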
> 
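
For reference, a two-way cluster-robust variance such as the one requested by -cluster(pid region)- is usually written as an inclusion-exclusion of one-way estimators. The sketch below uses assumed notation and states the standard multiway construction; check the xtivreg2 and ivreg2 help files for the details of what those commands compute.

```latex
\hat{V}_{\text{two-way}}
  = \hat{V}_{\text{pid}}
  + \hat{V}_{\text{region}}
  - \hat{V}_{\text{pid} \cap \text{region}}
```

Each term is a one-way cluster-robust estimate, and the last one clusters on pid-region pairs. Because the region term rests on only 14 clusters, the two-way estimate inherits its imprecision, and the difference is not guaranteed to be positive semi-definite in finite samples.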
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
> 



Follow-Ups:
- RE: st: Problem with IV regression and two-way clustering
  - From: "Schaffer, Mark E" <[email protected]>
