RE: st: fixed effects with clustering when the number of levels of variable to be absorbed exceeds number of clusters

From	"Schaffer, Mark E" <[email protected]>
To	<[email protected]>
Subject	RE: st: fixed effects with clustering when the number of levels of variable to be absorbed exceeds number of clusters
Date	Thu, 2 Mar 2006 12:56:34 -0000
Daniel,

To be honest, I'm not sure about the answer to your question.

Rather than me hazarding a guess, maybe someone else on the list would
like to take a stab at it...?

--Mark 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> Daniel Simon
> Sent: 01 March 2006 18:20
> To: [email protected]
> Subject: RE: st: fixed effects with clustering when the 
> number of levels of variable to be absorbed exceeds number of clusters
> 
> Thanks a lot, Mark. Once again, extremely helpful. One last
> question: if I'm only interested in testing the significance of a
> handful of regressors, is this less of a concern?
> Thanks again for your thoughtful replies. Daniel
> 
> At 04:49 PM 3/1/2006 +0000, you wrote:
> >Daniel,
> >
> >This is a tricky question, at least for me, and I don't know the 
> >complete answer.
> >
> >The situation you describe is definitely a problem if you want to test
> >lots of parameter restrictions.  If you try, say, to test the joint
> >significance of all your regressors, you will fail, because you have
> >more (restrictions on) regressors than clusters.  You will probably
> >also see that the F statistic automatically reported by areg or xtreg
> >is missing and highlighted in blue, and if you click on it you'll get a
> >longish discussion that includes the following:
> >
> >"There is no mechanical problem with your model, but you need to
> >consider carefully whether any of the reported standard errors mean
> >anything.  The theory that justifies the standard error calculation is
> >asymptotic in the number of clusters, and we have just established that
> >you are estimating at least as many parameters as you have clusters.
> >
> >Putting that concern aside, the model test statistic issue is that you
> >cannot simultaneously test that all coefficients are zero because there
> >is insufficient information.  You could test a subset, but not all, and
> >so Stata refuses to report the overall model test statistic."
> >
> >The full help message is available as -help j_robustsingular-.
> >
> >However ... there is some ambiguity in the statement above, since it
> >implies that it's *possible* that none of the SEs mean anything.  I
> >used to think this was automatically the case if the cluster-robust
> >var-cov matrix is not full rank, but now I'm not sure.  It may be the
> >case that, for example, you can still get valid tests of one or a few
> >coefficients even if you can't test them all jointly.  I've been
> >meaning to go searching through the literature to find the references
> >on this but haven't had the time....
> >
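As a purely mechanical illustration of the point above (not an answer to
whether such tests are statistically valid), here is a minimal sketch in
Python with NumPy that builds a cluster-robust var-cov matrix by hand for
simulated data with 5 clusters and 10 regressors. It is a toy under stated
assumptions, not Stata's implementation: the function names, the simulated
data, and the omission of any finite-sample correction are all choices made
here for illustration. It shows that the full matrix is rank deficient, so a
joint Wald test of all coefficients cannot be computed, while the 2x2 block
for two coefficients can still be inverted.

```python
import numpy as np

def cluster_robust_vcov(X, y, cluster_ids):
    """OLS coefficients and a sandwich var-cov matrix clustered on cluster_ids.

    Illustrative only: no finite-sample correction is applied.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        rows = cluster_ids == g
        score = X[rows].T @ resid[rows]   # summed score vector for cluster g
        meat += np.outer(score, score)    # each cluster adds a rank-one term
    return beta, XtX_inv @ meat @ XtX_inv

def numerical_rank(M, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

# Hypothetical data: 5 clusters of 40 observations, constant plus 9 regressors.
rng = np.random.default_rng(0)
n_clusters, obs_per_cluster, k = 5, 40, 10
n = n_clusters * obs_per_cluster
cluster_ids = np.repeat(np.arange(n_clusters), obs_per_cluster)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.ones(k) + rng.normal(size=n)

beta, V = cluster_robust_vcov(X, y, cluster_ids)
rank_full = numerical_rank(V)           # below k, so V cannot be inverted

# Wald statistic for the two coefficients in positions 1 and 2 only.
b_sub = beta[1:3]
V_sub = V[1:3, 1:3]
rank_sub = numerical_rank(V_sub)
wald = float(b_sub @ np.linalg.solve(V_sub, b_sub))

print("rank of full var-cov:", rank_full, "of", k)
print("rank of 2x2 block:", rank_sub)
print("Wald statistic for the subset:", wald)
```

Whether the subset statistic has a usable reference distribution with so few
clusters is exactly the open question in this thread; the sketch only shows
that the calculation itself does not break down.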
> >Cheers,
> >Mark
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of Daniel 
> > > Simon
> > > Sent: 01 March 2006 15:54
> > > To: [email protected]
> > > Subject: RE: st: fixed effects with clustering when the number of 
> > > levels of variable to be absorbed exceeds number of clusters
> > >
> > > Mark - thanks, this is very helpful, as usual. Now, I have a
> > > follow-up. If, in addition to the set of fixed effects that I am
> > > absorbing, I have another set of dummies that I am including
> > > manually with i. and there are about as many of these i. fixed effects
> > > as there are clusters, then this will pose a problem. Is that correct?
> > > For example, if in my individual fixed effects model where I cluster
> > > on state, I also want to include fixed effects for age (e.g. a
> > > separate dummy for each value of age in years in my dataset), and I
> > > have forty different age dummies, then the number of age dummies is
> > > close to the number of clusters.  In this situation, is there some
> > > way to assess whether the estimates of the std errors are
> > > problematic? And is there some alternative way to proceed?
> > >
> > > Thanks again. Daniel
> > >
> > > At 03:19 PM 3/1/2006 +0000, you wrote:
> > > >Daniel,
> > > >
> > > > >What you need to be aware of is that the asymptotics justifying the
> > > > >cluster-robust estimator require the number of clusters to go off to
> > > > >infinity.  I don't think Austin's comment is quite right, at least in
> > > > >the context you've cited it.  The number of fixed effects can be much
> > > > >bigger than the number of clusters, and that won't by itself cause
> > > > >a problem - after all, the fixed effects are not actually being estimated.
> > > > >What *will* cause problems is if you have very few clusters, esp.
> > > > >if compared to the number of parameters that you *are* estimating.
> > > > >In your example, you want to cluster by state.  50 is not very far on the
> > > > >way to infinity, but maybe it's enough for your purposes.
> > > > >But if you also have lots of parameters that you want to test, then you
> > > > >will start running into serious problems (nb: the rank of the
> > > > >cluster-robust var-cov matrix is at most the number of clusters minus
> > > > >one, so it cannot be full rank once you estimate that many parameters
> > > > >or more).
> > > >
> > > >Hope this helps.
> > > >
> > > >--Mark
> > > >
> > > > > -----Original Message-----
> > > > > From: [email protected]
> > > > > [mailto:[email protected]] On Behalf Of 
> > > > > Daniel Simon
> > > > > Sent: 01 March 2006 15:02
> > > > > To: [email protected]
> > > > > > Subject: Re: st: fixed effects with clustering when the number
> > > > > > of levels of variable to be absorbed exceeds number of clusters
> > > > > >
> > > > > > Sorry - I made a mistake in the subject line of my last message. It
> > > > > > is now correct. Daniel
> > > > >
> > > > > At 09:59 AM 3/1/2006 -0500, you wrote:
> > > > > > >Hi Austin - thanks for pointing out that "the number of levels of
> > > > > > >the absorb() variable should not exceed the number of clusters."
> > > > > > >I have two questions about this: (1) I assume that the same holds
> > > > > > >true for xtreg,fe with clustering (given that this yields identical
> > > > > > >std errors to areg with clustering). Is this assumption correct?
> > > > > > >(2) Does anyone have suggestions for the most efficient way to
> > > > > > >estimate fixed-effects models with clustering when there are
> > > > > > >thousands of fixed effects but clustering occurs on a variable with
> > > > > > >many fewer units? For example, if I have a panel dataset tracking
> > > > > > >thousands of individuals over time and I want to examine the impact
> > > > > > >of a state policy variable, then I would want to estimate a model
> > > > > > >with individual fixed effects but I would also want to cluster by
> > > > > > >state. What would be a sensible way to proceed in this situation?
> > > > > >
> > > > > >Thanks. Daniel
> > > > > >
> > > > > >At 02:06 PM 2/28/2006 -0500, you wrote:
> > > > > > >>Perhaps I should ignore this question in the same way you have
> > > > > > >>ignored the advice in the Statalist FAQ on how to write a
> > > > > > >>well-formed question (in particular, you give no indication what
> > > > > > >>command you used or what error message you got, much less show us
> > > > > > >>the output), but you should certainly read:
> > > > > > >>       -help xtreg- -help xtdata- and -help areg- for starters.
> > > > > > >>Note also that you may want to cluster on id, assuming your fixed
> > > > > > >>effects are individual id and year effects, to allow for arbitrary
> > > > > > >>serial correlation within panel, and -cluster- implies -robust-.
> > > > > > >>But see the various FAQs on the subject, and such advice as appears
> > > > > > >>in the relevant help files, e.g.
> > > > > > >>   Note: Exercise caution when using the cluster() option with areg.
> > > > > > >>         The effective number of degrees of freedom for the robust
> > > > > > >>         variance estimator is (n_g - 1), where n_g is the number of
> > > > > > >>         clusters.  Thus the number of levels of the absorb() variable
> > > > > > >>         should not exceed the number of clusters.
> > > > > >>
> > > > > >>On 2/28/06, Yasmine Kent <[email protected]> wrote:
> > > > > >> > Hi,
> > > > > >> >
> > > > > >> > Apologies if this is a basic question...
> > > > > >> >
> > > > > > >> > I would like to obtain ROBUST standard errors and t-statistics in a
> > > > > > >> > panel data regression that I am running (with 2-way fixed effects).
> > > > > > >> > The 'robust' command does not appear to work with panel data, it
> > > > > > >> > gives an error message. Theoretically, I thought that it should be
> > > > > > >> > possible to get these. Is there another command I should use
> > > > > > >> > instead? (I am using Stata 8).
> > > > > >> >
> > > > > >> > Thank you!
> > > > > >> > Yasmine
> > > > > >>
> > > > > >>*
> > > > > >>*   For searches and help try:
> > > > > >>*   http://www.stata.com/support/faqs/res/findit.html
> > > > > >>*   http://www.stata.com/support/statalist/faq
> > > > > >>*   http://www.ats.ucla.edu/stat/stata/
> > > > > >
> > > > > >Daniel Simon
> > > > > >Assistant Professor
> > > > > >Department of Applied Economics and Management Cornell 
> > > > > >University
> > > > > >(607) 255-1626
> > > > > >*
> > > > > >*   For searches and help try:
> > > > > >*   http://www.stata.com/support/faqs/res/findit.html
> > > > > >*   http://www.stata.com/support/statalist/faq
> > > > > >*   http://www.ats.ucla.edu/stat/stata/
> > > > >
> > > > > Daniel Simon
> > > > > Assistant Professor
> > > > > Department of Applied Economics and Management Cornell 
> > > > > University
> > > > > (607) 255-1626
> > > > >
> > > > > *
> > > > > *   For searches and help try:
> > > > > *   http://www.stata.com/support/faqs/res/findit.html
> > > > > *   http://www.stata.com/support/statalist/faq
> > > > > *   http://www.ats.ucla.edu/stat/stata/
> > > > >
> > > > >
> > > >
> > > >*
> > > >*   For searches and help try:
> > > >*   http://www.stata.com/support/faqs/res/findit.html
> > > >*   http://www.stata.com/support/statalist/faq
> > > >*   http://www.ats.ucla.edu/stat/stata/
> > >
> > > Daniel Simon
> > > Assistant Professor
> > > Department of Applied Economics and Management Cornell University
> > > (607) 255-1626
> > >
> > > *
> > > *   For searches and help try:
> > > *   http://www.stata.com/support/faqs/res/findit.html
> > > *   http://www.stata.com/support/statalist/faq
> > > *   http://www.ats.ucla.edu/stat/stata/
> > >
> > >
> >
> >*
> >*   For searches and help try:
> >*   http://www.stata.com/support/faqs/res/findit.html
> >*   http://www.stata.com/support/statalist/faq
> >*   http://www.ats.ucla.edu/stat/stata/
> 
> Daniel Simon
> Assistant Professor
> Department of Applied Economics and Management Cornell University
> (607) 255-1626 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
> 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/