On Mon, Aug 31, 2009 at 4:34 PM, Schaffer, Mark E wrote:
>> Now, the meaning of -robust- standard errors after -xtmixed-
>> might be somewhat of a mystery.
> Is this analogous to the use of -robust- in a probit estimation? I
> remember reading a discussion by Dixit somewhere (I think it was in the
> book he did for the World Bank about 10 years ago) about how allowing
> for heteroskedasticity in a probit model makes no sense, because if the
> variance isn't constant, a probit is not estimating anything
> consistently (I think this is what the argument was ... several layers
> of brain dust are getting in the way).
Yes, that's probably a similar interpretation problem. I remember that
Sophia and Anders were also dismissive of heteroskedasticity in probit
models (Rabe-Hesketh and Skrondal, 2004, private communication :)).
> But estimating a probit model
> with cluster-robust SEs *does* make sense, because within-group correlation
> or other failures of independence don't imply a probit is useless; they
> do mess up the usual classical SEs, but don't mess up cluster-robust
> SEs (with enough assumptions etc. etc.)
With binary dependent variable models, there are several
interpretations. One is the econometric one: there is an underlying
utility u=xb+e, and depending on the shape of the error term, you
could have probit, logit, cloglog. And if we have several utilities,
and e's come from a convenient extreme value distribution, you have
-mlogit-. Hence you can talk about variance of that error term e, and
that your identified combination is b/sigma, etc. Another
interpretation, coming from statistics, is that you are just modeling
the probability of a 0/1 event. There are no variances involved there,
although you could play with the functional form a little bit to
improve fit. More distant fields like machine learning would probably
say, to heck with the likelihood and iterative maximization, let's fit a
support vector machine model to this (and not worry about standard
errors at all).
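
A quick numerical check of the b/sigma point above, as a minimal sketch: under the latent-utility reading with a normal error, scaling b and sigma by the same factor leaves every choice probability unchanged, so only the ratio is identified. This is Python rather than Stata purely for illustration; the helper names (normal_cdf, probit_prob) and all the numbers are made up for the example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(x, b, sigma):
    """P(y = 1 | x) when y = 1{x*b + e > 0} and e ~ N(0, sigma^2)."""
    return normal_cdf(x * b / sigma)

# Hypothetical covariate values; any would do.
xs = [-1.0, 0.0, 0.5, 2.0]

# Same ratio b/sigma = 1.5 in both cases, different b and sigma.
p_base = [probit_prob(x, b=1.5, sigma=1.0) for x in xs]
p_scaled = [probit_prob(x, b=3.0, sigma=2.0) for x in xs]

# Largest difference in implied probabilities across the two parameterizations.
max_gap = max(abs(p - q) for p, q in zip(p_base, p_scaled))
print(max_gap)
```

The two parameterizations cannot be told apart from data on y and x alone, which is why the probit normalizes sigma to 1.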
If you are thinking in econometric terms, and are willing to assume
your errors are correlated, then -cluster- correction does make sense.
Although if correlations are sort of constant across clusters, you can
build a more efficient estimator, I imagine; this is a similar argument to
the one in your paper regarding heteroskedastic GMM for linear models. In
general, the correlations of the error terms will be blurred by the
link function and its derivatives that enter the estimating equations.
-xtlogit- or -xtmelogit- probably take better account of this
effect than -logit, cluster-.
I would tend to think that once we get down to estimating equations,
-cluster- corrections for -probit/logit- make sense under the
statistical interpretation of those models, too, although in most
cases statisticians would prefer to model that correlation directly
with GEE or something of the kind. You see, statisticians are less
concerned with endogeneity than econometricians are, and would prefer
-xtreg, re- over -xtreg, fe- almost any day for efficiency reasons. In
the end, most -xt*- applications in statistics are some sort of
clinical trial, where endogeneity issues are dealt with by
randomization procedures where applicable, although here I am stepping
into biostat land, which I am less familiar with.
>> With -regress-, the
>> -robust- option is correcting for heteroskedasticity: you
>> believe you modeled the first moments right, but are not sure
>> about higher order moments (the second moments, in this
>> case). That's what Mark said: the model is bad, but not so
>> bad as to kill the point estimates. If you have
>> heteroskedasticity, your -xtmixed- model is likely wrong in
>> its variance part, and the variance parameters may not
>> necessarily correspond to well-defined population parameters.
>> If so, what does the inference on these point estimates do?
> if I have, say, within-group correlation that the -xtmixed- model
> doesn't model properly, does cluster-robust help? For example, say my
> -xtmixed- model is a lot better than nothing (in efficiency terms) but
> there is still within-group dependence that is not properly modelled,
> and I suspect this. Would this be a reasonable rationale to want to use
> cluster-robust?
I would say it makes good sense if -cluster- is at a level
higher than the highest level modeled by -xtmixed-. Say you have
students nested in classrooms nested in schools, and the latter are
sampled by county. It might make sense substantively to build a model
around students, classrooms and schools, but counties don't belong in
that substantive model. You might still want to correct for the PSU in
the sampling procedure with the -cluster- option, then.
Whether it makes sense to correct for clusters at the same level at which
you are modeling, I do not know. Let's think about an analogy with
linear regression again. Suppose you model heteroskedasticity in your
regression with a hypothetical -hetreg- command (is there really a
command to do it, BTW? I remember examples in [ML] book that show how
to build such a model, but I don't really know if there is a
stand-alone command that does it). Would you then want to run -hetreg,
robust-, saying something along the lines of "I don't
really know if my heteroskedasticity model is any good"? I would
probably have some reservations about it, and would at least try some
simulations to see how badly the model performs with and without the
-robust- option. I would imagine that with a misspecified variance
model, confidence interval performance would rank, from best to worst:
-regress, robust-, then -hetreg, robust-, then -hetreg-, then
-regress-, although I won't bet much on this race.
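
The kind of simulation meant above could be sketched roughly as follows. This is only an illustration of the two -regress- legs of the race (classical versus sandwich standard errors); the hypothetical -hetreg- is not included. It is in plain Python rather than Stata, the data-generating process (error s.d. proportional to |x|), the sample size, the number of replications and the seed are all assumptions, and the sandwich formula is HC0, without the small-sample factor that -regress, robust- applies.

```python
import math
import random

def ols_slope_with_ses(x, y):
    """Return (slope, classical SE, HC0 robust SE) for y = a + b*x + e."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical variance: s^2 / Sxx, with s^2 = RSS / (n - 2).
    s2 = sum(r * r for r in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # HC0 sandwich variance: sum((x_i - xbar)^2 * e_i^2) / Sxx^2.
    se_robust = math.sqrt(sum((d * r) ** 2 for d, r in zip(dx, resid))) / sxx
    return slope, se_classical, se_robust

def coverage(n=200, reps=500, true_slope=1.0, seed=12345):
    """Share of nominal 95% intervals that contain the true slope."""
    rng = random.Random(seed)
    hits_classical = 0
    hits_robust = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Heteroskedastic errors: mean zero given x, s.d. equal to |x|.
        y = [true_slope * xi + abs(xi) * rng.gauss(0.0, 1.0) for xi in x]
        b, se_c, se_r = ols_slope_with_ses(x, y)
        hits_classical += abs(b - true_slope) <= 1.96 * se_c
        hits_robust += abs(b - true_slope) <= 1.96 * se_r
    return hits_classical / reps, hits_robust / reps

cov_classical, cov_robust = coverage()
print(f"classical: {cov_classical:.3f}  robust: {cov_robust:.3f}")
```

With this particular design the classical intervals should fall well short of nominal coverage while the sandwich ones come much closer; adding a modeled-variance estimator to the comparison would need a specific, and deliberately misspecified, variance model.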
Disclaimer: again, all of this is just me thinking aloud.
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.