Re: st: -predict , reffects- after -xtmelogit-
From: Jeph Herrin <[email protected]>
To: [email protected]
Subject: Re: st: -predict , reffects- after -xtmelogit-
Date: Mon, 20 Dec 2010 14:18:25 -0500
Bobby,
This is very helpful, thanks. I understand shrinkage, but it didn't
click when I read the documentation that the distinction was made here.
So is there a way to get the mle random effects? I tried
predict ymu, mu
predict yxb, xb
gen ywre=logit(ymu)
gen re_cons=ywre-yxb
but this agrees neither with sd(_cons) nor with the result
of -predict, reffects-
thanks,
Jeph
On 12/20/2010 1:09 PM, Roberto G. Gutierrez, StataCorp wrote:
Jeph Herrin <[email protected]> asks:
I am using -xtmelogit- to estimate a random effects model, and am wondering
about what is being predicted by -predict, reffects-.
Example:
clear
use http://www.stata-press.com/data/r11/bangladesh
xtmelogit c_use || district:
predict re_cons, reffects
When you use -predict, reffects- after -xtmelogit-, you obtain estimates of
the modes of the posterior distribution of the random effects given the data
and estimated parameters; see pg. 277 of [XT] xtmelogit postestimation for a
complete discussion.
Now, I would expect the standard deviation of the random effect reported by
the model:
--------------------------------------------------------
Random-effects Parameters | Estimate Std. Err.
-----------------------------+--------------------------
district: Identity |
sd(_cons) | .4995265 .0798953
--------------------------------------------------------
To be approximately the standard deviation of the predicted random effects, at
the district level:
bys district : gen tolist = _n==1
sum re_cons if tolist
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
re_cons | 60 .0069783 .3787135 -.9584643 .9257698
But it seems very different, 0.4995 vs .37871. I must be missing something
obvious, but what?
The phenomenon you are seeing is known as "shrinkage". Predictions based on
the random-effects posterior distribution tend to be closer in magnitude to
zero because they are incorporating the prior information that the random
effects have mean zero. That is, if you have a relatively small cluster size,
the prior information that the random effect should be zero tends to dominate.
The estimate of sd(_cons) is, in contrast, based on maximum likelihood where
all the clusters are considered jointly. Thus, prior information tends to not
dominate as much because all clusters are pooling what they have to say about
the random-effects standard deviation.
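For intuition, shrinkage can be written in closed form for the linear random-intercept model. This is only an analogy and not the formula -xtmelogit- uses, since the posterior modes in the logistic model have no closed form. The notation below (mu, u_j, e_ij, sigma_u, sigma_e, n_j, lambda_j) is assumed here for illustration, with the model parameters treated as known.

```latex
% Linear random-intercept analogue (illustrative assumption, not the xtmelogit model)
\begin{align*}
y_{ij} &= \mu + u_j + e_{ij}, \qquad u_j \sim N(0,\sigma_u^2), \quad e_{ij} \sim N(0,\sigma_e^2) \\
\hat{u}_j &= \lambda_j\,(\bar{y}_j - \mu), \qquad
\lambda_j = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2/n_j} \\
\operatorname{Var}(\hat{u}_j) &= \lambda_j^2\left(\sigma_u^2 + \frac{\sigma_e^2}{n_j}\right)
 = \lambda_j\,\sigma_u^2 \;<\; \sigma_u^2
\end{align*}
```

Because 0 < lambda_j < 1, the predicted effects vary less across clusters than sigma_u would suggest, and lambda_j approaches 1 as the cluster size n_j grows.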
Shrinkage diminishes as cluster size gets larger. To see this, try
. clear
. set seed 1234
. set obs 100 // 100 clusters
. gen u = sqrt(2)*invnorm(uniform()) // random effects
. gen id = _n
. expand 1000 // cluster size is 1000
. gen e = log(1/runiform() - 1) // logistic errors
. gen y = (e + u)> 0 // binary response
. xtmelogit y || id:
. predict r, reffects
. bysort id: gen tolist = _n==1
. sum r if tolist
The standard deviations match much more closely -- having a cluster size of
1,000 helps!
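For contrast, the same simulation can be rerun with a much smaller cluster size. The sketch below is a do-file fragment that changes only the -expand- line of the example above; the choice of 10 observations per cluster is an arbitrary assumption for illustration. With so little information per cluster, one would expect the standard deviation of r to fall noticeably below the estimated sd(_cons).

```stata
clear
set seed 1234
set obs 100                            // 100 clusters, as before
gen u = sqrt(2)*invnorm(uniform())     // random effects, true sd = sqrt(2)
gen id = _n
expand 10                              // assumed small cluster size: 10 instead of 1000
gen e = log(1/runiform() - 1)          // logistic errors
gen y = (e + u) > 0                    // binary response
xtmelogit y || id:
predict r, reffects                    // posterior modes of the random effects
bysort id: gen tolist = _n==1          // flag one row per cluster
sum r if tolist                        // compare Std. Dev. with sd(_cons) from the model
```

Comparing this summary with the one from the cluster-size-1000 run shows how much of the gap is due to shrinkage rather than to the estimate of sd(_cons) itself.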
--Bobby
[email protected]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/