st: RE: Dependent var is a proportion, with large spike in .95+
From: "Verkuilen, Jay" <[email protected]>
To: <[email protected]>
Subject: st: RE: Dependent var is a proportion, with large spike in .95+
Date: Wed, 3 Sep 2008 17:43:25 -0400
Dan Weitzenfeld wrote:
>>That describes my situation exactly: I have a marked spike in my
histogram at the top bin, roughly .95 - 1. I am wondering how to
account for this.>>
I am working on a model that combines zero inflation with a beta
regression: a beta regression for the continuous part and a logistic
(or probit) model for the boundary. It's not done in Stata (yet... but
don't hold your breath). So far we've found it fairly tricky to
implement, as zero-inflation models tend to be, but it does work.
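To make the structure of such a model concrete, here is a minimal Python sketch of the per-observation log-likelihood for a generic boundary-inflated beta mixture. It is not the implementation described above. The mean-precision parameterization (mu, phi), the logit links, and all function and argument names are assumptions chosen for illustration.

```python
import math

def inv_logit(x):
    """Logistic function mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def beta_logpdf(y, mu, phi):
    """Log density of a beta distribution with mean mu and precision phi
    (shape parameters a = mu*phi, b = (1 - mu)*phi); requires 0 < y < 1."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def inflated_beta_loglik(y, eta_boundary, eta_mean, phi, boundary=1.0):
    """Log-likelihood of one observation under a two-part mixture:
    a logistic part for P(y == boundary) and a beta part for interior values.
    eta_boundary and eta_mean are linear predictors (hypothetical names)."""
    p = inv_logit(eta_boundary)          # probability of sitting on the boundary
    if y == boundary:
        return math.log(p)
    mu = inv_logit(eta_mean)             # mean of the continuous part
    return math.log(1.0 - p) + beta_logpdf(y, mu, phi)
```

Summing `inflated_beta_loglik` over observations gives an objective that a general-purpose optimizer could maximize. Setting `boundary=0.0` gives the zero-inflated case and `boundary=1.0` the one-inflated case.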
Also, depending on the nature of your DV, there is little harm in
"cheating" your observations away from 0 by using the transformation:
Y_new = eps/2 + (1 - eps/2)*Y_old
where eps > 0 is a small constant, e.g., .001. The beta likelihood is
relatively insensitive to such perturbations (while other likelihoods
are not).
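A small Python sketch of this transformation follows. The first function is the formula exactly as written above, which moves 0 to eps/2 and leaves 1 at 1. The second is an assumed symmetric variant, not given in this post, that pulls both ends inward and so may be more relevant when the spike is at 1. Function names are made up for illustration.

```python
import math

def shrink_from_zero(y, eps=0.001):
    """Formula as written in the post: maps 0 to eps/2 and leaves 1 at 1."""
    return eps / 2 + (1 - eps / 2) * y

def shrink_both_ends(y, eps=0.001):
    """Assumed symmetric variant (not from the post):
    maps [0, 1] into [eps/2, 1 - eps/2]."""
    return eps / 2 + (1 - eps) * y

# Compare the two on a few values, including both boundaries.
for y in (0.0, 0.5, 0.97, 1.0):
    print(y, shrink_from_zero(y), shrink_both_ends(y))
```

Refitting with a couple of different values of eps is a cheap way to check how much the estimates depend on the choice.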
IMO, the real question is the nature of the zeros, as a recent post by
Nick Cox makes plain. If the zero is a "real" one, meaning it is
qualitatively different from a value slightly above 0, then you need an
inflation model. If not, cheating often works.
Whoops, on rereading I see you have a sampling one. Well, same idea.
JV