Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Zero Inflated Negative Binomial model
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: Zero Inflated Negative Binomial model
Date: Sun, 22 Jan 2012 09:26:34 -0500
The exploratory step that I sketched earlier is closely related to a
hurdle model, in which one part models zero versus nonzero and a second
part models the nonzero outcomes. A brief discussion in the book by
Agresti (2010) cites papers by Saei et al. (1996) and Min and Agresti
(2005). For the nonzero categories, a cumulative logit model might work
(as in ordinal logistic regression), and you could try other cumulative
link functions (e.g., probit or complementary log-log).
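To make the two-part idea concrete, here is a minimal sketch in Python
(standard library only) rather than Stata. Everything in it is an
assumption for illustration: the data are simulated, the outcome has
hypothetical categories 0 to 3, there is a single predictor x, and
fit_logit is a small hand-written Newton-Raphson routine, not a
function from any package. It fits a binary logit to "0" versus "1 or
more", then works with only the nonzero outcomes, dichotomizes at each
category boundary, and fits a binary logit to each split.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, y, iters=25):
    """Fit a one-predictor binary logit by Newton-Raphson.

    Hypothetical helper for this sketch; returns (intercept, slope).
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = expit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated data (an assumption, not the poster's data): the nonzero
# categories follow proportional odds with a common slope of 0.8.
rng = random.Random(2012)
x, y = [], []
for _ in range(4000):
    xi = rng.uniform(-2.0, 2.0)
    if rng.random() < expit(-0.5 + 1.0 * xi):   # crosses the hurdle
        r = rng.random()
        yi = 1 + int(r < expit(0.5 + 0.8 * xi)) + int(r < expit(-1.0 + 0.8 * xi))
    else:
        yi = 0
    x.append(xi)
    y.append(yi)

# Step 1: binary logit for "0" versus "1 or more".
hurdle_fit = fit_logit(x, [int(yi > 0) for yi in y])

# Step 2: keep only nonzero outcomes and dichotomize at each boundary.
x_pos = [xi for xi, yi in zip(x, y) if yi > 0]
y_pos = [yi for yi in y if yi > 0]
boundary_fits = {
    k: fit_logit(x_pos, [int(yi > k) for yi in y_pos])
    for k in (1, 2)
}

print("hurdle (0 vs 1+): intercept=%.2f slope=%.2f" % hurdle_fit)
for k, (b0, b1) in boundary_fits.items():
    print("nonzero, y > %d: intercept=%.2f slope=%.2f" % (k, b0, b1))
```

If the slopes from the boundary-specific fits are similar, that is
informal support for a proportional odds (cumulative logit) model for
the nonzero categories; clearly different slopes would argue against
it. With real data you would use several predictors and look at
standard errors as well, which this sketch omits.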
References
Agresti A (2010). Analysis of Ordinal Categorical Data, second edition. Wiley.
Min Y, Agresti A (2005). Random effect models for repeated measures of
zero-inflated count data. Statistical Modelling 5:1-19.
Saei A, Ward J, McGilchrist CA (1996). Threshold models in a methadone
programme evaluation. Statistics in Medicine 15:2253-2260.
David Hoaglin
On Sat, Jan 21, 2012 at 8:06 AM, David Hoaglin <[email protected]> wrote:
> Eugene,
>
> You are correct that using a ZINB model would be problematic. The NB
> distribution applies to counted data (i.e., it is possible for any
> nonnegative count to occur in the outcome variable). When you have
> only categories, that requirement is not satisfied, no matter what
> value you choose to represent each category.
>
> I don't know whether the ordinal logit model has a zero-inflated
> version (I have not searched). Here "zero-inflated" would mean that
> the first category is inflated, since numerical values associated with
> the ordered categories are only labels. If someone has worked out
> such a model, you would still need to determine whether, in your data,
> the assumption of proportional odds is reasonable. You could try an
> ordinal logistic regression model with your data as they stand, and
> see what happens.
>
> As an exploratory step, you could fit a binary logit model to "0
> times" versus "1 or more times"; that would address the question of
> crossing the threshold into self-injurious behavior. You could then
> work with only the nonzero categories and dichotomize the outcome
> variable at each of the category boundaries (or some of them) and fit
> a binary logit model to each dichotomized outcome. Comparison of the
> coefficients on the predictor variables among those models would give
> you an indication of whether the proportional odds model is
> reasonable.
>
> You didn't describe the sorts of predictor variables that you have.
> Other analytic approaches may be possible.
>
> David Hoaglin