
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: zero-inflated analyses: when do you decide that is zero-inflated?


From   "Cris Dogaru (Oregon State University)" <[email protected]>
To   [email protected]
Subject   Re: st: zero-inflated analyses: when do you decide that is zero-inflated?
Date   Tue, 16 Jul 2013 12:04:30 +0200

... however, the question remains: for a legitimate
Poisson/negative binomial variable, when do we decide it is
zero-inflated?
Cris
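
One informal way to approach the question is to compare the observed share of zeros with the share a plain Poisson distribution with the same mean would predict. The sketch below does this in Python (not Stata, purely for illustration) using the frequencies from the table quoted further down in this thread. It is a descriptive check only, not a formal test: an excess of zeros relative to Poisson can also reflect ordinary overdispersion, and the objection that this outcome is bounded at 4 still applies.

```python
import math

# Observed frequencies of spt_number (0..4), copied from the table in this thread.
observed = {0: 853, 1: 286, 2: 176, 3: 105, 4: 43}

n = sum(observed.values())
mean = sum(k * f for k, f in observed.items()) / n
# Sample variance (divisor n - 1), for comparison with the mean.
variance = sum(f * (k - mean) ** 2 for k, f in observed.items()) / (n - 1)


def poisson_pmf(k, lam):
    """Probability of k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


observed_zero_share = observed[0] / n
poisson_zero_share = poisson_pmf(0, mean)

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.2f}")
print(f"observed share of zeros:          {observed_zero_share:.3f}")
print(f"Poisson share of zeros (same mean): {poisson_zero_share:.3f}")
```

With these data the observed share of zeros is about 58%, against roughly 46% under a Poisson with the same mean, and the variance (about 1.20) exceeds the mean (about 0.77). That pattern is compatible with zero inflation but also with overdispersion alone, so it cannot settle the question by itself.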

On Tue, Jul 16, 2013 at 11:55 AM, Cris Dogaru (Oregon State
University) <[email protected]> wrote:
> Dear David,
> I see what you are saying, and you are right. Theoretically I could
> still consider it a truncated version (we could have administered
> 10 or 20 skin prick tests for separate allergens), but conceptually
> my outcome is not a count variable (counting events); it is rather a
> set of indicator variables for a latent construct (atopy or
> sensitization). That leaves aside that the cutoff for a "positive"
> test is arbitrary (a skin reaction 3 mm in diameter or larger). The
> tests are indeed associated, as one would expect. According to the
> literature (using factor analysis), they tend to cluster (indoor,
> outdoor, food, inhaled, etc. allergens).
>
> I will probably settle for a two-part model, as Peter Lachenbruch
> suggests, but I will fit it for each test individually, using the
> actual size of the skin reaction in mm. There are plenty of zeros (and
> I can recode those <3 mm to 0 as well, to stick with the commonly used
> threshold), so I will have a two-part model with a logit/regress
> combination (I can use the user-written tpm program).
>
> One of the co-authors suggested analyzing "number of positive tests",
> and that led me to the negative binomial/Poisson approaches. An
> ordinal logit model does indeed seem more appropriate.
>
> Many thanks
>
> Cristian Dogaru
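
The two-part idea described in the message above can be illustrated without covariates: part one is the probability of a positive reaction, part two is the mean reaction size among positives, and their product recovers the overall mean. The following is a minimal Python sketch under stated assumptions. The wheal sizes are hypothetical values invented for illustration, not data from this thread. In the regression version described above, part one would be a logit and part two a linear regression fitted to the positives only.

```python
from statistics import mean

THRESHOLD_MM = 3.0  # cut-off for a "positive" test, as mentioned in the thread

# HYPOTHETICAL wheal diameters in mm for one allergen (not data from the thread).
wheal_mm = [0.0, 0.0, 1.5, 0.0, 4.0, 0.0, 6.5, 2.0, 0.0, 3.0]

# Recode reactions below the threshold to 0, as the message proposes.
recoded = [x if x >= THRESHOLD_MM else 0.0 for x in wheal_mm]

# Part 1: probability that the reaction is positive.
p_positive = sum(x > 0 for x in recoded) / len(recoded)

# Part 2: mean reaction size among the positives only.
mean_if_positive = mean(x for x in recoded if x > 0)

# The two parts multiply back to the overall mean of the recoded outcome.
overall_mean = p_positive * mean_if_positive

print(f"P(positive)            = {p_positive:.2f}")
print(f"mean size | positive   = {mean_if_positive:.2f} mm")
print(f"overall mean (product) = {overall_mean:.2f} mm")
```

Keeping the two parts separate lets a covariate affect whether a reaction occurs differently from how it affects the size of the reaction once one occurs.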
>
>
>
> On Mon, Jul 15, 2013 at 8:36 PM, David Hoaglin <[email protected]> wrote:
>> Dear Cris,
>>
>> I don't think that outcome variable is a candidate for being Poisson
>> or negative binomial, either zero-inflated or not.  Both the Poisson
>> distributions and the negative binomial distributions assign positive
>> probability to all nonnegative integers, not just 0 through 4.  Both of
>> those families of distributions have truncated versions, but the
>> process underlying your data doesn't look like it involves truncation.
>>
>> Your outcome variable is a legitimate numerical variable, but people
>> sometimes use an ordinal logit model for such data when the number of
>> values is small.
>>
>> Would it be appropriate to look at the association(s) among the
>> positives on the 4 tests?  If positive reactions to the 4 allergens
>> were unrelated (i.e., independent), you could predict the numbers of
>> positives on the 4 from the marginal probabilities of a positive
>> reaction to the individual allergens.  It may be instructive to list
>> the 16 possible combinations and their frequencies in your data.
>>
>> David Hoaglin
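
The independence check suggested in the message above can be sketched as follows: given the marginal probability of a positive reaction to each allergen, compute the distribution of the number of positives that independence would imply, then compare it with the observed shares. This is a minimal Python sketch. The four marginal probabilities are hypothetical, since the thread does not report them; they were chosen only so that they sum to the reported mean of 0.77. The observed frequencies come from the table in the original post.

```python
from itertools import product


def count_distribution(probs):
    """Distribution of the number of positives among independent tests."""
    dist = [1.0]  # before any test: zero positives with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)   # this test is negative
            new[k + 1] += mass * p     # this test is positive
        dist = new
    return dist


# HYPOTHETICAL marginal probabilities of a positive test (not from the thread).
marginals = [0.25, 0.20, 0.18, 0.14]

# The 16 possible positive/negative combinations across the 4 tests.
combos = list(product([0, 1], repeat=len(marginals)))

expected_share = count_distribution(marginals)

# Observed frequencies of 0..4 positives, from the table in the original post.
observed = [853, 286, 176, 105, 43]
n = sum(observed)

for k, (obs, exp_share) in enumerate(zip(observed, expected_share)):
    print(f"{k} positives: observed {obs / n:.3f}, independence {exp_share:.3f}")
```

If the observed shares at 0 and at 3 or 4 positives are both clearly larger than the independence prediction, that points to positive association among the tests rather than to a separate zero-generating process. With the real marginal probabilities in place of the hypothetical ones, the same comparison can be made for each of the 16 combinations in `combos`.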
>>
>> On Mon, Jul 15, 2013 at 10:49 AM, Cris Dogaru (Oregon State
>> University) <[email protected]> wrote:
>>> Dear Stata users,
>>>
>>> I couldn't find an answer to this apparently simple question: how does
>>> one decide that a distribution is zero-inflated, so that one can use
>>> zero-inflated Poisson regression or zero-inflated negative binomial
>>> regression?
>>>
>>> More concretely: my outcome variable is the number of positive skin prick
>>> tests (done for 4 allergens, so the number ranges from 0 to 4).
>>> Here are the summary tables; is this zero-inflated?
>>>
>>>
>>> spt_number -- number of positive (wheal>3mm) STP
>>> -----------------------------------------------------------
>>>               |      Freq.    Percent      Valid       Cum.
>>> --------------+--------------------------------------------
>>> Valid   0     |        853      57.02      58.30      58.30
>>>         1     |        286      19.12      19.55      77.85
>>>         2     |        176      11.76      12.03      89.88
>>>         3     |        105       7.02       7.18      97.06
>>>         4     |         43       2.87       2.94     100.00
>>>         Total |       1463      97.79     100.00
>>> Missing .     |         33       2.21
>>> Total         |       1496     100.00
>>> -----------------------------------------------------------
>>>
>>> . fsum spt_number
>>>
>>>    Variable |        N     Mean       SD      Min      Max
>>> ------------+---------------------------------------------
>>>  spt_number |     1463     0.77     1.10     0.00     4.00
>>>
>>> Many thanks
>>> Cristian Dogaru
>>> ISPM, University of Bern
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

