Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: zero-inflated analyses: when do you decide that is zero-inflated?
From: "Cris Dogaru (Oregon State University)" <[email protected]>
To: [email protected]
Subject: Re: st: zero-inflated analyses: when do you decide that is zero-inflated?
Date: Tue, 16 Jul 2013 12:04:30 +0200
... however, the question still remains: for a legitimate
Poisson/negative binomial variable, when do we decide it is
zero-inflated?
Cris
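
One common descriptive check for the question above is to compare the observed share of zeros with the share a plain Poisson distribution with the same mean would imply. The sketch below is in Python rather than Stata, uses the frequencies from the tabulation quoted further down in this thread, and the function name zero_share_check is made up for illustration. It is not a formal test: excess zeros relative to a Poisson can also arise from overdispersion alone, and, as David notes in his reply, this particular outcome is bounded at 4 and so is not really a Poisson or negative binomial candidate.

```python
import math

# Frequencies of spt_number, copied from the tabulation quoted later in the thread.
freq = {0: 853, 1: 286, 2: 176, 3: 105, 4: 43}


def zero_share_check(freq):
    """Compare the observed share of zeros with exp(-mean), the Poisson P(Y = 0)."""
    n = sum(freq.values())
    mean = sum(k * c for k, c in freq.items()) / n
    # Sample variance (denominator n - 1), to compare against the mean.
    variance = sum(c * (k - mean) ** 2 for k, c in freq.items()) / (n - 1)
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "observed_zero": freq.get(0, 0) / n,
        "poisson_zero": math.exp(-mean),
    }


result = zero_share_check(freq)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

If observed_zero is well above poisson_zero, the data have more zeros than a Poisson with that mean would produce. A variance well above the mean points to overdispersion, which a negative binomial model may already absorb without a separate zero-inflation component.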
On Tue, Jul 16, 2013 at 11:55 AM, Cris Dogaru (Oregon State
University) <[email protected]> wrote:
> Dear David,
> I see what you are saying, and you are actually right. Theoretically I
> can still consider it a truncated version (we could have administered
> 10 or 20 skin prick tests for separate allergens), but indeed,
> conceptually my outcome is not a count variable (counting events), but
> rather a set of indicator variables for a latent construct (atopy or
> sensitization); this leaving aside that the decision for a "positive"
> test is arbitrary (skin reaction is 3mm in diameter or larger). The
> tests are indeed associated, as one would actually expect. From the
> literature (using factor analysis technique), they tend to cluster
> (indoor, outdoor, food, inhaled, etc allergens).
>
> I will settle, probably, for a two-part model, as Peter Lachenbruch
> suggests, but I will do it for each test individually, taking the
> actual size of the skin reaction, in mm. There are plenty of zeros (and
> I can recode those <3 mm to 0 as well, to stick with the commonly used
> threshold), so I will have a two-part model with a logit/regress
> combination (I can use the user-written tpm program).
>
> One of the co-authors suggested analyzing "number of positive tests",
> and that got me into the negative binomial/Poisson approaches. An
> ordinal logit model seems more appropriate indeed.
>
> Many thanks
>
> Cristian Dogaru
>
>
>
> On Mon, Jul 15, 2013 at 8:36 PM, David Hoaglin <[email protected]> wrote:
>> Dear Cris,
>>
>> I don't think that outcome variable is a candidate for being Poisson
>> or negative binomial, either zero-inflated or not. Both the Poisson
>> distributions and the negative binomial distributions assign positive
>> probability to all nonnegative values, not just 0 through 4. Both of
>> those families of distributions have truncated versions, but the
>> process underlying your data doesn't look like it involves truncation.
>>
>> Your outcome variable is a legitimate numerical variable, but people
>> sometimes use an ordinal logit model for such data when the number of
>> values is small.
>>
>> Would it be appropriate to look at the association(s) among the
>> positives on the 4 tests? If positive reactions to the 4 allergens
>> were unrelated (i.e., independent), you could predict the numbers of
>> positives on the 4 from the marginal probabilities of a positive
>> reaction to the individual allergens. It may be instructive to list
>> the 16 possible combinations and their frequencies in your data.
>>
>> David Hoaglin
>>
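
David's independence check can be sketched as follows: enumerate the 16 positive/negative combinations of the 4 tests, multiply the marginal probabilities within each combination, and sum by number of positives. The sketch is in Python rather than Stata. The marginal probabilities are HYPOTHETICAL placeholders, since the per-allergen rates are not given in this thread, and the function name is made up for illustration.

```python
from itertools import product

# HYPOTHETICAL marginal probabilities of a positive test for each of the 4 allergens.
# Replace these with the observed per-allergen positive rates.
marginals = [0.25, 0.20, 0.15, 0.10]


def count_distribution_under_independence(marginals):
    """Return P(number of positives = k), k = 0..len(marginals), if tests are independent."""
    dist = [0.0] * (len(marginals) + 1)
    # Each combo is a tuple of 0/1 flags, one per test: 2**4 = 16 combinations here.
    for combo in product([0, 1], repeat=len(marginals)):
        prob = 1.0
        for positive, p in zip(combo, marginals):
            prob *= p if positive else (1.0 - p)
        dist[sum(combo)] += prob
    return dist


expected = count_distribution_under_independence(marginals)
for k, prob in enumerate(expected):
    print(f"{k} positives: {prob:.4f}")
```

Comparing these expected shares with the observed shares of 0 to 4 positives gives a rough sense of how far the tests depart from independence. With positively associated tests one would expect more people at both 0 and 4 positives than independence predicts.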
>> On Mon, Jul 15, 2013 at 10:49 AM, Cris Dogaru (Oregon State
>> University) <[email protected]> wrote:
>>> Dear Stata users,
>>>
>>> I couldn't find an answer to this apparently simple question: how does
>>> one decide that a distribution is zero-inflated, so that one can use
>>> zero-inflated Poisson regression or zero-inflated negative binomial
>>> regression?
>>>
>>> More concretely: my outcome variable is number of positive skin prick
>>> tests (done for 4 allergens, therefore the number has a range 0 to 4).
>>> Here are the summary tables; is this zero-inflated?
>>>
>>>
>>> spt_number -- number of positive (wheal>3mm) STP
>>> -----------------------------------------------------------
>>>               |      Freq.    Percent      Valid       Cum.
>>> --------------+--------------------------------------------
>>> Valid   0     |        853      57.02      58.30      58.30
>>>         1     |        286      19.12      19.55      77.85
>>>         2     |        176      11.76      12.03      89.88
>>>         3     |        105       7.02       7.18      97.06
>>>         4     |         43       2.87       2.94     100.00
>>>         Total |       1463      97.79     100.00
>>> Missing .     |         33       2.21
>>> Total         |       1496     100.00
>>> -----------------------------------------------------------
>>>
>>> . fsum spt_number
>>>
>>>    Variable |     N     Mean       SD      Min      Max
>>> ------------+---------------------------------------------
>>>  spt_number |  1463     0.77     1.10     0.00     4.00
>>>
>>> Many thanks
>>> Cristian Dogaru
>>> ISPM, University of Bern
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/