
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: zero-inflated analyses: when do you decide that is zero-inflated?


From   "Cris Dogaru (Oregon State University)" <[email protected]>
To   [email protected]
Subject   Re: st: zero-inflated analyses: when do you decide that is zero-inflated?
Date   Tue, 16 Jul 2013 11:55:48 +0200

Dear David,
I see what you are saying, and you are right. In theory I can still
consider it a truncated version (we could have administered 10 or 20
skin prick tests for separate allergens), but conceptually my outcome
is not a count variable (a count of events); it is rather a set of
indicator variables for a latent construct (atopy or sensitization).
This leaves aside the fact that the cutoff for a "positive" test is
arbitrary (a skin reaction 3 mm in diameter or larger). The tests are
indeed associated, as one would expect. In the literature,
factor-analytic studies find that they tend to cluster (indoor,
outdoor, food, inhaled, etc. allergens).
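
David's suggestion (quoted below) of predicting the number of positives
from the marginal probabilities under independence can be sketched as
follows. This is a minimal illustration, not the analysis from the study:
the four marginal probabilities are hypothetical values chosen only so
that they sum to roughly the observed mean of 0.77, and the function name
count_distribution is made up for this example. The observed frequencies
come from the summary table quoted below.

```python
from itertools import product

def count_distribution(marginals):
    """Distribution of the number of positive tests if the tests were independent."""
    dist = [0.0] * (len(marginals) + 1)
    # Enumerate every positive/negative combination (16 for 4 tests).
    for combo in product([0, 1], repeat=len(marginals)):
        prob = 1.0
        for positive, p in zip(combo, marginals):
            prob *= p if positive else 1.0 - p
        dist[sum(combo)] += prob
    return dist

# ASSUMPTION: hypothetical marginal probabilities for the 4 allergens
# (illustrative only; the real values would come from the data).
marginals = [0.25, 0.20, 0.18, 0.14]
expected = count_distribution(marginals)

# Observed frequencies of 0..4 positive tests, from the quoted table.
observed_counts = [853, 286, 176, 105, 43]
n = sum(observed_counts)
observed = [c / n for c in observed_counts]

for k, (e, o) in enumerate(zip(expected, observed)):
    print(f"{k} positives: independent {e:.3f}, observed {o:.3f}")
```

If the observed shares of 0 and of 3 or 4 positives are both clearly
larger than the independence prediction, that is consistent with the
tests being positively associated rather than with a zero-inflated count
process.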

I will probably settle for a two-part model, as Peter Lachenbruch
suggests, but I will fit it for each test individually, using the
actual size of the skin reaction in mm. There are plenty of zeros (and
I can recode reactions <3 mm to 0 as well, to stick with the commonly
used threshold), so I will have a two-part model with a logit/regress
combination (I can use the user-written tpm program).
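
The logic of the two-part approach can be sketched without covariates.
This is only an illustration under stated assumptions: the wheal sizes
are hypothetical values, not study data, and the two parts are reduced
to a simple proportion and a simple mean, where the actual model would
use a logit for the first part and a regression for the second.

```python
from statistics import mean

# ASSUMPTION: hypothetical wheal diameters in mm for one allergen
# (illustrative values only).
wheal_mm = [0, 0, 2, 0, 4.5, 0, 3, 6, 0, 1.5, 0, 5]
THRESHOLD_MM = 3.0  # reactions below this size are recoded to 0

recoded = [x if x >= THRESHOLD_MM else 0.0 for x in wheal_mm]

# Part 1: probability of any positive reaction
# (with covariates, this is where the logit model goes).
positive = [x for x in recoded if x > 0]
p_positive = len(positive) / len(recoded)

# Part 2: mean reaction size among the positives only
# (with covariates, this is where the regression goes).
mean_if_positive = mean(positive)

# The overall expected size is the product of the two parts.
expected_size = p_positive * mean_if_positive
print(p_positive, mean_if_positive, expected_size)
```

Without covariates the product of the two parts equals the plain mean of
the recoded sizes; the split becomes useful once each part is allowed
its own predictors.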

One of the co-authors suggested analyzing "number of positive tests",
and that led me to the negative binomial/Poisson approaches. An
ordinal logit model does indeed seem more appropriate.
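
For reference, a rough descriptive check on the original question
compares the observed share of zeros with the share a Poisson
distribution with the same mean would predict, and the variance with the
mean. The sketch below uses the frequencies from the summary table
quoted below; the helper name poisson_pmf is made up for this example.
It is a descriptive comparison only, not a formal test of zero
inflation, and it also shows David's point that a Poisson distribution
puts some probability above 4.

```python
from math import exp, factorial

# Frequencies of 0..4 positive tests, from the quoted summary table.
counts = {0: 853, 1: 286, 2: 176, 3: 105, 4: 43}
n = sum(counts.values())

mean_count = sum(k * c for k, c in counts.items()) / n
# Sample variance (divisor n - 1).
variance = sum(c * (k - mean_count) ** 2 for k, c in counts.items()) / (n - 1)

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)

observed_zero_share = counts[0] / n
poisson_zero_share = poisson_pmf(0, mean_count)

# Probability a Poisson with this mean assigns to values above 4,
# which the outcome can never take.
poisson_mass_above_4 = 1.0 - sum(poisson_pmf(k, mean_count) for k in range(5))

print(f"mean {mean_count:.2f}, variance {variance:.2f}")
print(f"zeros: observed {observed_zero_share:.3f}, Poisson {poisson_zero_share:.3f}")
print(f"Poisson probability above 4: {poisson_mass_above_4:.4f}")
```

More zeros than the Poisson prediction and a variance above the mean
point to overdispersion, which a negative binomial can often absorb
without a zero-inflation component; neither finding addresses the fact
that the outcome is bounded at 4.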

Many thanks

Cristian Dogaru



On Mon, Jul 15, 2013 at 8:36 PM, David Hoaglin <[email protected]> wrote:
> Dear Cris.
>
> I don't think that outcome variable is a candidate for being Poisson
> or negative binomial, either zero-inflated or not.  Both the Poisson
> distributions and the negative binomial distributions assign positive
> probability to all nonnegative integers, not just 0 through 4.  Both
> of those families of distributions have truncated versions, but the
> process underlying your data doesn't look like it involves truncation.
>
> Your outcome variable is a legitimate numerical variable, but people
> sometimes use an ordinal logit model for such data when the number of
> values is small.
>
> Would it be appropriate to look at the association(s) among the
> positives on the 4 tests?  If positive reactions to the 4 allergens
> were unrelated (i.e., independent), you could predict the numbers of
> positives on the 4 from the marginal probabilities of a positive
> reaction to the individual allergens.  It may be instructive to list
> the 16 possible combinations and their frequencies in your data.
>
> David Hoaglin
>
> On Mon, Jul 15, 2013 at 10:49 AM, Cris Dogaru (Oregon State
> University) <[email protected]> wrote:
>> Dear Stata users,
>>
>> I couldn't find an answer to this apparently simple question: how does
>> one decide that a distribution is zero-inflated, so that one can use
>> zero-inflated Poisson regression or zero-inflated negative binomial
>> regression?
>>
>> More concretely: my outcome variable is the number of positive skin
>> prick tests (done for 4 allergens, so the number ranges from 0 to 4).
>> Here are the summary tables; is this zero-inflated?
>>
>>
>> spt_number -- number of positive (wheal>3mm) SPT
>> -----------------------------------------------------------
>>               |      Freq.    Percent      Valid       Cum.
>> --------------+--------------------------------------------
>> Valid   0     |        853      57.02      58.30      58.30
>>         1     |        286      19.12      19.55      77.85
>>         2     |        176      11.76      12.03      89.88
>>         3     |        105       7.02       7.18      97.06
>>         4     |         43       2.87       2.94     100.00
>>         Total |       1463      97.79     100.00
>> Missing .     |         33       2.21
>> Total         |       1496     100.00
>> -----------------------------------------------------------
>>
>> . fsum spt_number
>>
>>    Variable |        N     Mean       SD      Min      Max
>> ------------+---------------------------------------------
>>  spt_number |     1463     0.77     1.10     0.00     4.00
>>
>> Many thanks
>> Cristian Dogaru
>> ISPM, University of Bern
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

