Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | David Hoaglin <dchoaglin@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: zero-inflated analyses: when do you decide that is zero-inflated? |
Date | Tue, 16 Jul 2013 09:24:47 -0400 |
Dear Cris,

Since you have the actual size of the skin reaction, a two-part model seems a good choice. It would be of interest to compare the result of using the actual sizes that are < 3 mm with the result of recoding those sizes to 0. If those results differ in interesting ways, you could see what happens with some thresholds lower than 3 mm.

In the regression part of the two-part model, you may want to consider using a transformed scale for the size. For example, the variability in size may be greater for larger wheals. If so, the square-root scale or the log scale may be appropriate (either by actual transformation or by a version of generalized linear models known as quasi-likelihood, which can be done, as I understand it, with the -poisson- command).

To get a graphical indication of whether a set of frequencies resembles a Poisson distribution (or, for example, has excess zeros), you could try the "Poissonness plot" (Hoaglin 1980; Hoaglin and Tukey 1985; pardon the shameless plug). The basic version would be easy to do. The 1985 chapter discusses a similar plot for negative binomial distributions, once one chooses a value for one of the parameters.

David Hoaglin

Hoaglin, D.C. (1980). A Poissonness plot. The American Statistician 34:146-149.

Hoaglin, D.C. and Tukey, J.W. (1985). Checking the shape of discrete distributions. In Exploring Data Tables, Trends, and Shapes (D.C. Hoaglin, F. Mosteller, and J.W. Tukey, eds.). New York: Wiley, pp. 345-416.

On Tue, Jul 16, 2013 at 5:55 AM, Cris Dogaru (Oregon State University) <statamplus@gmail.com> wrote:
> Dear David,
>
> I see what you are saying, and you are actually right. Theoretically I
> can still consider it a truncated version (we could have administered
> 10 or 20 skin prick test to separate allergens), but indeed,
> conceptually my outcome is not a count variable (counting events), but
> rather a set of indicator variables for a latent construct (atopy or
> sensitization); this leaving aside that the decision for a "positive"
> test is arbitrary (skin reaction is 3mm in diameter or larger). The
> tests are indeed associated, as one would actually expect. From the
> literature (using factor analysis technique), they tend to cluster
> (indoor, outdoor, food, inhaled, etc allergens).
>
> I will settle, probably, for a two-part model, as Peter Lachenbruch
> suggests, but I will do it for each test individually, taking the
> actual size of the skin reaction, in mm. There's plenty of zeros (and
> I can recode those <3 mm to 0 as well, to stick with the commonly used
> threshold), so I will have a two-part model with a logit/regress
> combination (I can use the user-written tpm program).
>
> One of the co-authors suggested to analyze "number of positive tests",
> and that got me into the negative binomial/Poisson approaches. An
> ordinal logit model seems more appropriate indeed.
>
> Many thanks
>
> Cristian Dogaru
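
The basic Poissonness plot mentioned in the reply can be sketched as follows. This is a minimal illustration in Python, not code from the thread: the function names `poissonness_points` and `fit_line` and the example frequencies are hypothetical. For observed frequencies n_k of the count k, with total N, the plot shows ln(k! * n_k / N) against k. If the counts follow a Poisson distribution with mean lambda, the points fall close to a straight line with slope ln(lambda) and intercept -lambda, so a point at k = 0 lying well above the line through the other points suggests excess zeros.

```python
import math


def poissonness_points(freqs):
    """Return (k, phi_k) pairs with phi_k = ln(k! * n_k / N).

    freqs[k] is the observed frequency of the count k. Counts with
    zero frequency are skipped because their logarithm is undefined.
    """
    total = sum(freqs)
    points = []
    for k, n_k in enumerate(freqs):
        if n_k > 0:
            # lgamma(k + 1) equals ln(k!) and avoids large factorials.
            phi = math.log(n_k) + math.lgamma(k + 1) - math.log(total)
            points.append((k, phi))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical frequencies for counts 0..5, for illustration only.
    freqs = [120, 60, 55, 30, 12, 4]
    points = poissonness_points(freqs)
    # Fit the line to k >= 1 only, then compare the k = 0 point with it.
    slope, intercept = fit_line(points[1:])
    print("estimated lambda from slope:", math.exp(slope))
    print("phi_0 minus fitted intercept:", points[0][1] - intercept)
```

Fitting the line to k >= 1 and then inspecting the k = 0 point is one simple way to look for excess zeros; it is a design choice of this sketch, not a procedure prescribed in the cited papers.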