Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | David Hoaglin <dchoaglin@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: which -cmp- option to use for poisson model with count data? |
Date | Thu, 3 May 2012 09:30:09 -0400 |
Laura,

That information on the dependent variable is a helpful start. A natural next question is what the frequency distribution of the counts looks like. When explanatory variables are involved, one can't necessarily judge by looking only at that frequency distribution, but it is a reasonable place to start.

If 0 is substantially more frequent than would be compatible with a Poisson distribution (judging by the nonzero counts), the data may come from a two-part process. A person may decide whether to seek advice from any expert; a logistic regression model might be appropriate for that part. Then, among people who decide to seek advice, the number of experts varies (according to a Poisson model or a negative binomial model). These are examples of a type of two-part model known as a hurdle model.

Another type of model is the zero-inflated Poisson or zero-inflated negative binomial. These are mixture models in which a count of 0 can come either from deciding not to seek expert advice or from deciding to seek expert advice but not actually consulting any experts (yet?). In the corresponding hurdle models, the Poisson or negative binomial distribution would be truncated at 0 (i.e., a count of 0 would not come from the Poisson or the negative binomial).

I hope this helps.

David Hoaglin

> It's the number of experts a person has sought advice from, so I think
> there is no upper limit, like number of children.
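
The difference between the hurdle and zero-inflated models may be easier to see from their probability functions. What follows is a minimal sketch in Python using only the standard library. The function names and the parameter values (lam, pi, p0) are hypothetical, chosen for illustration, and only the Poisson versions are shown.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a
    'structural' 0; otherwise it is drawn from Poisson(lam), which
    can itself produce a 0."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, p0):
    """Hurdle Poisson: a count of 0 occurs with probability p0 (the
    binary part); positive counts follow a Poisson truncated at 0,
    so the Poisson part contributes no zeros."""
    if k == 0:
        return p0
    return (1 - p0) * poisson_pmf(k, lam) / (1 - math.exp(-lam))

# Illustrative (assumed) parameter values.
lam, pi, p0 = 2.0, 0.3, 0.3

print("P(0), plain Poisson :", poisson_pmf(0, lam))
print("P(0), zero-inflated :", zip_pmf(0, lam, pi))
print("P(0), hurdle        :", hurdle_pmf(0, lam, p0))
```

In the zero-inflated model the probability of a 0 is pi plus a contribution from the Poisson component, whereas in the hurdle model it is exactly p0, because the truncated Poisson cannot yield a 0.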