Re: st: which -cmp- option to use for poisson model with count data?
From: "David Roodman ([email protected])" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: which -cmp- option to use for poisson model with count data?
Date: Mon, 7 May 2012 02:24:47 +0000
It is my understanding that the Poisson model is for counts of rare events, such as emergency room admissions--something we certainly wouldn't expect to be normally distributed if truly rare. On the other hand, as the events become more common (imagine emergency room admissions at a big hospital in a big city), the distribution of the counts will converge to the normal distribution, as it must by the Central Limit Theorem: a Poisson count with a large mean is a sum of many independent Poisson counts with small means. That is what I meant when I said that cmp would not be appropriate for low counts, but could be OK for high counts.
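The convergence described above can be checked numerically. The following is a minimal sketch in Python (not Stata, and nothing to do with how cmp works internally); the function names and the choice of means 1 and 100 are illustrative assumptions. It measures the largest gap between the Poisson CDF and the CDF of a normal distribution with the same mean and variance, using a continuity correction.

```python
import math

def poisson_pmf(k, lam):
    # Computed in log space so large k and lam do not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def max_cdf_gap(lam):
    """Largest absolute difference between the Poisson(lam) CDF and the
    CDF of a normal with mean lam and variance lam, evaluated at k + 0.5
    (continuity correction) for each integer k in a wide range."""
    sigma = math.sqrt(lam)
    upper = int(lam + 10 * sigma) + 10  # far enough into the right tail
    cumulative = 0.0
    gap = 0.0
    for k in range(upper + 1):
        cumulative += poisson_pmf(k, lam)
        approx = normal_cdf(k + 0.5, lam, sigma)
        gap = max(gap, abs(cumulative - approx))
    return gap

if __name__ == "__main__":
    for lam in (1.0, 5.0, 100.0):  # illustrative means only
        print(f"mean {lam:>6}: max CDF gap = {max_cdf_gap(lam):.4f}")
```

The gap shrinks as the mean grows, which is the sense in which a normal-error model might be tolerable for high counts but not for counts that top out at 5.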
However, cmp can be appropriate for low counts in other contexts. It is all a question of what we believe about the data generating process. Take, for example, the number of kids in a family. One can reasonably hypothesize that the propensity to have more or fewer kids is an unobserved, continuous variable that follows a normal distribution and manifests as 0, 1, 2, etc. Then ordered probit is entirely appropriate.
(The funny thing here is that you could argue that which model you use for number of kids could depend on whether pregnancies are planned or unplanned. If unplanned, I suppose the Poisson model is right!)
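To make the latent-variable reading of family size concrete, here is a minimal Python sketch of the ordered probit probabilities: a normally distributed propensity is cut at thresholds, and each interval maps to an observed count. The linear index value and the cutpoints are made-up illustrative numbers, not estimates from any data, and the function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, cutpoints):
    """Probabilities of outcomes 0, 1, ..., K for a latent variable
    y* = xb + e, with e standard normal, observed as the index of the
    interval between consecutive cutpoints that y* falls into."""
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [std_normal_cdf(b - xb) for b in bounds]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]

if __name__ == "__main__":
    cuts = [-0.5, 0.5, 1.5]    # hypothetical thresholds: 0, 1, 2, 3+ kids
    for xb in (0.0, 1.0):      # hypothetical values of the linear index
        probs = ordered_probit_probs(xb, cuts)
        print(xb, [round(p, 3) for p in probs])
```

A larger linear index shifts probability mass toward higher counts without any assumption that the counts themselves are normally distributed; only the unobserved propensity is.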
--David
--------
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: which -cmp- option to use for poisson model with count data?
Date: Thu, 3 May 2012 09:23:26 +0100
I am not annoyed. I am concerned that time is being wasted because (a)
we don't have enough information to give you good replies and (b) you
don't seem to understand much of the advice being given you.
David said
"Unless the counts are high, count data can't be realistically modeled
as the outcome of a single underlying process consisting of a linear
functional plus a normally distributed error."
That was his advice about using -cmp-. He's the author and an expert.
If you want to go against his advice, that's your call, but in the
only example you have given, your counts have a maximum of 5.
Whether your count data can be treated as ordered probit is something
on which experts have different tastes and judgements. Counts that can
go 0,...,5 could be treated as graded variables 0 < 1 < 2 < 3 < 4 < 5.
I can't comment on the example you refer to, as I have not studied it.
On terminology: I wouldn't describe a counted variable as a
categorical variable, although counted variables do certainly appear
in categorical data analysis texts.
Nick