Re: st: Binary model with many zeros and few ones
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Binary model with many zeros and few ones
Date: Fri, 6 Jan 2012 11:33:36 +0000
Zero inflation as I understand it applies to situations in which there
is some kind of mixture of individuals who are zero for one reason and
individuals who are zero or one for another reason. For example, many
people never visit football matches and some may visit football
matches but just didn't do so during some survey period. I don't
think your description here justifies that term. Some people might
want to describe your situation as one of rare events and you might
want to Google "Gary King rare events logit". But that said, I would
certainly try -logit- or -probit- first.
Nick
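
For a sense of scale, around 880 thousand observations with around 1.5% ones is roughly 13,000 events, so the ones are rare in proportion but not few in number. A minimal sketch of the "try -logit- or -probit- first" suggestion follows. It runs on simulated data, not the poster's dataset: the variable names y and x, the sample size, the seed, and the coefficients are all hypothetical choices made only to give an outcome with roughly 1.5% ones.

```stata
* Hypothetical simulated data: a binary outcome with roughly 1.5% ones.
clear
set seed 12345
set obs 100000
generate x = rnormal()
* The intercept (-4.3) and slope (0.5) are arbitrary illustrative values.
generate byte y = runiform() < invlogit(-4.3 + 0.5*x)

* Check how rare the event actually is.
tabulate y

* Logit: coefficients are on the log-odds scale.
logit y x
* Average marginal effect of x on Pr(y = 1).
margins, dydx(x)

* Probit: coefficients are on the standard normal (z) scale.
probit y x
* Marginal effects are directly comparable across the two models.
margins, dydx(x)
```

The raw coefficients from -logit- and -probit- are on different scales, so the -margins- output is probably the easier basis for comparing the two fits.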
On Fri, Jan 6, 2012 at 11:15 AM, Nikolaos Kanellopoulos
<[email protected]> wrote:
> I have a dataset of around 880 thousand observations and I want to measure as accurately as possible the relationship between certain variables and an event described by a binary variable. My dependent variable has very few ones (around 1.5% of the observations).
>
> My question (and I apologize in advance if this has been asked on Statalist before) is: what is the best way to analyse this “zero inflated” binary variable? Is it OK to use a simple probit or logit model? Any suggestions or references are more than welcome.