Re: st: RE: synthetic ZINB
From: Nick Cox <[email protected]>
To: [email protected], [email protected]
Subject: Re: st: RE: synthetic ZINB
Date: Mon, 6 Jun 2011 07:10:24 +0100
Joe and other Statalist members:
Please remember our longstanding request _not_ to post attachments.
Most members of the list will have received these as unintelligible
gibberish.
In this case, the two .do files posted are in fact readable via
http://www.stata.com/statalist/archive/2011-06/msg00184.html
but not via the Harvard archives.
Nick
On Mon, Jun 6, 2011 at 12:55 AM, <[email protected]> wrote:
> Oops. In the zinb_syn.do code I neglected to amend the label just prior to
> the synthetic zinb model at the end. The hurdle model caption was retained.
> I am attaching the correct zinb_syn.do program, and a similar program for
> synthetic ZIP. The code runs OK; it was simply the caption. My apologies.
>
> Joseph Hilbe
>
>
>
> -----Original Message-----
> From: jhilbe <[email protected]>
> To: statalist <[email protected]>
> Sent: Sun, Jun 5, 2011 4:31 pm
> Subject: RE: synthetic ZINB
>
> Statalisters:
>
> I happened to see a discussion of synthetic ZINB data on the Statalist
> digest today. There is an entirely different way to approach this, one
> that creates a full synthetic ZINB model. I wrote about creating
> synthetic models in the first issue of the 2010 Stata Journal, and
> discuss them much more fully in my recently published second edition
> of "Negative Binomial Regression" (Cambridge University Press, 572
> pages). The book discusses nearly every count model in the literature,
> providing both Stata and R code for examples. Output is given in Stata,
> except for the final chapter on Bayesian NB models. I also develop a
> variety of synthetic count models in which it is simple to specify your
> chosen synthetic predictors as continuous, binary, or multilevel
> categorical. You may employ as many predictors as you wish, from an
> intercept-only model to one with more than 10 predictors. The user
> specifies the desired coefficients for all predictors, as well as the
> levels of each categorical predictor. For NB models you also declare
> the value of alpha you wish to model.
>
> It was quite simple to convert the synthetic NB2-logit hurdle model I
> give in the book to a zero-inflated NB model, with a logit binary
> component. I am attaching it to this message, but provide it below my
> signature as well, together with a sample run. Note that the
> coefficient values are listed in the comments above the active code,
> but the values actually used are set in the code itself where
> indicated. I made the predictors here simple standard normal variates,
> but more complex structures are described in the book and in the Stata
> Journal article.
>
> I find synthetic models like this very useful for testing model
> assumptions.
>
> Best, Joseph Hilbe
>
> ZINB_SYN.DO
> ==================================================
> * Zero inflated Negative binomial with logit as binary component
> * Joseph Hilbe 5Jun2011 zinb_syn.do
> * LOGIT (inflate): x1=.9, x2=.1, _c=.2
> * NB2 : x1=.75, x2=-1.25, _c=2, alpha=.5
> clear
> set obs 50000
> set seed 1000
> gen x1 = invnorm(runiform())
> gen x2 = invnorm(runiform())
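> * NEGATIVE BINOMIAL- NB2
> gen xb = 2 + 0.75*x1 - 1.25*x2
> gen a = .5
> gen ia = 1/a
> gen exb = exp(xb)
> * gamma heterogeneity with mean 1 and variance a
> gen xg = rgamma(ia, a)
> gen xbg = exb * xg
> * Poisson with a gamma-mixed mean gives NB2 counts
> gen nby = rpoisson(xbg)
> * BERNOULLI
> * pi = probability of a structural zero
> gen pi =1/(1+exp(-(.9*x1 + .1*x2+.2)))
> * bernoulli = 1 keeps the NB2 count; bernoulli = 0 forces y to zero
> gen bernoulli = runiform()>pi
> gen zy = bernoulli*nby
> rename zy y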
> * NB2-LOGIT HURDLE
> zinb y x1 x2, inf(x1 x2) nolog
> =================================
>
>
>
> Zero-inflated negative binomial regression        Number of obs   =      50000
>                                                   Nonzero obs     =      19181
>                                                   Zero obs        =      30819
>
> Inflation model = logit                           LR chi2(2)      =   24712.97
> Log likelihood  = -88361.63                       Prob > chi2     =     0.0000
>
> ------------------------------------------------------------------------------
>            y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> y            |
>           x1 |   .7407043   .0066552   111.30   0.000     .7276604    .7537483
>           x2 |  -1.249479   .0067983  -183.79   0.000    -1.262804   -1.236155
>        _cons |   1.996782   .0069297   288.15   0.000       1.9832    2.010364
> -------------+----------------------------------------------------------------
> inflate      |
>           x1 |   .9047498   .0141011    64.16   0.000     .8771121    .9323875
>           x2 |    .095477   .0125229     7.62   0.000     .0709326    .1200213
>        _cons |   .2031966   .0121878    16.67   0.000      .179309    .2270841
> -------------+----------------------------------------------------------------
>     /lnalpha |  -.6778044   .0153451   -44.17   0.000    -.7078803   -.6477286
> -------------+----------------------------------------------------------------
>        alpha |   .5077305   .0077912                      .4926874    .5232329
> ------------------------------------------------------------------------------
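
One way to sanity-check synthetic data of this kind, offered only as a rough sketch and not as part of the posted do-files, is to compare the simulated share of zeros with the share implied by the zero-inflated mixture, P(y=0|x) = pi + (1 - pi)*(1 + alpha*mu)^(-1/alpha). The parameter values below are those used in the quoted code. The variable names mu, p0_implied and iszero are illustrative assumptions, and rnormal() stands in for invnorm(runiform()), so the draws will not reproduce the quoted run exactly.

```stata
* Sketch: simulated versus model-implied share of zeros in synthetic ZINB data
clear
set obs 50000
set seed 1000
gen x1 = rnormal()
gen x2 = rnormal()
* NB2 mean; alpha is the heterogeneity parameter from the quoted code
gen mu = exp(2 + 0.75*x1 - 1.25*x2)
scalar alpha = 0.5
* probability of a structural (always-zero) observation
gen pi = invlogit(0.2 + 0.9*x1 + 0.1*x2)
* gamma-Poisson mixture: NB2 counts with mean mu
gen nby = rpoisson(mu * rgamma(1/scalar(alpha), scalar(alpha)))
* structural zero with probability pi, NB2 count otherwise
gen y = cond(runiform() < pi, 0, nby)
* implied P(y = 0 | x) under the mixture
gen p0_implied = pi + (1 - pi)*(1 + scalar(alpha)*mu)^(-1/scalar(alpha))
gen iszero = (y == 0)
* the two means should be close if the data follow the intended model
summarize iszero p0_implied
```

For comparison, the quoted run reports 30819 zeros in 50000 observations, a share of about 0.616.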