Stata: Data Analysis and Statistical Software



Re: st: How to model a positive continuous dependent variable with many zeros?

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: How to model a positive continuous dependent variable with many zeros?
Date	Thu, 2 Jun 2011 12:47:39 -0400

Ah, you are asking about the combination. The expected value of the duration Y for a person with covariates X is:

E(Y|X) = P(Y>0|X)*E(Y|Y>0,X)

Here P(Y>0|X) comes from the logistic (or other) binary model, and E(Y|Y>0,X) comes from the survival model. However, you have multiple episodes per person, so a single two-part model will not suffice. Since you are really interested in the proportion of total time spent in seclusion, consider analyzing that proportion directly. See Kit Baum's Stata Journal tip at http://www.scribd.com/doc/55505304/61/Stata-tip-63-Modeling-proportions.
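To make the combination E(Y|X) = P(Y>0|X)*E(Y|Y>0,X) concrete, here is a minimal sketch in Python rather than Stata. It is an illustration under stated assumptions, not a fitted model: the coefficients are made up, the function names are hypothetical, and a log link is assumed for the mean of the positive part where a survival model would supply that mean in its own way. It also ignores the multiple-episodes issue.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def two_part_mean(x, beta_binary, beta_positive):
    """Combine the two parts: E(Y|X) = P(Y>0|X) * E(Y|Y>0,X).

    x             : covariate values, with a leading 1.0 for the intercept
    beta_binary   : hypothetical logit coefficients for P(Y>0|X)
    beta_positive : hypothetical log-scale coefficients for E(Y|Y>0,X)
    """
    # Part 1: probability of any seclusion, from the binary (logit) model.
    p_positive = logistic(sum(b * v for b, v in zip(beta_binary, x)))
    # Part 2: mean duration among those with Y > 0. A log link is ASSUMED
    # here; a survival model would supply this mean in its own way.
    mean_positive = math.exp(sum(b * v for b, v in zip(beta_positive, x)))
    return p_positive * mean_positive


# Made-up numbers, for illustration only: intercept plus one covariate.
x = [1.0, 0.0]
expected_duration = two_part_mean(x, [0.0, 0.8], [math.log(10.0), 0.3])
print(expected_duration)
```

The two parts are estimated separately (the second on the non-zero observations only), and the predictions are multiplied afterwards for each covariate pattern.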

Steve
[email protected]

On Wed, Jun 1, 2011 at 2:38 AM, Steve Samuels wrote:

These are known as "two-part" or "hurdle" models, and a Google search will find hundreds of references.

On Wed, Jun 1, 2011 at 2:38 AM, Adriaan Hoogendoorn <[email protected]> wrote:

Thank you, Hitesh (and Maarten in a previous post), for your help.
Your help is highly appreciated.

The situation Maarten described appears to be exactly the case:
Clinic staff members try to reduce total seclusion durations (at the
clinic level) by ending seclusions as soon as possible, at the risk
of introducing more seclusion episodes. Total seclusion duration
(relative to the total time spent in the clinic) seems the
appropriate quantity for evaluating seclusion policies. We find that
total seclusion durations differ substantially across clinics. The
explanation clinics give for having higher total seclusion durations
than other clinics is that they have “harder” patients, as
Maarten suggested.

Explaining these differences from patient characteristics (and some
clinic characteristics) is exactly what this study is about.

Your suggestion of combining the modeled zeros (from a logistic
regression, or from the Poisson as Maarten suggested) with a model for
non-zero duration (from GLM or Survival Analysis) seems very attractive.
However, I have no experience with how to do this. Do you mean: after
modeling the zeros, delete the zeros from the data set and model the
non-zeros using the same predictors?

This would provide me with two sets of parameters. Do you think I can
use these two sets of model parameters to estimate the total seclusion
duration for a given ward with a given set of patients?

I’ve never seen such a combined model in the scientific literature, which
may well be my mistake. Do you have any references showing how such a
combination was applied and discussed?
