
st: Modeling an independent variable with a very high data density at x=0

From	Allan Garland <[email protected]>
To	[email protected]
Subject	st: Modeling an independent variable with a very high data density at x=0
Date	Fri, 05 Jun 2009 19:40:58 -0700

I'm doing a logistic regression using a non-negative, continuous independent variable X, for which about 60% of cases have X=0. It seems to me that simply including X in the model is problematic, since many cases with Y=0 and many others with Y=1 are likely to have X=0. I can think of two possible approaches to modeling X, and would like some feedback on them, along with any other thoughts on how to handle this situation.
a) Divide X into m categories and represent it with m-1 dummy variables in the model.
b) Include X in the model, and also include a binary variable Z such that Z=1 when X=0 and Z=0 otherwise. Then the effect of X=0 is given by the coefficient of Z, and the effect of X>0 is given purely by the coefficient of X itself (since then Z=0).
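
To make the two codings concrete, here is a minimal sketch in Python (used purely for illustration; nothing here is Stata syntax). The function names, the cutpoints, and the coefficient values b0, bz and bx are all hypothetical, and the choice of X=0 as the reference category in (a) is an assumption rather than something the approach requires.

```python
from bisect import bisect_left


def dummy_encoding(x, cutpoints):
    """Approach (a): m = len(cutpoints) + 2 categories, m - 1 dummies.

    Assumed (hypothetical) binning: category 0 is x == 0 exactly and
    serves as the reference (all dummies 0); positive values fall into
    the intervals (0, c1], (c1, c2], ..., (c_last, infinity).
    """
    if x < 0:
        raise ValueError("X is assumed to be non-negative")
    category = 0 if x == 0 else 1 + bisect_left(cutpoints, x)
    n_categories = len(cutpoints) + 2
    return [1 if category == k else 0 for k in range(1, n_categories)]


def indicator_encoding(x):
    """Approach (b): return (Z, X), with Z = 1 when x == 0, else 0."""
    if x < 0:
        raise ValueError("X is assumed to be non-negative")
    return (1 if x == 0 else 0, x)


def linear_predictor(x, b0, bz, bx):
    """Log-odds under approach (b): b0 + bz*Z + bx*X.

    At X = 0 this reduces to b0 + bz; for X > 0 it is b0 + bx*X.
    """
    z, x_value = indicator_encoding(x)
    return b0 + bz * z + bx * x_value


# Made-up cutpoints and coefficients, for illustration only.
cuts = [1.0, 5.0]
for x in (0, 1.0, 3, 7):
    print(x, dummy_encoding(x, cuts), indicator_encoding(x),
          linear_predictor(x, b0=-1.0, bz=0.5, bx=0.2))
```

Under (b), the intercept b0 is the log-odds that the fitted line for X>0 extrapolates to at X=0, so the coefficient of Z measures how far the X=0 group sits from that extrapolated value.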

Allan

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- st: Re: Modeling an independent variable with a very high data density at x=0
  - From: "Joseph Coveney" <[email protected]>
