
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Left truncation in survival analysis


From   Glenn Hoetker <[email protected]>
To   [email protected]
Subject   Re: st: Left truncation in survival analysis
Date   Thu, 28 Apr 2011 17:12:58 -0500

I suspect there is no truly "good" answer to this problem, but I would worry that systematically omitting cases that existed before the study began could introduce bias, since the omitted observations are almost certainly not missing completely at random.  Pick your poison, I suppose.

In an earlier paper (Hoetker, Mitchell and Swaminathan, Mgt. Science, 2007), my co-authors and I approached the problem as follows.  Our data (on the US auto industry) begin in 1918.  We have start dates for some, but not all, of the firms that existed by that point.  Of course, it was largely the smallest firms for which we lacked data, so omitting them would not be random.  The earliest known start date is 1911.  So, for the firms with unknown start dates, we generated random ones: 1911 plus a uniformly distributed variable running from 0 to 6 (thus, an origin in 1911-1917).  We also included a dummy variable set to one for firms with imputed start dates, and we used a piecewise exponential model, which some have proposed is more robust in this situation (see Olav Sorenson and Pino Audia (2000), American Journal of Sociology, 106(2), 424-462).  Imperfect, we know, but in our situation it seemed a reasonable choice relative to the alternatives.
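
To make the imputation step concrete, here is a minimal sketch in Python rather than Stata. It is not the code from the paper. The record layout (`start_year`, `imputed_start`, `age_at_entry`), the function name, and the seed are hypothetical, and drawing whole-year offsets (rather than a continuous uniform) is an assumption. Only the 1911 and 1918 dates and the 0 to 6 range come from the description above.

```python
import random

FIRST_KNOWN_START = 1911  # earliest known start date (from the post)
STUDY_START = 1918        # first year covered by the data (from the post)


def impute_start_years(firms, seed=12345):
    """Return copies of the firm records with unknown start years filled in.

    `firms` is a list of dicts whose hypothetical "start_year" key is None
    when the start date is unknown. The input records are not modified.
    """
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible
    max_offset = STUDY_START - FIRST_KNOWN_START - 1  # 6, i.e. 1911-1917
    result = []
    for firm in firms:
        rec = dict(firm)
        if rec["start_year"] is None:
            # Assumption: whole-year offsets 0..6, each equally likely.
            rec["start_year"] = FIRST_KNOWN_START + rng.randint(0, max_offset)
            rec["imputed_start"] = 1  # dummy flagging an imputed start date
        else:
            rec["imputed_start"] = 0
        # Age when observation begins: the delayed-entry time for a firm
        # that already existed in 1918 (added here for illustration only).
        rec["age_at_entry"] = STUDY_START - rec["start_year"]
        result.append(rec)
    return result


firms = [
    {"name": "A", "start_year": 1913},
    {"name": "B", "start_year": None},
    {"name": "C", "start_year": None},
]
imputed = impute_start_years(firms)
```

The `imputed_start` flag would then enter the hazard model as a covariate, so that any systematic difference between firms with known and imputed start dates is at least partly absorbed.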



Glenn Hoetker
Julian Simon Faculty Fellow in Business
Associate Professor (Law, Institute for Genomic Biology)
Director, Center for International Business Education and Research
Faculty Fellow, Academy for Entrepreneurial Leadership
University of Illinois
217-265-4081
[email protected]
http://www.business.uiuc.edu/ghoetker

On Apr 28, 2011, at 4:46 PM, Steven Samuels wrote:

> 
> I agree with this good advice.
> 
> 
> Steve
> [email protected]
> 
> On Apr 28, 2011, at 12:47 PM, Austin Nichols wrote:
> 
> Yigit-
> Rather than assuming a distribution for unobserved ongoing durations,
> it is probably better to use only new spells (i.e. throw out all
> truncated cases), which loses efficiency but more correctly reflects
> the uncertainty inherent in your data.
> 
> On Thu, Apr 28, 2011 at 11:32 AM, Steven Samuels <[email protected]> wrote:
>> Yigit-
>> 
>> Actually, there is an approach to left-truncation when start times are not known. It is outlined in Wooldridge (2002), pp. 703 & 718 (problem 20.8).  This approach requires a parametric model for the distribution of the unobserved starting times. It is not implemented in Stata.
>> 
>> In addition to describing this problem as one of "left-truncation", Wooldridge also uses the term "left censoring".  I believe that this usage is incorrect.
>> 
>> Reference: Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, Mass.: MIT Press.  (There is also a 2010 edition of this fine book.)
>> 
>> Steve
>> [email protected]
>> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
> 



