
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Artificial censoring in survival analysis


From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Artificial censoring in survival analysis
Date   Fri, 5 Aug 2011 17:46:31 -0400

--

Thanks for your detailed answer.  This looks like an excellent data set.  I suggest that you use all the observations you have: follow each person until they first leave unemployment, and censor in 2006 those who never became employed.  I can see no advantage in ending observation after 24 months for everybody; on the contrary, it loses information.


Steve
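
The difference between the two censoring schemes can be sketched in a few lines. This is only an illustration in Python, not Stata code from the thread: the names `Spell`, `duration_and_event`, `month_index`, and `STUDY_END`, the month-index convention, and the three example individuals are all made up for the sketch. It assumes the data end in December 2006 and that month "1" is the first month of unemployment, as described in the quoted message below.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def month_index(year: int, month: int) -> int:
    """Map a calendar (year, month) to a running month number."""
    return year * 12 + (month - 1)


# Assumption: administrative data stop in December 2006.
STUDY_END = month_index(2006, 12)


@dataclass
class Spell:
    start_month: int                  # calendar index of month "1" of unemployment
    exit_month: Optional[int] = None  # month of first exit; None if never observed


def duration_and_event(spell: Spell, cap: Optional[int] = None) -> Tuple[int, int]:
    """Return (months at risk, event indicator) for one unemployment spell.

    Without `cap`, the spell ends at the observed exit (event = 1) or is
    censored at STUDY_END (event = 0). With `cap`, any spell longer than
    `cap` months is artificially censored at `cap`.
    """
    if spell.exit_month is not None:
        months, event = spell.exit_month - spell.start_month + 1, 1
    else:
        months, event = STUDY_END - spell.start_month + 1, 0
    if cap is not None and months > cap:
        months, event = cap, 0
    return months, event


# Three hypothetical individuals (invented for illustration only).
spells = [
    Spell(month_index(1995, 1), month_index(1995, 3)),   # no college, exits quickly
    Spell(month_index(1999, 6), month_index(2001, 12)),  # college, exits after 31 months
    Spell(month_index(1995, 1)),                         # never observed to exit
]

full = [duration_and_event(s) for s in spells]
capped = [duration_and_event(s, cap=24) for s in spells]

print(full)    # [(3, 1), (31, 1), (144, 0)]
print(capped)  # [(3, 1), (24, 0), (24, 0)]
```

With all follow-up used, the second person contributes an observed exit at month 31; capped at 24 months, that exit becomes a censored observation, which is the loss of information mentioned above. In Stata, the two resulting columns would play the roles of the time variable and the failure indicator given to -stset-.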


On Aug 5, 2011, at 3:29 PM, Melaku Fekadu wrote:

Steven,


Thanks for taking the time to answer me. Here are some details
about my data.


• What kind of study generated the data? A prospective cohort? A
cross-section with retrospective recall?


It is administrative panel data.


• Was the study a complex sample, so that there are weights and
clusters (PSUs)?


It has no weights.


• What is the purpose of YOUR analysis?


I am analyzing determinants of reemployment of unemployed individuals.


• What was the larger data set, if any, from which you took your
specific data?  What criteria did you use for inclusion?


I have data on one age cohort: those who were born in 1977 and were
unemployed in 1995 (at age 18). I have their monthly employment data
from 1995 to 2006, that is, from age 18 to 29. Some were studying in
college during these years and some were not. For those who did not go
to college, I use employment data from January 1995 for the analysis.
For those who went to college, I use their employment data from the
month of their college completion. The date of college completion
differs across individuals, so the date of entry into the analysis
differs as well. The length of the observation period therefore also
differs: some individuals are observed for longer and some for less.
Remember that my data are restricted to 1995 to 2006. To overcome this
problem I decided to censor everyone at a given length of observation,
say 24 months, because those who went to college are "too young" to
experience entry to employment compared to those who went to look for
work directly.


• What is month "1"?  A calendar month? A month of an interview?  The
first month of unemployment?


Month "1" is the first month of unemployment. Its calendar date can
differ from one individual to another.


• Did unemployment start before month "1" for everybody or some
people?  After month 1?


Month "1" is the start of unemployment for all. Some have just
finished school and some have just left an earlier job; all start
looking for work in month "1".


• For those who started before month "1", do you know how long they
had been unemployed?


No unemployment before month "1".

On Fri, Aug 5, 2011 at 12:22 AM, Steven Samuels <[email protected]> wrote:
> 
> -
> I am answering your second question about -hshaz-.  There are examples with two and three mass points at the end of the -help- file.  The mixture model for heterogeneity means that the unobserved log hazard is at one of those mass points, with the locations and probabilities of the points to be estimated.
> 
> 
> 
> On your earlier question:
> 
> I don't see a good reason for censoring individuals at 12 months because of problems in observing other individuals.  However, until you describe your data more fully, I really can't say.
> 
> 
> • What kind of study generated the data? A prospective cohort? A cross-section with retrospective recall?
> 
> • Was the study a complex sample, so that there are weights and clusters (PSUs)?
> 
> • What is the purpose of YOUR analysis?
> 
> • What was the larger data set, if any, from which you took your specific data?  What criteria did you use for inclusion?
> 
> • What is month "1"?  A calendar month? A month of an interview?  The first month of unemployment?
> 
> • Did unemployment start before month "1" for everybody or some people?  After month 1?
> 
> • For those who started before month "1", do you know how long they had been unemployed?
> 
> • What do you mean people were "younger" to experience the event?  Did you mean "too young" to qualify as unemployed at the start?
> 
> • Why do you have information on some people for more than 12 months but not for others?  How did observation end?
> 
> • Do you have information on people who were employed but became unemployed during the study period (perhaps not in the data set you describe below)?
> 
> 
> In short, we need a complete description of the study design and of the beginning and ending of observation.
> 
> 
> 
> Dear statalisters,
> 
> I am doing a project on duration of unemployment. I want to compare models with and without unobserved heterogeneity. I want to use the -hshaz- module to estimate a mixture model, but I couldn't find an example of how to do that. I would appreciate any pointers to examples.
> 
> Thanks,
> Melaku
> 
> 
> On Aug 2, 2011, at 3:25 AM, [email protected] wrote:
> 
> Hello statalisters,
> 
> I analyze employment data using survival methods over a period of 12 months. I decided to do so because some of my observations are younger to experience the event (in this case, exiting unemployment) for more than 12 months; that is, I observe them only for 12 months. To overcome this problem I imposed a 12-month period of analysis on all of my observations, so that all observations have an equal length of 12 months in which to experience the event. I did so by artificially censoring those observations for whom I have data for more than 12 months and who did not experience the event within 12 months. These are the older individuals. I censored them even though I can see that some of them experienced the event later, after the 12-month period.
> 
> My questions:
> 1. Should I include in the analysis those observations that I censored?
> 2. Is the sample data presented below appropriate for survival analysis? Note that all observations experience the event except those I censored at month 12.
> 
> Below is a small representation of my data. The failure variable 'Failure' is cross-tabulated with the variable 'studytime', which is the number of months until experiencing the event.
> 
>            |   Failure
>  studytime |    0     1
> -----------+-----------
>          1 |    0   200
>          2 |    0    89
>          3 |    0    70
>          5 |    0    68
>          6 |    0    58
>          7 |    0    50
>          8 |    0    51
>         10 |    0    45
>         11 |    0    30
>         12 |  150     0
> 
> Thanks,
> Melaku
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 

© Copyright 1996–2018 StataCorp LLC