Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set
From
"Laura Rodwell" <[email protected]>
To
<[email protected]>
Subject
st: RE: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set
Date
Tue, 15 Mar 2011 10:19:44 +1100
Hi Kathleen,
What you are describing sounds like a gap, or interval truncation, in a survival dataset: a period during which a subject is not observed, which in your example is when they enter waged employment. To account for this in the multiple-record format you have presented, have the subject exit at the known failure time and re-enter at the next start time. So if your subject fails at Year = 1994 and re-enters self-employment in 1996, then 1996 would be the Year0 on the next row, so they are not being observed in 1995. You might need to record your dates in months, or at least half-yearly periods, to identify the gaps. Once your data are in this format you can stset with the appropriate start-time and time0() specifications.
Cleves describes this in Chapter 5 - recording survival data (Third edition, pg 38).
Another thing I noticed is that you currently have a failure recorded in the 1994-1995 period, where the person is coded 0 for self-employed. To fail, a subject needs to be at risk, so again this may be a matter of being more specific with your times and ensuring that your failure events fall within your times at risk.
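The restructuring described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested solution for the real data: it assumes the variables are named ID, Year0, Year, SelfEmploy and Failure as in the listing, that a failure is always recorded on the first not-self-employed row after a spell, and that yearly records are precise enough. The names fail, spell, spellstart and spellid are hypothetical helpers introduced for the example.

```stata
* Sketch only: variable names follow the listing; fail, spell,
* spellstart and spellid are hypothetical helper variables.
sort ID Year0

* Move each failure flag back onto the last self-employed record,
* so that the failure happens while the person is at risk
by ID: generate byte fail = SelfEmploy == 1 & SelfEmploy[_n+1] == 0 & Failure[_n+1] == 1

* Drop the records where the person is not at risk; this creates the gaps
keep if SelfEmploy == 1

* Start a new spell whenever a record does not continue the previous one
by ID: generate spell = sum(_n == 1 | Year0 != Year[_n-1])

* Record the start of each spell and give each spell its own identifier
bysort ID spell (Year0): generate spellstart = Year0[1]
egen spellid = group(ID spell)

* Analysis time restarts at 0 at the beginning of every spell
stset Year, id(spellid) time0(Year0) origin(time spellstart) failure(fail)
```

For the example person this should give a first spell running from _t0 = 0 to _t = 4 with a failure at 4, and a second spell running from 0 to 3 with a failure at 3. Because spellid treats each spell as a separate subject, any model fitted afterwards would probably want to allow for repeated spells from the same person, for example with vce(cluster ID).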
I think you mentioned it already, but I always find this FAQ useful when setting up this type of survival data, particularly for the actual specification of the stset command:
http://www.stata.com/support/faqs/stat/stmfail.html
Regards,
Laura.
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Kathleen Bui
Sent: Monday, 14 March 2011 2:31 AM
To: [email protected]
Subject: st: Stset-ing Multiple Failure/Multiple Spell Data : Moving in and out of risk set
My question is how to stset a multiple failure data set when an individual can
move in and out of the risk set.
I have read Cleves’s An Introduction to Survival Analysis Using Stata, Cleves’s
STB-49, and all previous posts concerning st-setting multiple failures. Others
have asked similar questions as mine, but I have yet to find a solution that
works.
I am analyzing the duration of an individual’s stay in Self-Employment. Failure
will be exit from self-employment. My question is how can I stset the data so
that Stata recognizes that an individual can move into and out of the risk set
(which is being Self-Employed).
To be more explicit, for each individual in my data set, I have information as
to whether or not they are Self-Employed. The issue arises when an individual
has a self employment history as follows:
The individual is self-employed and therefore at risk of failure. Then they
fail (leave self employment) and enter waged employment. By entering waged
employment, they are no longer at risk of failing, since they are no longer
Self-Employed. However, after a period of time, they once again become Self
Employed (thus re-enter the risk set) and fail once again (their second
failure).
As a result, multiple failures are possible as individuals are moving in and out
of different employment states. However, although I understand that Stata can
recognize multiple failures, I am unsure of how stset can be used to recognize
the multiple spells of Self-Employment, particularly the period of time between
spells when the individual is no longer at risk.
Specifically, I am unable to set the analysis time back to 0 for when the
individual begins a second period at risk after being not at risk.
For example, one individual in my data set of multiple individuals can look
like:
     +------------------------------------------+
     | ID   Year0   Year   SelfEmploy   Failure |
     |------------------------------------------|
  1. |  1    1989   1990            0         0 |
  2. |  1    1990   1991            1         0 |
  3. |  1    1991   1992            1         0 |
  4. |  1    1992   1993            1         0 |
  5. |  1    1993   1994            1         0 |
  6. |  1    1994   1995            0         1 |
  7. |  1    1995   1996            0         0 |
  8. |  1    1996   1997            1         0 |
  9. |  1    1997   1998            1         0 |
 10. |  1    1998   1999            1         0 |
 11. |  1    1999   2000            0         1 |
     +------------------------------------------+
where “SelfEmploy” is the indicator variable denoting whether or not the
individual is self-employed, “Failed” is an indicator variable denoting whether the
individual has left self-employment, and Year0 and Year are the corresponding
beginning and end of each time period.
So between 1990 and 1994, the individual is at risk of failing, and fails
between 1994 and 1995. But between 1995 and 1996, they are no longer at risk of
failing (say they are employed in the waged sector). Then they re-enter self-
employment in 1996 and experience another failure in 1999-2000.
Is there a command in stset that allows Stata to “ignore” the periods when they
are no longer at risk?
For example, when I stset my data as follows:
stset year, origin(SelfEmploy==1) failure(Failed) time0(Year0) id(PersonID) exit(time .)
the period when they are no longer at risk of failing is treated as if they are
in self-employment, as the output I receive is:
     +---------------------------------------------------------------+
     | ID   Year0   Year   SelfEmploy   Failure   _s   _d   _t0   _t |
     |---------------------------------------------------------------|
  1. |  1    1989   1990            0         0    0    0     .    . |
  2. |  1    1990   1991            1         0    0    0     .    . |
  3. |  1    1991   1992            1         0    1    0     0    1 |
  4. |  1    1992   1993            1         0    1    0     1    2 |
  5. |  1    1993   1994            1         0    1    0     2    3 |
  6. |  1    1994   1995            0         1    1    1     3    4 |
  7. |  1    1995   1996            0         0    1    0     4    5 |
  8. |  1    1996   1997            1         0    1    0     5    6 |
  9. |  1    1997   1998            1         0    1    0     6    7 |
 10. |  1    1998   1999            1         0    1    0     7    8 |
 11. |  1    1999   2000            0         1    1    1     8    9 |
     +---------------------------------------------------------------+
Stata seems to count the period from 1995-1996 as a time when the individual is
at risk of failing, when he is not.
Therefore, I am unsure how to stset the data so that from 1995-1996, Stata
recognizes that the individual is no longer at risk of failing, and so that my
analysis time is “reset” to 0 when the individual begins a second period
at risk after being not at risk.
Any suggestions?
Any help would be appreciated!
Thanks!!
Kathleen Bui
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/