Dear Stata-experts:
I am a newbie at survival analysis, and I would appreciate
your help with properly setting up the dataset for the analysis
with Stata 7.
I have annual data for 15 years from a panel household survey
on living arrangements of elderly households.
My original data come in the form of an _unbalanced_ dataset,
where the data are organized by ID and year (iis ID, tis year
in xt-language).
The data set covers the period 1984 through 1998 (15 years in
total), but elderly households are in my sample only if the head
of the household is older than 65. If they die or drop out of the
survey, I have no records for them after the death or drop-out.
Here is an example of what some records look like (ID 20302 is
particularly interesting, I think), where:
ID denotes elderly household id
year denotes year
indep denotes an indicator variable taking the value 1 if the elderly live
independently in that year, and 0 otherwise.
age denotes age of the head of the household.
ID year indep age
201 91 1 65
201 92 1 66
201 93 1 67
201 94 1 68
201 95 1 69
201 96 1 70
201 97 1 71
201 98 1 72
1101 84 1 78
1101 85 1 79
1101 86 1 80
1101 87 1 81
1101 88 1 82
1101 89 1 83
1101 90 1 84
1101 91 1 85
1101 92 1 86
1101 93 1 87
1101 94 1 88
1101 95 1 89
1101 96 1 90
1101 97 1 91
1101 98 1 92
3401 84 1 73
3401 85 1 74
3401 86 1 75
3401 87 1 76
3401 88 1 77
3401 90 0 79
3401 91 0 80
3401 92 0 81
20302 87 1 65
20302 88 1 66
20302 89 1 67
20302 90 0 68
20302 91 0 69
20302 94 0 72
20302 95 0 73
20302 96 0 74
20302 97 1 75
20302 98 1 76
53501 84 1 84
53501 85 1 85
53501 86 1 86
53501 87 1 87
53501 88 1 88
53501 89 1 89
53501 90 0 90
53501 91 0 91
53501 92 0 92
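In the records above, IDs 3401 and 20302 skip survey years (89 for
3401; 92 and 93 for 20302). A minimal sketch for flagging such gaps
in the original snapshot data follows. The variable name gap is
hypothetical, and the code assumes one record per ID and year.

```stata
* Sketch only: run on the snapshot data, before any snapspan/stset step.
* gap is a hypothetical name, not a variable in the survey.
sort ID year
* gap == 1 when one or more survey years are missing before this record;
* it is left missing for the first record of each household.
by ID: gen byte gap = (year - year[_n-1] > 1) if _n > 1
list ID year indep age if gap == 1
```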
For the moment, I am interested in studying the probability
that an elderly household moves out of independence at time
T, conditional on having been independent up to then.
So, trying to follow Stata terminology, I believe I have
multiple record data with delayed entry and right censoring.
I also may have gaps in between spells of independent living.
For now I think I would like to have "origin" and "entry" both
coincide with the case in which I first observe the elderly living
independently (indep==1), "failure" to be the case in which I
observe the elderly move out of independence (first time I observe
them with indep==0, after having observed indep==1).
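To make these definitions concrete, here is a minimal sketch that
computes, for each household, the first year observed with indep==1
and the first later year observed with indep==0. The names
first_indep and first_fail are hypothetical. The stset call at the
end is one possible way to use them, under the assumption that the
data are still in snapshot form; it is not a verified solution.

```stata
* Sketch only: run on the snapshot data (one record per ID-year).
* first_indep and first_fail are hypothetical variable names.
* First year in which the household is observed living independently:
egen first_indep = min(cond(indep==1, year, .)), by(ID)
* First year with indep==0 after that (missing if it never happens):
egen first_fail = min(cond(indep==0 & year > first_indep, year, .)), by(ID)
list ID year indep first_indep first_fail if ID == 20302
* One possible stset call using these years as time expressions (assumption):
stset year, id(ID) failure(indep==0) origin(time first_indep) enter(time first_indep) exit(time first_fail)
```

For ID 20302 in the example records, this should give first_indep
equal to 87 and first_fail equal to 90, so the later return to
independence in 97 would fall after the exit time.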
My question is: how do I set up the st data set?
I understand that I first need to convert the snapshot data
into time-span data. I take care of this step by doing:
" snapspan ID year `listvar', replace "
where listvar contains the list of variables I want to keep.
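As far as the snapspan documentation goes, the variables listed
after the time variable are treated as event variables (measured at
the instant given by year), while every other variable in the data
is treated as a snapshot covariate and carried forward to the next
record. The sketch below assumes indep is the only event variable
and uses a hypothetical name, year0, for the start of each span.

```stata
* Sketch only: indep as the sole event variable is an assumption.
* year0 is a hypothetical name for the generated span-start variable.
snapspan ID year indep, generate(year0) replace
* Inspect one household to see which variables were shifted:
list ID year0 year indep age if ID == 20302
```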
Next, I believe I need to do the stset step, but I cannot figure
out how to do it right.
The best I have come up with so far is:
" stset year, id(ID) failure(indep==0) origin(indep==1) entry(indep==1) exit(time .) "
but I do not think it is correct. In case it matters, I get:
                id:  ID
     failure event:  indep == 0
obs. time interval:  (year[_n-1], year]
 enter on or after:  indep==1
 exit on or before:  time .
    t for analysis:  (time-origin)
            origin:  indep==1
------------------------------------------------------------------------------
    12617  total obs.
     1702  ignored because never entered
     1841  obs. end on or before enter()
------------------------------------------------------------------------------
     9074  obs. remaining, representing
     1285  subjects
      285  failures in multiple failure-per-subject data
     9237  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        14
and then listing records for the above IDs:
ID indep _st _t0 _t _d
201 1 0 . . .
201 1 1 0 1 0
201 1 1 1 2 0
201 1 1 2 3 0
201 1 1 3 4 0
201 1 1 4 5 0
201 1 1 5 6 0
201 1 1 6 7 0
1101 1 0 . . .
1101 1 1 0 1 0
1101 1 1 1 2 0
1101 1 1 2 3 0
1101 1 1 3 4 0
1101 1 1 4 5 0
1101 1 1 5 6 0
1101 1 1 6 7 0
1101 1 1 7 8 0
1101 1 1 8 9 0
1101 1 1 9 10 0
1101 1 1 10 11 0
1101 1 1 11 12 0
1101 1 1 12 13 0
1101 1 1 13 14 0
3401 1 0 . . .
3401 1 1 0 1 0
3401 1 1 1 2 0
3401 1 1 2 3 0
3401 1 1 3 4 0
3401 0 1 4 6 1
3401 0 1 6 7 1
3401 0 1 7 8 1
20302 1 0 . . .
20302 1 1 0 1 0
20302 1 1 1 2 0
20302 0 1 2 3 1
20302 0 1 3 4 1
20302 0 1 4 7 1
20302 0 1 7 8 1
20302 0 1 8 9 1
20302 1 1 9 10 0
20302 1 1 10 11 0
53501 1 0 . . .
53501 1 1 0 1 0
53501 1 1 1 2 0
53501 1 1 2 3 0
53501 1 1 3 4 0
53501 1 1 4 5 0
53501 0 1 5 6 1
53501 0 1 6 7 1
53501 0 1 7 8 1
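In this listing, _d equals 1 on every record with indep==0 (three
records for ID 3401, five for ID 20302), which appears to be why
stset reports multiple failures per subject. A minimal sketch for
counting failures per household after stset follows; nfail and tag
are hypothetical names.

```stata
* Sketch only: run after stset. nfail and tag are hypothetical names.
sort ID year
* Number of records flagged as failures for each household:
egen nfail = sum(_d), by(ID)
* Tag one record per household so each subject is counted once:
by ID: gen byte tag = (_n == 1)
tabulate nfail if tag
* Summary of the st data as currently declared:
stdes
```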
--------------------------------------------------------------------------
Please let me know if you need further information.
Thank you very much in advance for any help you can give me.
Enrica Croda