Rachael,
I am not entirely sure I understand what you mean by episode, and how
you want to integrate episodes into your data. Is an observation equal
to an episode? Episodes usually have start and end dates, but from
what I read from your description, your data just have one date per
observation.
Take the following example, where I made up some dates:
. list id date, sepby(id)
     +----------------+
     | id        date |
     |----------------|
  1. |  1   01may1990 |
  2. |  1   10jun1990 |
  3. |  1   23jul1990 |
     |----------------|
  4. |  2   10may1990 |
  5. |  2   12may1990 |
  6. |  2   18jul1990 |
  7. |  2   21aug1990 |
     |----------------|
  8. |  3   30apr1990 |
  9. |  3   01sep1990 |
 10. |  3   02sep1990 |
 11. |  3   15oct1990 |
     +----------------+
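In case you want to run the example yourself, something like the
following should recreate these data. It is only a sketch: sdate is a
temporary helper name of my own, and the "DMY" mask assumes a
reasonably recent version of Stata.
************************************************
clear
input id str9 sdate
1 "01may1990"
1 "10jun1990"
1 "23jul1990"
2 "10may1990"
2 "12may1990"
2 "18jul1990"
2 "21aug1990"
3 "30apr1990"
3 "01sep1990"
3 "02sep1990"
3 "15oct1990"
end
// convert the strings to Stata daily dates
gen date = date(sdate, "DMY")
format date %td
drop sdate
************************************************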
id identifies the individual. As I understand your email, you want a
variable called episode that marks the record at which each episode
begins. The earliest record for each individual starts the first
episode, and each later episode starts at the first record dated on or
after the previous episode's start date + X.
Here is some code to do this. I chose X = 25 days in my example. I'm
sure you could compress the code further somehow.
************************************************
gen episode = .
// the earliest entry is the first episode
bysort id (date): replace episode = 1 if _n==1
bysort id (date) : gen workdate = date[1] + 25
format workdate %d
// workdate is the earliest start date for the next episodes
// I want to get up to 4 episodes:
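forvalues x = 2/4 {
    // flag every unlabelled record on or after workdate as a candidate
    bysort id (date): replace episode = `x' if date >= workdate & episode == .
    // keep only the earliest candidate as the start of episode x
    bysort id episode (date): replace episode = . if episode == `x' & _n > 1
    // if episode x exists, it is now the x-th record within id:
    // move workdate to its date + 25 (otherwise leave workdate alone)
    bysort id (episode date): replace workdate = date[`x'] + 25 if episode[`x'] == `x'
}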
sort id date
list id date episode, sepby(id)
************************************************
Having the workdate variable is not strictly necessary, but it can
help in the process; it makes it easier to check if you are doing the
right thing.
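One quick check of that kind, as a sketch: once the code above has
run, the starts of consecutive episodes within an individual should be
at least X = 25 days apart. The assert below stops with an error if
they are not. It tests only that one condition, so it does not prove
that the labelling is right.
************************************************
preserve
keep if episode < .
// consecutive episode starts must be at least 25 days apart
bysort id (date): assert date - date[_n-1] >= 25 if _n > 1
restore
************************************************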
This is the result:
. list id date episode, sepby(id)
     +--------------------------+
     | id        date   episode |
     |--------------------------|
  1. |  1   01may1990         1 |
  2. |  1   10jun1990         2 |
  3. |  1   23jul1990         3 |
     |--------------------------|
  4. |  2   10may1990         1 |
  5. |  2   12may1990         . |
  6. |  2   18jul1990         2 |
  7. |  2   21aug1990         3 |
     |--------------------------|
  8. |  3   30apr1990         1 |
  9. |  3   01sep1990         2 |
 10. |  3   02sep1990         . |
 11. |  3   15oct1990         3 |
     +--------------------------+
Note that episode is missing for any observation that falls less than
X days after the start of the most recent episode. It is also missing
for anything after the fourth episode, because the loop stops at 4.
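On compressing the code: here is a sketch of a version without the
loop, and so without the limit of four episodes. It relies on replace
working through the observations in order, so that start[_n-1] already
holds the updated value from the record before. The names start,
newstart and episode2 are my own, and I am assuming there are no
missing dates. On the example data it gives the same labelling as
episode.
************************************************
local X = 25
// start = start date of the episode each record falls into
bysort id (date): gen start = date if _n == 1
bysort id (date): replace start = cond(date >= start[_n-1] + `X', date, start[_n-1]) if _n > 1
format start %td
// a record begins an episode if its start differs from the previous record's
bysort id (date): gen byte newstart = (_n == 1) | (start != start[_n-1])
// running count of episode starts within id
bysort id (date): gen episode2 = sum(newstart)
// blank out the records that do not begin an episode
replace episode2 = . if !newstart
list id date start episode2, sepby(id)
************************************************
If you would rather have every record carry the number of the episode
it belongs to, leave out the last replace.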
If this is not what you want, you need to be much more precise about
your data, and what you want to achieve.
Hope this helps,
Eva
2008/9/22 Williams, Rachael <[email protected]>:
> Dear all,
>
> I have a dataset with multiple records per person recorded on different
> dates.
>
> I would like to generate episodes for each person.
> By this I mean that, for each person, the start date of the first
> episode will be defined as the date of the earliest record.
> The next episode for a person will be taken as the first record
> occurring after X days from the date of the previous episode.
>
> At the moment I am doing this in a very long winded fashion!
> If anybody has a neat bit of code that might do the job, I would very
> much appreciate it.
>
> If anything is not clear, apologies - please do ask.
> Rachael
>