Jannik Helweg-Larsen
>
> Using Stata 7SE:
>
> Spent most of the weekend trying to get my data in the right format
> for a stcox regression analysis with time-varying variables.
>
> Tried expand, stsplit and recode without getting it right:
>
> I am looking at the possible impact of changing antibiotic treatment
> during an episode of infection.
>
> Here is a constructed example:
>
> Most patients in the dataset are treated with a single antibiotic
> regimen without changes, but others are changed to second-line
> antibiotic treatment during the treatment course as a consequence of
> either toxicity or treatment failure.
>
> My data are recorded like this
>
> id  treat       tx1        tx1end     treat2       tx2        tx2end
> 1   pencillin   27oct1990  17nov1990
> 2   cefuroxime  12jun1993  05jul1993  vancomycine  05jul1993  11jul1993
> 3   pencillin   20feb1990  08mar1990
> 4   pencillin   31oct1991  02nov1991  cefuroxime   02nov1991  23nov1991
> 5   pencillin   01jun1989  12jun1989
> 6   pencillin   08dec1992  28dec1992
> 7   pencillin   15nov1994  06dec1994  cefuroxime   06dec1994  16dec1994
> 8   pencillin   12oct1989  23oct1989
>
> I would somehow like to recode my data to get multiple records for
> the patients who are switched, with a split at the date of treatment
> change, like this:
>
> id  treat        date0      date1
> 1   pencillin    27oct1990  17nov1990
> 2   cefuroxime   12jun1993  05jul1993
> 2   vancomycine  05jul1993  11jul1993
>
In a word, -reshape-.
As so often happens, you may need to -rename- some variables first.
. l
     id  treat       tx1        tx1end     treat2       tx2        tx2end
 1.  1   pencillin   27oct1990  17nov1990
 2.  2   cefuroxime  12jun1993  05jul1993  vancomycine  05jul1993  11jul1993
 3.  3   pencillin   20feb1990  08mar1990
 4.  4   pencillin   31oct1991  02nov1991  cefuroxime   02nov1991  23nov1991
 5.  5   pencillin   01jun1989  12jun1989
 6.  6   pencillin   08dec1992  28dec1992
 7.  7   pencillin   15nov1994  06dec1994  cefuroxime   06dec1994  16dec1994
 8.  8   pencillin   12oct1989  23oct1989
. rename tx1end end1
. rename tx2end end2
. rename treat treat1
. reshape long tx treat end , i(id)
(note: j = 1 2)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        8   ->   16
Number of variables                   7   ->   5
j variable (2 values)                     ->   _j
xij variables:
                                tx1 tx2   ->   tx
                          treat1 treat2   ->   treat
                              end1 end2   ->   end
-----------------------------------------------------------------------------
. l
      id  _j  treat       tx         end
  1.  1   1   pencillin   27oct1990  17nov1990
  2.  1   2
  3.  2   1   cefuroxime  12jun1993  05jul1993
 < snip >
 14.  7   2   cefuroxime  06dec1994  16dec1994
 15.  8   1   pencillin   12oct1989  23oct1989
 16.  8   2
. drop if treat == ""
(5 observations deleted)
. l
      id  _j  treat        tx         end
  1.  1   1   pencillin    27oct1990  17nov1990
  2.  2   1   cefuroxime   12jun1993  05jul1993
  3.  2   2   vancomycine  05jul1993  11jul1993
  4.  3   1   pencillin    20feb1990  08mar1990
  5.  4   1   pencillin    31oct1991  02nov1991
  6.  4   2   cefuroxime   02nov1991  23nov1991
  7.  5   1   pencillin    01jun1989  12jun1989
  8.  6   1   pencillin    08dec1992  28dec1992
  9.  7   1   pencillin    15nov1994  06dec1994
 10.  7   2   cefuroxime   06dec1994  16dec1994
 11.  8   1   pencillin    12oct1989  23oct1989
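To get from this listing to the layout asked for in the question, a little tidying might follow. This is only a sketch: -date0- and -date1- are the names used in the question, and sorting on -_j- assumes that regimen 1 always precedes regimen 2 within a patient. If the dates were read as strings, as they were when copied from the mailing, they would still need converting to numeric Stata dates (see the -date()- function) before any -stset-.

```stata
* Sketch: run after the -reshape long- and -drop- steps above.
* date0 and date1 are the names from the question, not required names.
rename tx date0
rename end date1

* Keep each patient's records in regimen order, then drop the index.
sort id _j
drop _j

list id treat date0 date1
```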
In addition to the manual entry at [R] reshape, some further advice
is gathered at
http://www.stata.com/support/faqs/data/reshape3.html
I copied your example from your mailing, so Stata read -treat- as a
string variable. If it is numeric in your data, the drop would instead be
. drop if treat == .
Another way of renaming is to use -renvars- from STB-60.
This can be quicker for multiple renames.
Nick