Assume -id- identifies the subjects (in all datasets), -a.dta- contains your demographic data, and -b.dta- and -c.dta- are the marker1 and marker2 datasets, respectively. The variables -m1- and -m2- contain the measured values of markers 1 and 2, respectively. -t- is the time of measurement in both marker datasets, -time- is the time of the event or censoring (in a.dta), -fail- indicates failure vs. censoring, and x1, x2 ... are covariates. To summarize:
a.dta: id time fail x1 x2 ...
b.dta: id t m1
c.dta: id t m2
If you only want to use one marker, do the following:
. use b
. sort id t
. save bb, replace
. use a
. sort id
. joinby id using bb
. sort id t /* just to be sure */
. by id: replace time=t[_n+1] if _n!=_N
. by id: replace fail=0 if _n!=_N
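To see what these steps produce, here is a small sketch on made-up data. The file name bb_toy and all values are hypothetical and only serve as illustration; like the rest of this post, it is meant as a sketch rather than a tested recipe.

```stata
* Toy marker data (hypothetical values): two subjects, two measurements each
clear
input id t m1
1 0 5.0
1 3 6.5
2 2 4.0
2 5 4.5
end
sort id t
save bb_toy, replace

* Toy demographic data: subject 1 fails at time 8, subject 2 is censored at 9
clear
input id time fail x1
1 8 1 0
2 9 0 1
end
sort id
joinby id using bb_toy
sort id t
* Each record ends where the next measurement begins;
* only the last record per subject keeps the original time and fail
by id: replace time = t[_n+1] if _n != _N
by id: replace fail = 0 if _n != _N
list id t time fail m1
* Expected records (id t time fail m1):
*   1 0 3 0 5
*   1 3 8 1 6.5
*   2 2 5 0 4
*   2 5 9 0 4.5
```

Note that subject 2 has no measurement before t=2, which is exactly the situation the choice between A and B below is about.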
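If the first measurement of the marker occurred after time zero, you can either assume that the marker was constant up to the first measurement (A) or let the subject enter the analysis only at the first measurement, i.e. treat the case as left censored (strictly speaking, delayed entry or left truncation) (B).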
A:
. stset time, id(id) failure(fail)
. stcox m1 x1 x2 ...
B:
. stset time, id(id) failure(fail) enter(time t)
. stcox m1 x1 x2
If you need to use both markers, things are a little more complicated. First combine b.dta and c.dta, fill in some gaps and joinby (assuming that there are no simultaneous measurements of marker1 and marker2; you'd have to modify the code if there are simultaneous measurements):
. use b
. append using c
. sort id t
. by id: replace m1=m1[_n-1] if m1==.
. by id: replace m2=m2[_n-1] if m2==.
. save bc, replace
. use a
. sort id
. joinby id using bc
. sort id t /* just to be sure */
. by id: replace time=t[_n+1] if _n!=_N
. by id: replace fail=0 if _n!=_N
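The append and fill-in step can be illustrated on made-up data for a single subject. The file names b_toy and c_toy and all values are hypothetical; this is only a sketch of what the carry-forward does, assuming no simultaneous measurements.

```stata
* Toy marker1 data (hypothetical values)
clear
input id t m1
1 0 5.0
1 4 6.0
end
save b_toy, replace

* Toy marker2 data (hypothetical values), measured at different times
clear
input id t m2
1 2 10
1 6 12
end
save c_toy, replace

use b_toy, clear
append using c_toy
sort id t
* Carry the last known value of each marker forward within subject
by id: replace m1 = m1[_n-1] if m1 == .
by id: replace m2 = m2[_n-1] if m2 == .
list id t m1 m2
* Expected records (id t m1 m2):
*   1 0 5 .
*   1 2 5 10
*   1 4 6 10
*   1 6 6 12
```

The remaining missing value of m2 at t=0 is the gap before the first measurement of marker2; A fills it backward, whereas under B the record drops out of the estimation because of the missing covariate.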
Then, again, the two possibilities A (constant) and B (left censored, i.e. delayed entry):
A:
. gsort id -t
. by id: replace m1=m1[_n-1] if m1==.
. by id: replace m2=m2[_n-1] if m2==.
. sort id t
. stset time, id(id) failure(fail)
. stcox m1 m2 x1 x2 ...
B:
. stset time, id(id) failure(fail) enter(time t)
. stcox m1 m2 x1 x2 ...
I hope this works. Didn't test it.
ben
> -----Original Message-----
> From: n p [mailto:[email protected]]
> Sent: Thursday, 31 July 2003 13:29
> To: [email protected]
> Subject: st: Survival with time varying covariates
>
>
> Hi Stata users,
>
> I need your help in the following problem. I want to
> perform a survival analysis (via Cox PH models) where
> some covariates (e.g. demographics) are constant over
> study time while others (Marker1 and Marker2) are
> periodically measured (usually after time zero and
> not simultaneously). Thus I have 3 datasets
>
> a) 1 record per subject, with time of event or
> censoring and other not time varying covariates
> b) n1 records per subject for repeated measurements of
> marker1 (time and value of measurement)
> c) n2 records per subject for repeated measurements of
> marker2 (time and value of measurement)
>
> The effect of both markers on the hazard is the main
> question in this study.
>
> I checked the manual's example "Stanford heart
> transplant data" and the "stsplit" command. I also
> checked "stegen" and "stcoxtvc" using "findit time
> varying covariates" but I am still puzzled with this.
> Is there an easy way to prepare the data at least for
> one of the two markers? How can I deal with the
> unknown markers' values between time zero and their
> first measurement?
>
> I am using Stata 7 on a 2.4 GHz 512 RAM WinXP machine.
>
> Thanks in advance for any help.
>
> Nikos Pantazis
> Biostatistician
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>