
st: multivariate Poisson regression and SIRs

From	"raoul reulen" <[email protected]>
To	[email protected]
Subject	st: multivariate Poisson regression and SIRs
Date	Thu, 28 Jun 2007 19:23:49 +0100

Dear Statalisters,


I am analyzing a cohort study of the risk of breast cancer after
surviving childhood cancer. The cohort includes 8000 female survivors
of childhood cancer, of whom 75 have subsequently developed breast
cancer. I have calculated standardized incidence ratios (SIRs) of breast
cancer by type of childhood cancer, treatment decade, follow-up time,
attained age, etc.


To do this I have stset the data, used stsplit to split them by calendar
period and age, and merged them with reference rates from the general
population. I have also split on attained age and risk interval (= time
since childhood cancer diagnosis) because these variables vary with time.
I then calculated the expected number of breast cancers by multiplying
the person-years in each stratum by the corresponding reference rate.


*** stset and split the data ***

. stset dox, fail(fail) origin(dob) entry(doe) scale(365.25) id(id)

. stsplit ageband, at(0 1 5(5)85)

. stsplit calender_period, after(time=d(1/1/1900)) at(0 71(1)106)

. replace calender_period = calender_period + 1900


*** merge with external reference rates ***

. sort ageband calender_period

. merge ageband calender_period using rates/tmp_rates.dta

. drop if _merge!=3
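
As a rough check on the merge step, it may be worth seeing how many records (and in particular how many breast cancers) fail to match a reference rate before they are dropped. This is only a sketch: it assumes the old-style merge syntax used above and would have to be run before the drop, not after it.

```stata
* Sketch only: run these lines BEFORE "drop if _merge!=3".
* With the old-style merge syntax, _merge==1 marks cohort records with no
* matching reference rate, and _merge==2 marks rates with no cohort record.
tabulate _merge

* Breast cancers that would be lost because no reference rate matched
count if _merge==1 & _d==1
```

If the count is not zero, observed cases are being discarded and the SIRs would be biased downwards.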


*** split on attained age and time since childhood cancer diagnosis ***

. stsplit age, at(0,20,30,40,50) // attained age

. stsplit riskint, after(time=doe) at(5,10,15,20,25,30,35)



*** calculate expected counts and person-years ***

. gen pyrs = _t - _t0

. gen E = pyrs*rate
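
The overall SIR is simply total observed over total expected, which gives a quick sanity check before any collapsing. A minimal sketch follows, assuming _d and E exist as created above. The scalar names O_tot and E_tot are made up for illustration, and the comment about the units of rate is an assumption about the reference file, not something I have checked here.

```stata
* Sketch only: overall SIR = total observed / total expected.
* Assumption: rate is expressed per person-year. If the reference file
* stores rates per 100,000 person-years, E must first be divided by 100000.
quietly summarize _d, meanonly
scalar O_tot = r(sum)
quietly summarize E, meanonly
scalar E_tot = r(sum)
display "Observed = " scalar(O_tot) ", Expected = " %9.2f scalar(E_tot) ", SIR = " %6.2f scalar(O_tot)/scalar(E_tot)
```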



To calculate the SIRs by childhood cancer diagnosis I collapsed the
data by diagnosis. The variable diag includes 10 different
categories (leukaemia, Hodgkin, Non-Hodgkin etc.).

I used Poisson regression to calculate the Incidence Rate Ratio, which
is essentially a ratio of SIRs: the SIR of the group of interest divided
by the SIR of the reference group (leukaemia).



. collapse (sum) _d E pyrs, by(diag)

. xi:poisson _d i.diag  if E!=0, exposure(E)  irr



Now from the Incidence Rate Ratio I would like to calculate the SIR
for each group, so I used:



. predict coef, xb nooffset

. gen SIR = exp(coef)

. bysort diag: sum SIR
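
In this one-factor model the linear predictor without the offset is log(SIR), so the model-based SIR in each group should reproduce observed/expected exactly. The sketch below checks that and also gets a confidence interval for one group. It is meant to be run straight after the poisson fit and the predict step above. SIR_direct is a name made up for illustration, and _Idiag_2 is the dummy that xi would create if diag is coded 1, 2, ..., 10, which is an assumption.

```stata
* Sketch only: compare the model-based SIR with observed/expected.
gen SIR_direct = _d/E if E!=0
list diag _d E SIR SIR_direct, noobs

* SIR with 95% CI for one non-reference group:
* exp(_cons + coefficient of that group's dummy).
* _Idiag_2 assumes diag is coded 1, 2, ..., 10 (assumption).
lincom _cons + _Idiag_2, irr
```

The SIR for the reference group (leukaemia) is exp(_cons) in this parameterisation.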



I did this for all the variables I am interested in. I know there are
easier ways to calculate the SIR, but I have used this approach
because I would also like to calculate SIRs in a multivariate Poisson
model, and I thought that this would be the best approach.



So next I used a multivariate approach. I collapsed the data on all
variables that I am interested in.



. collapse (sum) _d E pyrs, by(diag trtagegp rt ct trt_dec riskint age)

. xi: poisson _d i.diag i.trtagegp i.rt i.ct i.trt_dec i.riskint i.age if E!=0, exposure(E) irr



_d = observed number of breast cancers

diag = diagnostic group (type of childhood cancer)

trtagegp = age at start of childhood cancer treatment (0-4, 5-9, 10-14)

rt = treatment with radiotherapy (yes/no)

ct = treatment with chemotherapy (yes/no)

trt_dec = decade of initial treatment (1970-1979, 1980-1989, >=1990)

riskint = time since childhood cancer diagnosis (years)

age = attained age


Now my question is:


The output of the model gives me Incidence Rate Ratios and uses the
first category of each variable as the reference category. How do I
get (adjusted) SIRs? I guess I could use the same approach as above
and use:


. predict coef, xb nooffset
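
As far as I can tell, exponentiating this after the multivariate fit gives one model-predicted SIR per covariate pattern (that is, per row of the collapsed data) rather than one per diagnostic group. A small sketch of what I mean follows. SIR_pattern and fitted are names made up for illustration.

```stata
* Sketch only: predicted SIR for each covariate pattern in the collapsed data.
gen SIR_pattern = exp(coef)

* Model-fitted number of breast cancers per pattern, to compare with _d.
gen fitted = E*SIR_pattern

* Look at the first few patterns.
list diag rt ct _d E fitted SIR_pattern in 1/10, noobs
```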


But what do I do next?

Thanks,

Raoul