Hi,
I am using stcascoh.ado to model a case-cohort dataset. To see how it works and whether everything runs properly, I tried to reproduce the example of Barlow (Barlow WE, Ichikawa L, Rosner D, and Izumi S: Analysis of Case-Cohort Designs; Journal of Clinical Epidemiology 1999; 52: 1165-1172) from the Breslow and Day nickel dataset.
The dataset provided on the net (http://lib.stat.cmu.edu/general/robphreg) and used by Barlow is the full dataset; however, their results are based on a specific subcohort sample. - stcascoh - samples its own subcohort to prepare the dataset for analysis. Thus I expect this dataset always to be slightly different from Barlow's, but the results should be quite similar.
My problem is that I cannot reproduce the results in Stata (I tried it in SAS and it worked).
On examining the dataset after - stcascoh -, I had the impression that the weights were wrongly assigned in lines 179 and 184 of the program, but I am not very familiar with the programming language. However, the unweighted results do not work out properly either. I would be very grateful for a hint.
Here is what I do in Stata 7:
/***********************************************
* *
* INPUT THE DATASET AS PROVIDED ON THE NET *
* *
***********************************************/
input id icdcode exposure dob age_emp age_strt age_stop subco20
3 0 5 1889.0192 17.4808 45.2273 92.9808 0
4 162 5 1885.978 23.1864 48.2684 63.2712 0
...
0 0 0 1895.5 27.6753 38.7465 39.7219 0
end
/*************************************************
* *
* LABELING, CODING AND RECODING AS PROVIDED *
* IN THE SAS PROGRAM ON THE NET *
* *
*************************************************/
replace id=990 if id==0 & icdcode==490
replace id=991 if id==0 & icdcode==177
replace id=992 if id==0 & icdcode==430
replace id=993 if id==0 & icdcode==0
lab var id "identification number"
lab var icdcode "160=death due to nasal sinus cancer"
lab var exposure "exposure level"
lab var dob "Date of birth"
lab var age_emp "Age at first employment"
lab var age_strt "Age at start of study"
lab var age_stop "Age at death or end of study, whichever is earlier"
lab var subco20 "1=included in 20% subcohort, 0=not included"
gen yfe10=( dob+age_emp-1915)/10
gen yfe100=(( dob+age_emp-1915)^2)/100
gen logafe=log(age_emp-10)
gen logexp=log(exposure+1)
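The recoding step can be sanity-checked before any model is fitted. The following is only a minimal sketch: it uses just the three data rows shown above (not the full dataset), and it assumes that year of first employment is dob + age_emp and that both year terms are meant to be centred on 1915, so that yfe100 should equal yfe10 squared. The helper variable yfe is introduced here for illustration only.

```stata
* Toy check on the three data rows shown in the post (not the full dataset)
clear
input id icdcode exposure dob age_emp age_strt age_stop subco20
3 0 5 1889.0192 17.4808 45.2273 92.9808 0
4 162 5 1885.978 23.1864 48.2684 63.2712 0
0 0 0 1895.5 27.6753 38.7465 39.7219 0
end

* Assumed definition: year of first employment = birth date + age at first employment
gen double yfe = dob + age_emp
gen double yfe10 = (yfe - 1915)/10
gen double yfe100 = ((yfe - 1915)^2)/100

* If both terms are centred on 1915, yfe100 must equal yfe10 squared
assert reldif(yfe100, yfe10^2) < 1e-6

* log() returns missing for nonpositive arguments, so check for that as well
gen double logafe = log(age_emp - 10)
gen double logexp = log(exposure + 1)
assert !missing(logafe, logexp)

summarize yfe10 yfe100 logafe logexp
```

If either assert fails on the full dataset, the covariates differ from the assumed definitions, and the coefficients could not be expected to match the published ones.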
/***********************************************
* *
* PREPARING THE DATASET FOR ANALYSIS *
* USING TIME SINCE EMPLOYMENT AS TIME AXIS *
* AND THE 20% SAMPLE FRACTION *
* AS IN BARLOW *
* *
***********************************************/
stset age_stop, failure(icdcode==160) id(id) enter(time age_strt) exit(time age_stop) origin(time age_emp)
stcascoh, alpha(20)
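Before looking at the weights, it may help to confirm that the stset step really produces time since first employment as the analysis time. The following is a minimal sketch on the three data rows shown above only (not the full dataset). It relies on the standard variables that stset creates (_t0, _t, _d, _st) and writes the enter() and origin() options in their documented "time exp" form.

```stata
* Toy check on the three data rows shown in the post (not the full dataset)
clear
input id icdcode exposure dob age_emp age_strt age_stop subco20
3 0 5 1889.0192 17.4808 45.2273 92.9808 0
4 162 5 1885.978 23.1864 48.2684 63.2712 0
0 0 0 1895.5 27.6753 38.7465 39.7219 0
end

* Same recode as in the post for the record with id 0 and icdcode 0
replace id = 993 if id == 0 & icdcode == 0

stset age_stop, failure(icdcode == 160) id(id) enter(time age_strt) origin(time age_emp)

* Analysis time should run from (age_strt - age_emp) to (age_stop - age_emp)
assert _st == 1
assert reldif(_t0, age_strt - age_emp) < 1e-6
assert reldif(_t, age_stop - age_emp) < 1e-6

* None of these three rows is a nasal sinus cancer death (icdcode 160)
count if _d == 1
assert r(N) == 0

stdescribe
```

On the full dataset, the number of failures reported by stset and stdescribe can then be compared with the number of nasal sinus cancer deaths used in the SAS run.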
/*******************************************************************
* *
* LASTLY I MODEL AS SUGGESTED IN THE HELP FILE *
* OF - STCASCOH - *
* *
* 1- Prentice: stcox varlist, robust *
* 2- Self and Prentice: stcox varlist, offset(_wSelPre) robust *
* 3- Barlow: stcox varlist, offset(_wBarlow) robust *
* *
*******************************************************************/
stcox yfe10 yfe100 logafe logexp, robust nohr
stcox yfe10 yfe100 logafe logexp, robust offset(_wSelPre) nohr
stcox yfe10 yfe100 logafe logexp, robust offset(_wBarlow) nohr
What is wrong?
Thanks a lot.
Veit Grote
----------------------------------------
Dr. Veit Grote, MSc
Klinikum der Universität München
Dr. v. Haunersches Kinderspital
Lindwurmstr. 4
D-80337 München
Tel.: +49 (89) 5160-7798
Fax: +49 (89) 5160-2951
[email protected]
----------------------------------------