Hi Chelsea,
The strategy will differ depending on what the purpose of the survival
analysis is.
Broadly speaking, there are two different reasons for modelling data:
(1) You are interested in the effect of a particular variable and you
want to estimate this by accounting for the possible confounding effects
of other variables, or
(2) you are interested in finding a parsimonious set of predictor
variables.
In (1), statistical significance plays no role in selecting confounders.
Confounding is a bias. A variable either confounds your exposure
estimate or it doesn't.
In (2), the fit of the model is important, so the significance of the
variables you choose will be important.
It sounds like you have an exposure that you're interested in, so I
would suggest that you select variables based on their confounding
effect. The strategy is called the "change-in-estimate" method. Have a
look at:
Greenland S (1989). Modeling and variable selection in epidemiologic
analysis. Am J Public Health 79(3): 340-9.
The paper outlines the strategy, and Zhiqiang Wang has written an ado
file, -epiconf-, which you may find useful.
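To make the idea concrete, here is a minimal sketch of the
change-in-estimate check, written in Python only to show the arithmetic.
It assumes you have already fitted two Cox models, one with the exposure
alone and one with the exposure plus the candidate variable, and have
the exposure hazard ratio from each. The function names, the example
hazard ratios and the 10% cut-off are my own illustrative assumptions
(10% is a commonly used convention, not a rule), and this is not how
-epiconf- is implemented.

```python
import math


def change_in_estimate(hr_crude, hr_adjusted):
    """Relative change in the exposure hazard ratio after adjustment.

    Computed on the log scale so that a change from 2.0 to 1.6 is
    treated the same as a change from 1.6 to 2.0.
    """
    log_diff = abs(math.log(hr_adjusted) - math.log(hr_crude))
    return math.exp(log_diff) - 1.0


def is_confounder(hr_crude, hr_adjusted, threshold=0.10):
    """Flag the candidate variable if adjusting for it shifts the
    exposure hazard ratio by at least `threshold` (assumed 10%)."""
    return change_in_estimate(hr_crude, hr_adjusted) >= threshold


# Hypothetical hazard ratios for the exposure, for illustration only.
hr_exposure_only = 2.0   # model with the exposure alone
hr_with_age = 1.6        # exposure + candidate variable (say, age)
hr_with_sex = 1.95       # exposure + a different candidate (say, sex)

keep_age = is_confounder(hr_exposure_only, hr_with_age)
keep_sex = is_confounder(hr_exposure_only, hr_with_sex)
```

Note that no p-value appears anywhere in this check: the decision
depends only on how much the exposure estimate moves.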
Also, a chi-square test of exposure against outcome will generally be of
no use because it does not account for time to event.
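A small made-up example shows why. In the sketch below (Python, with
invented follow-up data and hypothetical function names), both groups
have the same proportion of events, so the chi-square statistic is
zero, yet the events happen much earlier in one group and the event
rates per unit of follow-up time differ.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def event_rate(times, events):
    """Events per unit of follow-up time."""
    return sum(events) / sum(times)


# Invented data: 10 subjects per group, 5 events in each group.
# Exposed: events occur early (time 1); the rest are censored at time 10.
exposed_times = [1] * 5 + [10] * 5
exposed_events = [1] * 5 + [0] * 5
# Unexposed: events occur late (time 9); the rest are censored at time 10.
unexposed_times = [9] * 5 + [10] * 5
unexposed_events = [1] * 5 + [0] * 5

# Exposure by "ever had the event": 5/10 versus 5/10.
chi2 = chi2_2x2(5, 5, 5, 5)

# Rates that use the follow-up time: 5/55 versus 5/95.
rate_ratio = (event_rate(exposed_times, exposed_events)
              / event_rate(unexposed_times, unexposed_events))
```

Here chi2 is 0 while the rate ratio is 95/55, about 1.7, so the
chi-square test sees no association that the time-to-event analysis
would pick up.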
______________________________________________
Kieran McCaul MPH PhD
WA Centre for Health & Ageing (M573)
University of Western Australia
Level 6, Ainslie House
48 Murray St
Perth 6000
Phone: (08) 9224-2701
Fax: (08) 9224 8009
email: [email protected]
http://myprofile.cos.com/mccaul
http://www.researcherid.com/rid/B-8751-2008
______________________________________________
Man is a credulous animal, and must believe something; in the absence of
good grounds for belief,
he will be satisfied with bad ones. Bertrand Russell
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Polis,
Chelsea B.
Sent: Thursday, 11 June 2009 2:04 AM
To: [email protected]
Subject: st: Choosing control variables in survival analysis
Dear Statalisters,
I am doing a survival analysis, which uses multiple records per
individual to incorporate
time-varying exposure information.
I plan on building two multivariate Cox regression models: (Model A)
includes potential
confounders based strictly on statistical significance, and (Model B)
includes those variables
plus others I think should be included based on theoretical concerns or
comparability to previous studies.
I have been including a variable in Model A if a variable is associated
with my dichotomous
exposure (in chi2 tests) and time to event (in univariate Cox
regression).
However, I am not sure whether to include it in Model A if it is NOT
significantly associated with
time to event in Cox regression, but IS significantly associated with
having HAD the event in a
chi2 test?
Your thoughts would be much appreciated.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/