st: time efficient way to choose variables
I have data in which I want to pick out the variables associated with
developing a disease. I run each candidate covariate in its own stcox
model with the foreach command, take the one variable with the highest
z value among those with a p value <0.05, and add it to the base model
for the next round. I repeat this until no remaining variable has a
p value <0.05 when I run the models with the foreach command.
Here is an example:
* First run: PAC1 plus each candidate covariate, one model per covariate
foreach var of varlist agegrp racecode1 s_sex1 ses_pov ajcc6seer6_1 ///
    sizeband pnnumb grade_s lung4 comorbid treat2r xrt3 seer1 dxyear_cate {
    stcox PAC1 `var'
}
I then choose the variable with the highest z score among those with a
p value <0.05 and run the models again. Here comorbid had the highest z
score, so it is taken out of the candidate list and added to the second
model:
* Second run: comorbid is now fixed in the model
foreach var of varlist agegrp racecode1 s_sex1 ses_pov ajcc6seer6_1 ///
    sizeband pnnumb grade_s lung4 treat2r xrt3 seer1 dxyear_cate {
    stcox PAC1 comorbid `var'
}
Third run:
Sizeband was chosen in the second run because it had the highest z score
with a p value <0.05, so it was added to the model:
* Third run: comorbid and sizeband are now fixed in the model
foreach var of varlist agegrp racecode1 s_sex1 ses_pov ajcc6seer6_1 ///
    pnnumb grade_s lung4 treat2r xrt3 seer1 dxyear_cate {
    stcox PAC1 comorbid sizeband `var'
}
I do this until there are no more variables with a p value <0.05 to
choose from.
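The rounds described above amount to one loop that refits the candidate models, keeps the best covariate, and stops when nothing qualifies. The following is only a minimal, untested sketch of that idea, assuming the data are already stset and that PAC1 stays in every model. The local macro names (candidates, selected, best, bestz, done) are illustrative, and the p value is computed from the Wald z statistic, which is an assumption about how the p values above were read off the stcox output.

```stata
* Sketch only: forward selection by largest |z| among covariates with p < 0.05.
* Assumes the data are stset and PAC1 is kept in every model.
local candidates agegrp racecode1 s_sex1 ses_pov ajcc6seer6_1 sizeband ///
    pnnumb grade_s lung4 comorbid treat2r xrt3 seer1 dxyear_cate
local selected
local done 0

while !`done' & "`candidates'" != "" {
    local best
    local bestz 0
    foreach var of local candidates {
        * Fit the current model plus one candidate covariate
        quietly stcox PAC1 `selected' `var'
        * Wald z statistic and two-sided p value for the candidate
        local z = abs(_b[`var'] / _se[`var'])
        local p = 2 * normal(-`z')
        if `p' < 0.05 & `z' > `bestz' {
            local best `var'
            local bestz `z'
        }
    }
    if "`best'" == "" {
        * No remaining candidate has p < 0.05: stop
        local done 1
    }
    else {
        * Move the winner from the candidate list into the model
        local selected `selected' `best'
        local candidates : list candidates - best
        display "added `best' (|z| = " %6.2f `bestz' ")"
    }
}

* Final model with all selected covariates
stcox PAC1 `selected'
```

Run from a do-file, since the `///` line continuation does not work at the interactive prompt. Stata's built-in stepwise prefix with the pe(0.05) and lockterm1 options may cover the same ground with less code, although it selects on the p value rather than on the z score directly.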
1. My question is how I can do this process quickly and
time-efficiently.
Do I use an array? Can you show me how to do this?
2. Is there also a time-efficient way to look for effect modifiers
among several variables (one at a time, in separate models) using the
likelihood ratio test?
Thanks.