I am analysing a dataset of 300,000 people in 90 hospitals. My outcome
variable is elective caesarean and I'm looking at four independent
variables: age, gestation, deliveries per doctor (deldoc) and doctors per
bed (docbed). Below I have reproduced a small sample of output. As you can
see, neither deldoc nor docbed is individually significant in the model,
but when I test whether they are jointly significant (using "test"), I get
a p-value of 0.0175. My question is: should I include neither, either or
both of them in my final model? They are clearly related to each other, and
if I include either one without the other, it has a p-value of less than
0.05.
Many thanks for your help
Bernadette
xi: logistic elec startage i.gestat deldoc docbed , robust cluster(provid)
i.gestat          _Igestat_4-6        (naturally coded; _Igestat_4 omitted)

Logit estimates                                   Number of obs   =     193530
                                                  Wald chi2(5)    =    1566.63
                                                  Prob > chi2     =     0.0000
Log likelihood = -31761.251                       Pseudo R2       =     0.1203

                          (standard errors adjusted for clustering on provid)
------------------------------------------------------------------------------
             |               Robust
        elec | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    startage |   1.081758   .0030363    28.00   0.000     1.075823    1.087725
  _Igestat_5 |   .2074252   .0160341   -20.35   0.000     .1782638    .2413569
  _Igestat_6 |   .0801571   .0061749   -32.76   0.000     .0689239     .093221
      deldoc |   1.000359   .0002938     1.22   0.221     .9997835    1.000935
      docbed |   .9896782   .0057416    -1.79   0.074     .9784886    1.000996
------------------------------------------------------------------------------
r; t=9.26 11:56:36
. test deldoc

 ( 1)  deldoc = 0.0

           chi2(  1) =    1.50
         Prob > chi2 =    0.2214
r; t=0.00 11:56:45

. test docbed

 ( 1)  docbed = 0.0

           chi2(  1) =    3.20
         Prob > chi2 =    0.0737
r; t=0.00 11:56:51

. test docbed deldoc

 ( 1)  docbed = 0.0
 ( 2)  deldoc = 0.0

           chi2(  2) =    8.10
         Prob > chi2 =    0.0175
r; t=0.00 11:56:56
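
For reference, here is a sketch of the commands behind the statements in this post, assuming the same variable names as in the output. Only the full model and the "test" commands appear in the log. The two single-variable fits and the correlation check are reconstructions, not copied from the session, and no output from them is shown here.

```stata
* Sketch only: reconstructed commands, not copied from the original log.

* Each hospital-level variable on its own, with the same options as the
* full model (the post reports p < 0.05 for each when fitted this way).
xi: logistic elec startage i.gestat deldoc, robust cluster(provid)
xi: logistic elec startage i.gestat docbed, robust cluster(provid)

* A rough check of how strongly the two variables are related.
* Note: this is computed over patient records, so large hospitals
* carry more weight than small ones.
correlate deldoc docbed

* Full model, followed by the joint Wald test of both coefficients.
xi: logistic elec startage i.gestat deldoc docbed, robust cluster(provid)
test deldoc docbed
```

Comparing the individual and joint results side by side should show whether the two variables carry largely the same information, which is the usual reason for a significant joint test alongside non-significant individual tests.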