Hi,
I'm a graduate student who is new to Stata. For my thesis, I'm trying to figure out how to test nested models when I have to use robust standard errors. Stata tells me that I can't use lrtest, and I understand why: the likelihood-ratio test relies on a true maximum likelihood, whereas a model fit with robust (clustered) standard errors reports only a pseudolikelihood. So what should one use instead?
Here's what I did. Having run an initial backward stepwise logistic regression with pr(0.2), I would like to manually build a parsimonious model with the best possible fit. I assume that Stata applies some decision rule to drop variables during the stepwise procedure; is that the rule I should use when I drop them manually? What is Stata's decision rule for stepwise logistic regression with robust standard errors?
I found nothing in the manual and nothing helpful after extensive searching on the web.
Thanks so much!
Magda Szumilas
**************************************
An example below:
. xi: sw logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat edudadcat sexriskcat anysmoke if sex==1, cluster(site) pr(0.2)
i.grade _Igrade_10-12 (naturally coded; _Igrade_10 omitted)
begin with full model
p = 0.6664 >= 0.2000 removing markcat
p = 0.6006 >= 0.2000 removing edumomcat
p = 0.5856 >= 0.2000 removing _Igrade_12
p = 0.2054 >= 0.2000 removing sexorcat
p = 0.2113 >= 0.2000 removing _Igrade_11
p = 0.2592 >= 0.2000 removing partcat
Logistic regression                               Number of obs   =        580
                                                  Wald chi2(1)    =          .
                                                  Prob > chi2     =          .
Log pseudolikelihood = -266.26595                 Pseudo R2       =     0.0691

                                   (Std. Err. adjusted for 3 clusters in site)
------------------------------------------------------------------------------
             |               Robust
      usemh3 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     livecat |   .5896426   .1786052    -1.74   0.081      .325654    1.067631
   edudadcat |   1.602875   .1808557     4.18   0.000     1.284863    1.999597
  sexriskcat |   .4266733   .0246379   -14.75   0.000     .3810162    .4778014
    anysmoke |   2.502815    .266854     8.60   0.000     2.030824    3.084503
------------------------------------------------------------------------------
. estimates store full
. xi: sw logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat edudadcat anysmoke if sex==1, cluster(site) pr(0.2)
i.grade _Igrade_10-12 (naturally coded; _Igrade_10 omitted)
begin with full model
p = 0.6856 >= 0.2000 removing markcat
p = 0.5475 >= 0.2000 removing _Igrade_12
p = 0.2756 >= 0.2000 removing sexorcat
p = 0.2803 >= 0.2000 removing partcat
p = 0.2756 >= 0.2000 removing _Igrade_11
Logistic regression                               Number of obs   =        600
                                                  Wald chi2(1)    =          .
                                                  Prob > chi2     =          .
Log pseudolikelihood =  -284.2349                 Pseudo R2       =     0.0489

                                   (Std. Err. adjusted for 3 clusters in site)
------------------------------------------------------------------------------
             |               Robust
      usemh3 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     livecat |   .7364448   .1749249    -1.29   0.198     .4623359    1.173067
   edudadcat |   1.366027   .2559876     1.66   0.096     .9461231     1.97229
   edumomcat |    1.35079    .278478     1.46   0.145      .901788     2.02335
    anysmoke |   2.571288   .1562286    15.54   0.000     2.282615    2.896468
------------------------------------------------------------------------------
. lrtest full
LR test likely invalid for models with robust vce
r(498);
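
One possible way around the lrtest error, offered as a sketch rather than a definitive answer: Wald tests remain available after estimation with a clustered vce, and Stata's stepwise documentation describes the Wald test as the default test for removing terms (the lr option requests likelihood-ratio tests instead). The stepwise log above also suggests the rule being applied: at each step, the term with the largest p-value is removed if that p-value is at least the pr() threshold. The commands below reuse the variable names from the example; which terms to test is an assumption made purely for illustration.

```stata
* Fit the larger model once, on one fixed sample, with clustered standard errors
xi: logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat edudadcat sexriskcat anysmoke if sex==1, cluster(site)

* Wald test of a single candidate term; compare the p-value with the pr() threshold
test markcat
display "p = " r(p)

* Joint Wald test that several terms are all zero (the nested-model comparison);
* the choice of markcat and edumomcat here is illustrative only
testparm markcat edumomcat

* Joint Wald test of the grade indicators created by xi
testparm _Igrade_*
```

Because both hypotheses are tested within a single fitted model, the comparison uses the same observations throughout, unlike two separately fitted models whose sample sizes differ. With only 3 clusters, the robust variance estimate supports very few joint constraints, so testparm may drop some of them, and the resulting p-values should be read with caution.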