Ricardo Ovaldia wrote:
Fri, 30 Jul 2004
My understanding is that the three commands -clogit-,
-xtlogit- and -logistic, cluster- can all be used to
perform logistic regression when observations
(outcomes) are sampled within groups (i.e., correlated
data where the independence assumption is violated).
After reading the manual entries for these commands
and examining Hosmer and Lemeshow's book, I have been
unable to determine under what conditions these models
are appropriate, and in frustration I am writing to ask
if anyone knows of a document, website, etc. that
compares and contrasts these methods.
Tue, 3 Aug 2004
I am comparing the hospital referral rate of CHD
patients by 50 physicians. I would like to model the
rate using both patient and physician characteristics.
Because patient referrals are most likely not
independent within physician, I thought to use
conditional logistic regression grouping on physician
ID to model the outcome. The problem with this
approach is that because physician characteristics do
not vary within physician they are dropped. I
decided, therefore, to use random effects logistic
regression, -xtlogit- instead. My concern is that I am
not sure that I can correctly justify this approach
solely on the above argument. Does anyone have any
thoughts or literature I can read regarding this
justification, or am I way off track?
--------------------------------------------------------------------------------
-logit, cluster()- produces the same results as -xtgee, family(binomial)
link(logit) corr(independent) robust- (this came up on the list last month in
the context of -mlogit, cluster()-), so I would recommend avoiding that approach
in circumstances in which population-averaged GEE would not be ideal. There
are those who would say that GEE is never ideal, but even among its adherents,
most would caution that, with only 50 physicians, GEE would be a little dicey.
-xtlogit, fe- would help see the influence of patient characteristics upon a
physician's inclination to refer, while, in a sense, controlling for physician
characteristics. (A physician whose patients all share the same values of the
patient-level predictors, or whose patients were all referred or all not
referred, would contribute nothing, so that physician's entire caseload would
effectively be dropped.)
As you mention, because physician's characteristics do not vary within a
physician, -xtlogit, fe- doesn't seem to be the way to go to explore both
patient and physician characteristics together.
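As a minimal sketch of that fixed-effects fit (the variable names referred,
pt_age, pt_female, md_years and physid are hypothetical, not taken from
Ricardo's data), something along these lines would estimate only the
patient-level coefficients:

    * hypothetical names: referred = 0/1 outcome, physid = physician ID
    xtlogit referred pt_age pt_female, i(physid) fe nolog

    * a physician-level covariate (hypothetical md_years) is constant within
    * physid, so Stata should drop it from the conditional model
    xtlogit referred pt_age pt_female md_years, i(physid) fe nolog

-clogit referred pt_age pt_female, group(physid)- should fit the same
conditional model.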
-xtlogit, re- would seem to be the remaining alternative available in Stata,
unless I'm overlooking something. Cautions would be similar to the case with
GEE. The number of physicians is limited. If there is a substantial
correlation between the fixed effects (physician covariates) and the random
effect, then the parameters are liable not to be consistently estimated.
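A rough sketch of how one might fit the random-effects model and probe those
two cautions, again with hypothetical variable names (referred, pt_age,
pt_female, md_years, md_female, physid): -quadchk- checks whether the
quadrature approximation is stable, and -hausman- compares the patient-level
coefficients from the fixed- and random-effects fits. I am assuming -hausman-
accepts stored -xtlogit- results in the usual way, and with only 50 physicians
the test is at best a rough guide.

    * fixed-effects fit: patient covariates only
    quietly xtlogit referred pt_age pt_female, i(physid) fe
    estimates store fe_fit

    * random-effects fit: patient and physician covariates
    xtlogit referred pt_age pt_female md_years md_female, i(physid) re nolog
    estimates store re_fit

    * refit with different numbers of quadrature points; large relative
    * differences suggest the estimates are unreliable
    quadchk, nooutput

    * large differences in the shared (patient) coefficients would point to
    * correlation between the random effect and the covariates
    hausman fe_fit re_fit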
Joseph Coveney
--------------------------------------------------------------------------------
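* Demonstration: -xtgee- with an independent working correlation and robust
* standard errors gives the same results as -logit, cluster()-.
* Requires the user-written -rndbinx- command (see -findit rndbinx-).
clear
set more off
local seed = date("2004-08-05", "ymd")
set seed `seed'
macro drop seed
* 400 subjects, each with its own success probability mu
set obs 400
generate int pid = _n
generate float mu = uniform()
generate byte den = 1
* Six binary outcomes per subject; -rndbinx- leaves each draw in bnlx
forvalues i = 1/6 {
    rndbinx mu den
    rename bnlx dep`i'
}
compress
* Add noise to the predictor, then reshape to one row per subject-occasion
replace mu = mu + invnorm(uniform())
drop den
reshape long dep, i(pid) j(tim)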
xi: xtgee dep i.tim mu, i(pid) family(binomial) link(logit) ///
    corr(independent) robust nolog
xi: logit dep i.tim mu, cluster(pid) nolog
exit