Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: Re: sensitivity and specificity after xtgee
From
"Joseph Coveney" <[email protected]>
To
<[email protected]>
Subject
st: Re: sensitivity and specificity after xtgee
Date
Mon, 17 Sep 2012 14:33:51 +0900
Mohammadreza Mohebbi wrote:
Is there any post-estimation option after running xtgee/ xtlogit (for
binary longitudinal regression) for calculating
sensitivity, specificity, positive and negative predictive values,
positive and negative likelihood ratios and ROC curve?
I used xtgee to fit binary regression for wheez syndrome with 3
measurements (baseline, first year and second year follow-ups) with
PD20 as exposure and broadcat and smoker_current as covariates:
xtset h_id h_int_no
xtgee wheez_symp pd_20_base i.broadcat smoker_current ,
corr(exchangeable) fam(binom) link(log/logit) eform robust
--------------------------------------------------------------------------------
-findit sensitivity- doesn't bring up anything mentioning -xt- commands except
for a couple of user-written commands (-metandi-, -midas-) that are intended to
be used in the context of meta-analysis of a collection of diagnostic
effectiveness studies. References cited in their help files might give you some
insight into computing a summary ROC AUCs and the like after fitting
hierarchical logistic models.
In your case, though, couldn't you use -predict , mu- and go from there?
Something like that below? Because "broadcat" and "smoker_current" are
time-varying, you'd need to compute sensitivity etc. at each time point.
Joseph Coveney
version 11.2
clear *
set more off
drawnorm u pd20, mean(0 180) corr(1 -0.5 \ -0.5 1) ///
sd(1 180) seed(`=date("2012-09-17", "YMD")') n(300) // !
generate int patient_nr = _n
local a 1
local b 4
forvalues visit = 1/3 {
generate byte broad_category`visit' = ///
floor((`b' - `a' + 1) * runiform() + `a')
generate byte smokes`visit' = floor(2 * runiform())
generate double xb`visit' = ///
0.0001 * pd20 + ///
0.375 * (2.5 - broad_category`visit') + ///
0.375 * (smokes`visit' - 0.5) + ///
1 * (`visit' - 2 ) + /// !
u + rnormal()
}
quietly reshape long broad_category smokes xb, i(patient_nr) j(visit)
label define Visits 1 Baseline 2 "First year" 3 "Second year"
label values visit Visits
set seed0 `=date("2012-09-17", "YMD")'
genbinomial wheezes, xbeta(xb) // -findit genbinomial- to install
preserve
*
* A little more difficult
*
xtgee wheezes c.pd20 i.(broad_category smokes), ///
i(patient_nr) family(binomial) link(logit) eform ///
corr(independent) robust nolog
predict double pr, mu
// AUCs of ROC curves
tempname B
matrix define `B' = e(b)
forvalues visit = 1/3 {
display in smcl as result _newline(2) "`: label (visit) `visit''"
lroc wheezes if visit == `visit', beta(`B') nograph
}
// The rest
tempname my_favorite_threshold
scalar define `my_favorite_threshold' = 0.5
generate byte TN = pr < `my_favorite_threshold' & !wheezes
generate byte FN = pr < `my_favorite_threshold' & wheezes
generate byte TP = pr >= `my_favorite_threshold' & wheezes
generate byte FP = pr >= `my_favorite_threshold' & !wheezes
assert (TN + FN + TP + FP) == 1
collapse (sum) TN = TN FN = FN TP = TP FP = FP, by(visit)
generate double sensitivity = TP / (TP + FN)
generate double specificity = TN / (TN + FP)
generate double likelihood_ratio_positive = sensitivity / (1 - specificity)
generate double likelihood_ratio_negative = (1 - sensitivity) / specificity
generate double positive_predictive_value = TP / (TP + FP)
generate double negative_predictive_value = TN / (TN + FN)
format sensitivity-negative_predictive_value %04.2f
local linesize `c(linesize)'
set linesize 200
list, noobs abbreviate(30)
set linesize `linesize'
*
* A little easier
*
restore
logistic wheezes c.pd20 i.(broad_category smokes), ///
cluster(patient_nr) nolog
forvalues visit = 1/3 {
display in smcl as result _newline(2) "`: label (visit) `visit''"
lroc wheezes if visit == `visit', nograph
estat classification if visit == `visit', ///
cutoff(`=`my_favorite_threshold'')
}
exit
Prof. Gary Koch once mentioned in a seminar I attended years ago that if you're
going to specify robust anyway ("empirical", I think it's called in SAS's PROC
GENMOD, which is what he uses) in order to cover your backside for misspecified
covariance structure, then you might as well specify independence for it. It's
least likely to give rise to convergence problems.
Assuming that he wasn't just playing devil's advocate for the sake of
stimulating discussion (covariance structure specification does affect
regression coefficients at least a little, after all), if you're going to fit a
marginal model, then the easiest way for you would be to use -logistic- or
-logit- (with the -cluster()- option, not that it makes any difference to point
estimates), and take advantage of the postestimation commands Stata has for
them.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/