Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: Generating correlated binary data for given ICC
From
Georgia Ntani <[email protected]>
To
[email protected]
Subject
st: Generating correlated binary data for given ICC
Date
Fri, 30 Nov 2012 10:18:40 +0000
Dear Statalisters
I apologise for posting the same question for a second time. However,
since I did not get any reply I thought it may be because of how I have
set up my question and I thought I should simplify it.
I am trying to generate binary correlated data for fixed parameter β
(effect of covariates on the outcome), number and size of clusters, and
ICC. For this, I follow the algorithm suggested by Santos et al (Santos
et al. Estimating adjusted prevalence ratio in clustered cross-sectional
epidemiological data. BMC Medical research methodology). Below is my
Stata syntax for this with corresponding comments.
/* Set up 10 clusters with 30 obs per cluster */
set obs 300
gen cluster=1 if _n<=30
forvalues i=2(1)10 {
replace cluster=`i' if _n<=`i'*30 & cluster==.
}
/* Level 2 binary explanatory variable x1 – equal number of clusters
each category */
gen x1=1 if cluster<=5
replace x1=0 if x1==.
/* Level 1 continuous independent variable X2ij//from a Normal(0,1)
distribution */
generate x2 = invnorm(uniform())
/* Generate error term. That will be a normal variable such that for
given cluster j, uoj ~ N(0,sigma_u^2 ), where uoj and uoj' are
independent for j ≠j'. The intraclass correlation coefficient (ICC) is
defined as (sigma_u^2 )/(sigma_u^2 +(pi^2 /3)) */
preserve
/* From the formula above,if ICC=0.03 then sigma_u will be
sqrt(0.01*_pi^2)/0.97 */
di sqrt(0.01*_pi^2)/0.97
set seed 54325821
clear all
corr2data u01 u02 u03 u04 u05 u06 u07 u08 u09 u010, n(30) ///
means(0 0 0 0 0 0 0 0 0 0) sds(.319 .319 .319 ///
.319 .319 .319 .319 .319 .319 .319)
gen sn=_n
reshape long u0 , i(sn) j(cluster)
drop sn
sort cluster u0
save randomerror.dta, replace
restore
sort cluster
merge cluster using randomerror
drop _merge
/* Probability of the outcome from the random effects logistic model */
gen prob=exp(x1*0.47+x2*0.3665+u0)/(1+exp(x1*0.47+x2*0.3665+u0))
/* Binary outcome from the Bernoulli distribution with probability prob */
gen u=runiform()
gen y = (u > prob)
What I don’t understand is why when I run the above syntax, and then the
command for multilevel logistic
xtlogit y i.x1 x2, i(cluster) quad(30) or
I get such a low sigma_u (approximately zero and thus a very low rho)
when I should be getting sigma_u=0.319
Any help on this would be greatly appreciated.
Many thanks
Georgia
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/