Re: st: sample size calculation for binary repeated measures
From: Laura Gibbons <[email protected]>
To: Statalist <[email protected]>
Subject: Re: st: sample size calculation for binary repeated measures
Date: Mon, 17 Sep 2007 10:40:16 -0700 (PDT)
Thank you, Joseph; this code was very timely, as I am just setting out to
do something similar today.
I also had to install -ridder-, in addition to -ovbd-, to get this to work;
-ridder- is called by -ovbd-.
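In case it helps anyone else setting this up, both packages can be installed
from within Stata. -ovbd- is on SSC, as Joseph notes below; whether -ridder-
is also there is my assumption, and -findit- will locate it if -ssc install-
comes up empty:

    ssc install ovbd      // correlated binary variate generator (SSC)
    ssc install ridder    // root-finder called by -ovbd-, if it is on SSC
    findit ridder         // otherwise, locate and install it from here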
Many thanks,
Laura
On Sun, 16 Sep 2007, Joseph Coveney wrote:
Louis Apicella wrote:
I am trying to determine the sample size for a binary repeated measures
study in which we'll collect baseline data from two groups, intervention and
control, then expose the intervention group, followed by collecting data
twice more post-intervention from both groups.
I've been using this code to come up with sample sizes:
sampsi .45 .55, sd1(.5) sd2(.5) power(.8) method(ancova) pre(1) post(2)
r1(.5) r01(.5)
My question is that I have only proportions to work with, and I'm not sure
that sampsi is designed to use proportions when we specify sd1() and sd2().
I also feel that the variance should be closer to .1 for each sample, but
then my sample size is ridiculously small (n=8). Essentially, what I want
to be able to do is a comparison of the change in the intervention group to
the change in the control group. Any ideas?
--------------------------------------------------------------------------------
From the dialogue screens, it seems that -sampsi- only works with continuous
data for repeated measurements. The -sd1() sd2()- options wouldn't be
invoked when using -sampsi- with proportions.
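(For the record, the cross-sectional two-sample comparison of proportions
that -sampsi- does handle would look like the line below; without sd1() and
sd2(), the two arguments are interpreted as proportions. This is only a
sketch and ignores the repeated measurements entirely.)

    sampsi 0.45 0.55, power(0.8)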
There might very well be an analytical method in the literature for power
analysis for your situation; Roger Newson has already mentioned one
approach. But have you considered using simulation as another? See
www.stata.com/support/faqs/stat/power.html and
www.stata-press.com/journals/sjabstracts/st0010.pdf for some contributions
by Alan Feiveson on the topic.
If an analytical method that fits the bill is readily available and facile,
then Monte Carlo simulation won't be attractive, but the latter is always
doable and often competitive with a trip to the library (and perhaps even
with JSTOR) with respect to overall time spent. In your case, you can
assess sensitivity toward assumptions regarding correlation, check out link
functions, and explore population-average versus subject-specific modeling
approaches, as well. See the do-file below for an example of link function
exploration using GEE with your ANCOVA-like setup and assumptions for
proportions and correlations; it displays power ("Mean" in -summarize-'s
output) for the canonical link function ("A") and identity link function
("B") as it climbs through a series of three candidate treatment group
sample sizes under the alternative hypothesis, and confirms the Type I error
rate (displaying it in the same way as it does power) using the smallest of
the probe per-group sample sizes under the null hypothesis. (The
illustration do-file requires a user-written command, -ovbd-, that you can
download from the SSC archive.)
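(One small addition that is not in the do-file below: since each rejection
indicator comes from 1,000 replications, you can gauge the simulation error
of the power estimates by tacking an exact binomial confidence interval onto
each -summarize- call, for example

    ci A_pos B_pos, binomial

where A_pos and B_pos are the rejection indicators generated in the loop.)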
With a subject-specific model (-xtlogit , re- or -gllamm-), the simulation
approach will demand increasing patience with increasing integration points.
(It's not included below because the program as originally written
used -xtlogit , re-, but you can often take advantage of -contract- and
[fweight =] to speed things up.) Although I took a chance in the do-file
below (especially with Model B), putting a limit on the number of iterations
is a good idea when using an iterative analytical method in the simulation
program.
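(Here is a rough sketch of that -contract-/[fweight =] speed-up, not part of
the original do-file; it assumes you apply it to the wide data right after
the -append- step in the simulation program, and the -xtlogit- call simply
mirrors the commented-out line in the do-file, with freq(wgt) being my own
choice of name:

    contract trt bas res0 res1, freq(wgt)
    generate int pid = _n
    reshape long res, i(pid) j(tim)
    generate byte bia = trt * bas
    generate byte tia = trt * tim
    xtlogit res bas trt tim bia tia [fweight = wgt], i(pid) re ///
        intmethod(mvaghermite) intpoints(30)

With only binary variables in the wide data, the subjects collapse to at most
sixteen distinct patterns, and the frequency weight is constant within each
reconstructed panel, which is what -xtlogit- requires of fweights.)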
Note that the illustration do-file and output shown below shouldn't be
construed as advocating a risk difference (identity link) model, but if the
observed proportions are all in the neighborhood of 50% as you expect, then
you might be lucky enough to get away with it.
Joseph Coveney
clear *
set more off
set seed `=date("2007-09-16", "YMD")'
*
capture program drop binsimem
program define binsimem, rclass
    // simulate one trial with n subjects per group and return p-values for
    // the joint treatment-effect test under a logit-link GEE (Model A) and
    // an identity-link GEE (Model B)
    version 10
    syntax , n(integer) [Corr(real 0.5) NUll]
    tempname Control Experimental Correlation
    tempfile tmpfil0
    // proportions: 50% at baseline in both groups; 55% vs. 45% at the two
    // post-intervention visits under the alternative, no difference under
    // the null
    if ("`null'" == "") {
        matrix define `Control' = (0.5, 0.55, 0.55)
        matrix define `Experimental' = (0.5, 0.45, 0.45)
    }
    else {
        matrix define `Control' = (0.5, 0.5, 0.5)
        matrix define `Experimental' = (0.5, 0.5, 0.5)
    }
    // exchangeable correlation among the three repeated measurements
    matrix define `Correlation' = J(3, 3, `corr') + ///
        I(3) * (1 - `corr')
    drop _all
    // -ovbd- (SSC) draws correlated binary variates with the given means
    // and correlation structure; one call per treatment group
    ovbd bas res0 res1, means(`Control') ///
        corr(`Correlation') n(`n') clear
    generate byte trt = 0
    save `tmpfil0'
    drop _all
    ovbd bas res0 res1, means(`Experimental') ///
        corr(`Correlation') n(`n') clear
    generate byte trt = 1
    append using `tmpfil0'
    erase `tmpfil0'
    generate int pid = _n
    // keep baseline (bas) as a covariate; stack the two post-intervention
    // responses (res0, res1) long as res, indexed by tim
    reshape long res, i(pid) j(tim)
    generate byte bia = trt * bas
    generate byte tia = trt * tim
    // Model A: canonical (logit) link
    xtgee res bas trt tim bia tia, i(pid) t(tim) ///
        family(binomial) link(logit) corr(exchangeable)
    test trt bia tia
    return scalar A_p = r(p)
    /* xtlogit res bas trt tim bia tia, i(pid) re ///
        intmethod(mvaghermite) intpoints(30) */
    // Model B: identity (risk difference) link
    xtgee res bas trt tim bia tia, i(pid) t(tim) ///
        family(binomial) link(identity) corr(exchangeable)
    test trt bia tia
    return scalar B_p = r(p)
end
*
// power: 1,000 replications at each of three candidate per-group sample
// sizes; "Mean" in -summarize-'s output is the simulated power at alpha = 0.05
forvalues n = 250(50)350 {
    quietly simulate A_p = r(A_p) ///
        B_p = r(B_p), reps(1000): binsimem , n(`n')
    display in smcl as text _newline(1) "n = " as result %3.0f `n'
    capture noisily assert !missing(A_p) & !missing(B_p)
    generate byte A_pos = (A_p < 0.05)
    generate byte B_pos = (B_p < 0.05)
    summarize *_pos
}
// Type I error rate: same simulation under the null hypothesis at the
// smallest of the probe per-group sample sizes
local n 250
quietly simulate A_p = r(A_p) B_p = r(B_p), ///
    reps(1000): binsimem , n(`n') null
display in smcl as text _newline(1) "NULL n = " as result %3.0f `n'
capture noisily assert !missing(A_p) & !missing(B_p)
generate byte A_pos = (A_p < 0.05)
generate byte B_pos = (B_p < 0.05)
summarize *_pos
exit
Edited results:
n = 250
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
A_pos | 1000 .754 .430894 0 1
B_pos | 1000 .768 .4223202 0 1
n = 300
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
A_pos | 1000 .825 .3801572 0 1
B_pos | 1000 .833 .3731625 0 1
n = 350
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
A_pos | 1000 .877 .3286016 0 1
B_pos | 1000 .884 .3203852 0 1
[Snip]
NULL n = 250
[Snip]
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
A_pos | 1000 .046 .2095899 0 1
B_pos | 1000 .051 .2201078 0 1
. exit
end of do-file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Laura E. Gibbons, PhD
General Internal Medicine, University of Washington
Box 359780
Harborview Medical Center
325 Ninth Avenue, Seattle, WA 98104
phone: 206-744-1842 fax: 206-744-9917
Office address: 401 Broadway, Suite 5122
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/