Andrew McIntosh posted:
--------------------------------------------------------------------------------
I (will) have a sample of 30 pairs of people and I want to calculate the
power to detect a difference of 5, 10, or 15 events between the two
(paired ) groups, assuming the control group have an event rate of about
50%.
Unforunately, the 'sampsi' command doesn't seem to extend to matched
pairs and I wondered if there was a method of completing this
analysis using another STATA command? Thank you for any help...
--------------------------------------------------------------------------------
Simulation might be worth considering here. Alan Feiveson wrote an article in
_Stata Journal_ last year (Volume 2, Issue 2) that might be of interest.
As a first-pass crude illustration of its potential here, I've appended a
rudimentary Monte Carlo do-file to this posting. It estimates power for two
tests, McNemar's test and random-effects probit regression, for a given level
of Type 1 error rate for a given correlation coefficient between the variables
for the sample size of 30 pairs of binary outcomes and a given delta of 5, 10
or 15 events. (Based upon Andrew's casting of the control group event rate as
a proportion, I'm assuming that the observation intervals or exposures are the
same for everyone.)
I've parameterized the underlying generating function in probit terms--see J.
S. Long, _Regression Models for Categorical and Limited Dependent Variables_
(Thousand Oaks, Calif.: Sage, 1997), pp. 40-50. This was because specifying
correlated binomial random variates seems feasible that way. (There is an
alternative, apparently easily implemented algorithm for generating correlated
binomial random variates by A. D. Lunn and S. J. Davies, A note on generated
correlated binary variables. _Biometrika_ 85:487-90, 1998.)
The do-file uses -simulate-. The called program expects options for the delta
(del: 5, 10 or 15 in Andrew's case), whether the likelihood ratio test for -
xtprobit- is desired (lrt), the tetrachoric correlation coefficient (cor) and
the level of the Type 1 error rate (sig). The first of the three returned
results (st_pos) is the number of statistically significant test results by
McNemar's test (which is equivalent to the sign test), and the second (lr_pos)
is the number of statistically significant test results for a likelihood ratio
test after -xtprobit-. The averages of each of these, as reported by -
summarize-, gives the power of each test under the conditions specified. The
third returned result (rho) is the estimate of the rho parameter as reported by
-xtprobit-.
Although the phenomenon being simulated doesn't need to be regarded as a
manifestation of a latent trait or latent variable, a tetrachoric correlation
coefficient needs to be specified in order to perform this do-file's
simulation. In the do-file below, I've spanned correlation coefficients from
50% through 70% to an improbable 90%. The number of repetitions is set to 400
in order to get an idea of the tests' performances sooner than with a large
number.
I've included a redacted output after the do-file. You can see that the test
size for the two tests when the delta (del) is set to zero, and their relative
power when del is 5 or 10. I've trapped cases from going on to -xtprobit-
where the observed proportion is zero or one for the test group, since maximum
likelihood methods don't converge reliably in such cases; this ended up
trapping all of the cases when the delta was 15 (expected proportions of zero
or one).
It is often stated that maximum likelihood methods are too far from asymptotic
conditions for use with small sample sizes, so -xtprobit- would seem a dicey
proposition with only 30 pairs of observations, but the test size--except in
the extreme-correlation case--is about 10%, which is what I set the level of
Type 1 error rate at here.
Joseph Coveney
--------------------------begin crude illustrative do-file----------------------
program define corbinsim, rclass
syntax , del(integer) [lrt cor(real 0.5) sig(real 0.05)]
local tes = invnorm( (15 + (2 * (uniform() > 0.5) - 1) ///
* `del') / 30 )
drawnorm pi0 pi1, means(0 `tes') sd(1 1) ///
corr(1 `cor' \ `cor' 1) n(30) clear
foreach var of varlist pi0 pi1 {
replace `var' = `var' > 0
}
signtest pi0 = pi1
return scalar st_pos = r(p_2) < `sig'
if "`lrt'" == "lrt" {
summarize pi1, meanonly
if r(mean) > 0 & r(mean) < 1 {
generate byte pid = _n
reshape long pi, i(pid) j(trt)
xtprobit pi trt, i(pid)
return scalar rho = e(rho)
estimates store A
xtprobit pi, i(pid)
lrtest A
return scalar lr_pos = r(p) < `sig'
}
else {
return scalar lr_pos = .
}
}
end
*
set more off
set seed 20030826
* power for Type 1 error rate of 10%
*
* 70% tetrachoric correlation coefficient
*
local lrt "lrt"
forvalues dif = 0(5)15 {
simulate "corbinsim, del(`dif') `lrt' cor(0.7) sig(0.10)" ///
st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
display
display as result `dif'
summarize st_pos lr_pos rho
}
*
* 50% correlation coefficient
*
forvalues dif = 0(5)15 {
simulate "corbinsim, del(`dif') `lrt' cor(0.5) sig(0.10)" ///
st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
display
display as result `dif'
summarize st_pos lr_pos rho
}
*
* 90% correlation coefficient
*
forvalues dif = 0(5)15 {
simulate "corbinsim, del(`dif') `lrt' cor(0.9) sig(0.10)" ///
st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
display
display as result `dif'
summarize st_pos lr_pos rho
}
exit
------------------------end crude illustrative do-file-------------------------
--------------------begin edited output from Results screen---------------------
. * power for Type 1 error rate of 10%
. *
. * 70% tetrachoric correlation coefficient
command: corbinsim , del(0) lrt cor(0.7) sig(0.10)
0
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 398 .0351759 .184456 0 1
lr_pos | 398 .120603 .3260753 0 1
rho | 398 .6789901 .196932 8.32e-07 .995425
command: corbinsim , del(5) lrt cor(0.7) sig(0.10)
5
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 400 .39 .4883608 0 1
lr_pos | 400 .5975 .4910158 0 1
rho | 400 .6923904 .2084408 8.32e-07 .9955038
command: corbinsim , del(10) lrt cor(0.7) sig(0.10)
10
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 398 .9371859 .2429336 0 1
lr_pos | 397 .9823678 .1317767 0 1
rho | 397 .8071645 .265319 8.32e-07 .995355
command: corbinsim , del(15) lrt cor(0.7) sig(0.10)
15
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 400 1 0 1 1
lr_pos | 0
rho | 0
. * 50% coefficient
command: corbinsim , del(0) lrt cor(0.5) sig(0.10)
0
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 399 .0325815 .1777614 0 1
lr_pos | 399 .1027569 .3040223 0 1
rho | 399 .4854253 .2270576 8.32e-07 .9953463
command: corbinsim , del(5) lrt cor(0.5) sig(0.10)
5
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 394 .3527919 .4784462 0 1
lr_pos | 394 .5050761 .5006099 0 1
rho | 394 .5023774 .2358462 8.32e-07 .9937819
command: corbinsim , del(10) lrt cor(0.5) sig(0.10)
10
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 391 .9002558 .3000426 0 1
lr_pos | 391 .9667519 .1795134 0 1
rho | 391 .6246353 .3290946 8.32e-07 .9937818
command: corbinsim , del(15) lrt cor(0.5) sig(0.10)
15
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 400 1 0 1 1
lr_pos | 0
rho | 0
. * 90% coefficient
command: corbinsim , del(0) lrt cor(0.9) sig(0.10)
0
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 390 .0282051 .1657711 0 1
lr_pos | 390 .1794872 .3842527 0 1
rho | 390 .8963201 .097665 .3548582 .9972232
command: corbinsim , del(5) lrt cor(0.9) sig(0.10)
5
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 396 .5075758 .500575 0 1
lr_pos | 396 .8257576 .3797977 0 1
rho | 396 .9252776 .1090395 .4977363 .9956096
command: corbinsim , del(10) lrt cor(0.9) sig(0.10)
10
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 400 .985 .1217047 0 1
lr_pos | 399 .9974937 .0500626 0 1
rho | 399 .971989 .0688614 .3453863 .9954332
command: corbinsim , del(15) lrt cor(0.9) sig(0.10)
15
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
st_pos | 400 1 0 1 1
lr_pos | 0
rho | 0
----------------------------end edited output-----------------------------------
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/