|
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
Re: st: permutations
On Mar 17, 2008, at 10:55 AM, Christopher Intemann wrote:
Hi Dave,
Am 17.03.2008 um 14:53 schrieb David Airey:
.
I calculated a test of a difference between any two SNP genotypes.
You are saying you want to know which pairwise comparisons are
different. If you just want those two probabilities (genotype 2 =
genotype 0, genotype 1 = genotype 0), you get these by adding a
couple lines to our code, and modifying the permute command as
below. To get the other comparison (genotype 2 = genotype 1), you
would need to use a different base comparison, or you would need to
use the command lincom.
***do_begin***
describe
table age caco snp
program plogistic, rclass
version 10
args caco snp age
xi: logistic `caco' i.`snp' i.`age'
test _Isnp_1 _Isnp_2
return scalar chi2 = r(chi2)
test _Isnp_1
return scalar chi2_1 = r(chi2)
test _Isnp_2
return scalar chi2_2 = r(chi2)
end
set seed 3132008
permute caco chi2=r(chi2) chi2_1=r(chi2_1) chi2_2=r(chi2_2), ///
reps(1000): plogistic caco snp age
***do_end***
Monte Carlo permutation results Number of obs
= 2394
command: plogistic caco snp age
chi2: r(chi2)
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: caco
------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95%
Conf. Interval]
-------------
+----------------------------------------------------------------
chi2 | .1560796 920 1000 0.9200 0.0086 .
9014202 .936058
chi2_1 | .1184664 724 1000 0.7240 0.0141 .
6951618 .7515127
chi2_2 | .1448991 719 1000 0.7190 0.0142 .
690024 .7466803
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}
.
Thank very much you for your efforts.
Your solution does exactly what I was looking for; figured it out
already:-)
The only thing is that I now have some trouble obtaining the full p-
value.
<snip>
Monte Carlo permutation results Number of obs
= 4356
command: plogistic affected_stata snp age ethno_x sex_x
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: affected_stata
------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf.
Interval]
-------------
+----------------------------------------------------------------
chi2_1 | 10.44501 2 1000 0.0020 0.0014 .
0002423 .0072058
chi2_2 | 16.76662 0 1000 0.0000 0.0000
0 .0036821
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}
</snip>
According to stata help, results are stored in several matrices.
To get the p-values, I'm using these command:
The p value is simply p=c/n in the tabled results. It is 0.002, and
0.000. Increase your number of permutations if you want something more
exact than 2/1000, or 0/1000. Try 10,000 permutations, instead of
1000. You should try to report an exact probability. This is
especially important if you are running many SNPs and later want to
apply a multiple test correction procedure. There, 1x10-6 or 1x10-7
matters a lot.
The seed is necessary for replication. If you don't set the seed you
will get chance variation in your results, because permute will be
using different random numbers; setting the seed makes permute use the
same random numbers. It will not affect your statistical conclusions.
Most people will seed the random number generator.
-Dave
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/