|  | 
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
Re: st: permutations
On Mar 17, 2008, at 10:55 AM, Christopher Intemann wrote:
Hi Dave,
Am 17.03.2008 um 14:53 schrieb David Airey:
.
I calculated a test of a difference between any two SNP genotypes.  
You are saying you want to know which pairwise comparisons are  
different. If you just want those two probabilities (genotype 2 =  
genotype 0, genotype 1 = genotype 0), you get these by adding a  
couple lines to our code, and modifying the permute command as  
below. To get the other comparison (genotype 2 = genotype 1), you  
would need to use a different base comparison, or you would need to  
use the command lincom.
***do_begin***
describe
table age caco snp
program plogistic, rclass
	version 10
	args caco snp age
	xi: logistic `caco' i.`snp' i.`age'
	test _Isnp_1 _Isnp_2
	return scalar chi2 = r(chi2)
	test _Isnp_1
	return scalar chi2_1 = r(chi2)
	test _Isnp_2
	return scalar chi2_2 = r(chi2)
end
set seed 3132008
permute caco chi2=r(chi2) chi2_1=r(chi2_1) chi2_2=r(chi2_2), ///
	reps(1000): plogistic caco snp age
***do_end***
Monte Carlo permutation results                   Number of obs    
=       2394
    command:  plogistic caco snp age
       chi2:  r(chi2)
     chi2_1:  r(chi2_1)
     chi2_2:  r(chi2_2)
permute var:  caco
------------------------------------------------------------------------------
T            |     T(obs)       c       n   p=c/n   SE(p) [95%  
Conf. Interval]
------------- 
+----------------------------------------------------------------
      chi2 |   .1560796     920    1000  0.9200  0.0086  . 
9014202    .936058
    chi2_1 |   .1184664     724    1000  0.7240  0.0141  . 
6951618   .7515127
    chi2_2 |   .1448991     719    1000  0.7190  0.0142   . 
690024   .7466803
------------------------------------------------------------------------------
Note:  confidence intervals are with respect to p=c/n.
Note:  c = #{|T| >= |T(obs)|}
.
Thank very much you for your efforts.
Your solution does exactly what I was looking for; figured it out  
already:-)
The only thing is that I now have some trouble obtaining the full p- 
value.
<snip>
Monte Carlo permutation results                   Number of obs    
=       4356
     command:  plogistic affected_stata snp age ethno_x sex_x
      chi2_1:  r(chi2_1)
      chi2_2:  r(chi2_2)
 permute var:  affected_stata
------------------------------------------------------------------------------
T            |     T(obs)       c       n   p=c/n   SE(p) [95% Conf.  
Interval]
------------- 
+----------------------------------------------------------------
     chi2_1 |   10.44501       2    1000  0.0020  0.0014  . 
0002423   .0072058
     chi2_2 |   16.76662       0    1000  0.0000  0.0000          
0   .0036821
------------------------------------------------------------------------------
Note:  confidence intervals are with respect to p=c/n.
Note:  c = #{|T| >= |T(obs)|}
</snip>
According to stata help, results are stored in several matrices.
To get the p-values, I'm using these command:
The p value is simply p=c/n in the tabled results. It is 0.002, and  
0.000. Increase your number of permutations if you want something more  
exact than 2/1000, or 0/1000. Try 10,000 permutations, instead of  
1000. You should try to report an exact probability. This is  
especially important if you are running many SNPs and later want to  
apply a multiple test correction procedure. There, 1x10-6 or 1x10-7  
matters a lot.
The seed is necessary for replication. If you don't set the seed you  
will get chance variation in your results, because permute will be  
using different random numbers; setting the seed makes permute use the  
same random numbers. It will not affect your statistical conclusions.  
Most people will seed the random number generator.
-Dave
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/