|
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
Re: st: permutations
.
Here's an example using Stata 9/10 syntax. With permutations, you have
to be careful about how the data are permuted. Sometimes they have to
be permuted in a way that respects the structure (strata) of the data.
Read texts by Good to familiarize with these issues. Here I'm ignoring
any issues like that.
Stata makes the general permutations very easy to do. You first define
a program to grab a statistic associated with the test you want, and
then you use that program in a permute command.
***do_begin***
describe
table age caco snp
program plogistic, rclass
version 10
args caco snp age
xi: logistic `caco' i.`snp' i.`age'
test _Isnp_1 _Isnp_2
return scalar chi2 = r(chi2)
end
set seed 3132008
permute caco chi2=r(chi2), reps(1000): plogistic caco snp age
***do_end***
results pasted:
. describe
Contains data from js_person_1.dta
obs: 2,394
vars: 3 10 Dec 2007 10:56
size: 40,698 (99.7% of memory free)
--------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------
caco byte %8.0g caco Group
age float %9.0g RECODE of age_group
snp float %9.0g C
--------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
. table age caco snp
--------------------------------------------------------------------
| C and Group
RECODE of | ------- 0 ------ ------- 1 ------ ------- 2 ------
age_group | Control Case Control Case Control Case
----------+---------------------------------------------------------
35 | 39 24 90 75 90 57
40 | 37 48 103 119 92 103
45 | 30 42 109 111 80 88
50 | 21 33 70 61 42 61
55 | 22 12 65 58 43 31
60 | 29 21 52 49 59 50
--------------------------------------------------------------------
.
. program plogistic, rclass
1. version 10
2. args caco snp age
3. xi: logistic `caco' i.`snp' i.`age'
4. test _Isnp_1 _Isnp_2
5. return scalar chi2 = r(chi2)
6. end
.
. set seed 3132008
.
. permute caco chi2=r(chi2), reps(1000): plogistic caco snp age
(running plogistic on estimation sample)
Permutation replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500
.................................................. 550
.................................................. 600
.................................................. 650
.................................................. 700
.................................................. 750
.................................................. 800
.................................................. 850
.................................................. 900
.................................................. 950
.................................................. 1000
Monte Carlo permutation results Number of obs
= 2394
command: plogistic caco snp age
chi2: r(chi2)
permute var: caco
------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf.
Interval]
-------------
+----------------------------------------------------------------
chi2 | .1560796 920 1000 0.9200 0.0086 .
9014202 .936058
------------------------------------------------------------------------------
Note: confidence interval is with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}
.
.
.
end of do-file
The p values for the genotype test was 0.9249, not much different.
. xi: logistic caco i.snp i.age
i.snp _Isnp_0-2 (naturally coded; _Isnp_0 omitted)
i.age _Iage_35-60 (naturally coded; _Iage_35
omitted)
Logistic regression Number of obs
= 2116
LR chi2(7)
= 21.17
Prob > chi2
= 0.0035
Log likelihood = -1455.9021 Pseudo R2
= 0.0072
------------------------------------------------------------------------------
caco | Odds Ratio Std. Err. z P>|z| [95% Conf.
Interval]
-------------
+----------------------------------------------------------------
_Isnp_1 | .9580097 .1193996 -0.34 0.731 .7503816
1.223088
_Isnp_2 | .9524632 .1218649 -0.38 0.703 .7412068
1.223931
_Iage_40 | 1.63372 .2251451 3.56 0.000
1.247017 2.14034
_Iage_45 | 1.545441 .216941 3.10 0.002 1.173722
2.034885
_Iage_50 | 1.634428 .2582141 3.11 0.002 1.199194
2.227627
_Iage_55 | 1.09121 .184666 0.52 0.606 .7831755
1.520399
_Iage_60 | 1.202247 .1956237 1.13 0.258 .8739581
1.653854
------------------------------------------------------------------------------
. test _Isnp_1 _Isnp_2
( 1) _Isnp_1 = 0
( 2) _Isnp_2 = 0
chi2( 2) = 0.16
Prob > chi2 = 0.9249
On Mar 14, 2008, at 5:44 AM, Christopher Intemann wrote:
Hi Dave
Am 13.03.2008 um 15:24 schrieb David Airey:
.
Do you have access to the manual help or online help? The syntax is:
permute permvar exp_list [, options] : command
You specified permvar but not exp_list, so I'm not sure what
permute is defaulting to, maybe just the overall model statistic?
You can write a small program to return the coefficients for the
genetic dummy variables, and an overall test of those, that gets
called by permute. The ANOVA example in the printed help shows how
to do this.
T is your statistic
T(obs) is your observed statistic
c is the number of times the statistics is greater than T(obs)
n is the default number of permutations since you didn't specify
this either
Below is an example from ATS UCLA (Stata version 9/10):
* read your data into Stata
use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
generate goodread = (read > 60)
* run logistic predicting "goodread" from "write"
logit goodread female
* run permutation test for above with 10000 repetitions
permute female x2=e(chi2), reps(10000) nodots: logit goodread female
Since you want a statistic for the individual dummy variables and
the test for both genetic dummy variables, you don't just want the
results from the overall test, and you cannot just collect that
using permute as the above example does.
I was using a pre-stata9 syntax of the permute command.
However, a command like
xi:permute affected_stata x2=e(chi2):logistic affected_stata i.gene
age ethno_x sex_x
does calculate some permutations, but is pretty sure not what I
want:-(
As i could calculate my p-values from the matrices e(b) and e(V),
which are generated by logistic, i would rather like to use these
in the exp_list. However, this does not work:
"matrix operators that return matrices not allowed in this context"
Is there any other way to permute or calculate the p-values?
Just similar to what other programs for genetic analysis (like
unphased, plink or haploview) do?
I just came along cvpermute, maybe this could do the trick?
Many thanks in advance,
Chris
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
--
David C. Airey, Ph.D.
Pharmacology Research Assistant Professor
Center for Human Genetics Research Member
Department of Pharmacology
School of Medicine
Vanderbilt University
Rm 8158A Bldg MR3
465 21st Avenue South
Nashville, TN 37232-8548
TEL (615) 936-1510
FAX (615) 936-3747
EMAIL [email protected]
URL http://people.vanderbilt.edu/~david.c.airey/dca_cv.pdf
URL http://www.vanderbilt.edu/pharmacology
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/