I may be missing details here, but I doubt
you need -file-.
It's not clear if these genotypes are
held as strings or as numbers with value labels.
Let's skirt round that.
If the genotypes are the same then -gene2- to -gene4-
are all equal to -gene1-, but not otherwise.
Thus
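* count how many of gene2-gene4 differ from gene1
gen different = 0
forval j = 2/4 {
    replace different = different + (gene`j' != gene1)
}
* write out only the observations with at least one difference
outfile study_id gene? using discordant.txt if different, replace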
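As a check on the logic, here is a minimal sketch that types in the five example rows from the question and counts, row by row, how many of -gene2- to -gene4- differ from -gene1-. It assumes the genotypes are held as strings, which the question does not confirm; the comparison itself would work the same way on numeric variables.

```stata
* Sketch only: the string storage types below are an assumption
clear
input str3 study_id str2 gene1 str2 gene2 str2 gene3 str2 gene4
"a-1" "ww" "ww" "ww" "vv"
"a-1" "vv" "ww" "ww" "vv"
"b-1" "ww" "vv" "ww" "vv"
"c-1" "ww" "ww" "vv" "vv"
"c-1" "ww" "ww" "vv" "vv"
end

* different = number of gene2-gene4 that are not equal to gene1
gen different = 0
forval j = 2/4 {
    replace different = different + (gene`j' != gene1)
}

* inspect the flag before writing anything to disk
list study_id gene? different
```

In these example rows every observation has at least one gene that differs from -gene1-, so all five would be written out; on real data only the mixed rows should survive the -if different- condition.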
Nick
Richard Aplenc wrote:
> I am working with a dataset that has 4 genotypes (gene1, gene2, gene3,
> gene4), and each genotype may have discordant values for a particular
> study subject, i.e.
>
> study_id  gene1  gene2  gene3  gene4
> a-1       ww     ww     ww     vv
> a-1       vv     ww     ww     vv
> b-1       ww     vv     ww     vv
> c-1       ww     ww     vv     vv
> c-1       ww     ww     vv     vv
>
>
> Statalist (i.e. Nick) had previously helped me identify genotypes that
> are discordant for one gene for one individual. Now I would like to
> write only those discordant genotypes out to a .txt file so that I can
> take it back to the lab, i.e.
>
> study_id  gene   genotype
> a-1       gene1  ww
> a-1       gene1  vv
>
> I can do this using a temporary file (see below), but I suspect that
> using a temporary file is not the most appropriate/elegant way to do
> this. I'd be most appreciative of anyone's educational efforts on my
> behalf. If you go to AACR or ASH, I'll buy you a beer/coffee
> there. :)
>
> Thanks in advance.
>
> local genotypes "gene1 gene2 gene3 gene4"
> file open discordant using discordant.txt, write replace
> file write discordant
> foreach gene of local genotypes {
>     preserve
>     tempfile data
>     save "`data'"
>     keep if disc_`gene' == 1
>     file write discordant "Study ID" _tab "Gene" _tab "Genotype" _n
>     local N = _N
>     sort study_id `gene'
>     forvalues i = 1/`N' {
>         file write discordant (study_id[`i']) _tab "`gene'" _tab (`gene'[`i']) _n
>     }
>     restore
> }
> file close discordant