Dear All,
I have 50 data sets (d1, d2, ..., d50) like the one given below, but
much larger. For each data set, I would like to calculate the prevalence
of protein by age and genotype (AA and SS), and then combine the results
in a single file. Can this be done?
DATA SET 1 (d1)
id age genotype protein
31 11 AA 1
40 11 SS 0
71 11 AA 1
74 11 AA 0
88 11 AA 0
98 11 AA 0
110 11 SS 1
The first two observations in the RESULTS file should look something like this:
age genotype prevalence
11 AA 0.4
11 SS 0.5
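
To make the intended calculation concrete: prevalence here is the mean of the 0/1 protein indicator within each age and genotype group. Below is a minimal sketch of that arithmetic in Python (not Stata), using only the seven sample rows above. The names d1, datasets, and prevalence_by_group are hypothetical, chosen for illustration only.

```python
from collections import defaultdict

# Sample rows from data set d1: (id, age, genotype, protein)
d1 = [
    (31, 11, "AA", 1),
    (40, 11, "SS", 0),
    (71, 11, "AA", 1),
    (74, 11, "AA", 0),
    (88, 11, "AA", 0),
    (98, 11, "AA", 0),
    (110, 11, "SS", 1),
]

def prevalence_by_group(rows):
    """Return {(age, genotype): share of rows with protein == 1}."""
    totals = defaultdict(lambda: [0, 0])  # [sum of protein, number of rows]
    for _id, age, genotype, protein in rows:
        totals[(age, genotype)][0] += protein
        totals[(age, genotype)][1] += 1
    return {key: s / n for key, (s, n) in sorted(totals.items())}

# Assumption: in practice this list would hold all 50 data sets (d1 ... d50).
datasets = [d1]

# Stack the per-data-set results into one combined list of rows.
results = []
for rows in datasets:
    for (age, genotype), prev in prevalence_by_group(rows).items():
        results.append((age, genotype, prev))

for age, genotype, prev in results:
    print(age, genotype, prev)
```

For the sample rows this gives 2/5 = 0.4 for AA and 1/2 = 0.5 for SS, matching the two result lines above.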
Thanks in advance
Raphael