Dear all,
The following question is not entirely related to Stata, but I will be
really grateful if you could suggest any tip to answer it using Stata.
I am trying to figure out which analysis was done in a text book of
application of logistic regression in genetic studies. The aim is to know
which model was applied to be able to learn it.
According to the book, the model was fit by logistic regression using the
GLIM program. The GOF of each model was assessed by scaled deviance and
AIC.
I could only reproduce succesfully only the final result, since it is a
simple logistic regression model:
. xi: logistic status i.exposure [freq=count]
It reproduces exactly what is presented in the book:
Logistic regression Number of obs = 2112
LR chi2(2) = 6.77
Prob > chi2 = 0.0338 Log likelihood = -1460.1621
Pseudo R2 = 0.0023
----------------------------------------------------------------------
status | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+--------------------------------------------------------
_Iexposure_1 | 1.099477 .1666565 0.63 0.532 .8168884 1.4798
_Iexposure_2 | 1.347291 .2037921 1.97 0.049 1.001632 1.8122
----------------------------------------------------------------------
The objective of such analysis is to know which coding (genetic model)
fits better the data (coding cases_1\constrols_1 cases_2\constrols_2 and
cases_3\constrols_3 as 1, 0.5, and 0 or 1,0, and 1).
According to the book, the best model will be that that uses the coding 1,
0.5, and 0 (or 2,1 and 0), which produces the following results:
Scaled deviance df AIC
Constant 52.209 32 -11.791
Cons+coding 1,0.5,0 45.731 31 -16.269
The data is shown below.
Thank you for any help.
All the best,
Tiago
*----------- begin data -------------
input cases_1 cases_2 cases_3 controls_1 controls_2 controls_3 strata
74 44 23 51 75 13 1
26 23 4 23 30 8 2
59 42 10 49 42 9 3
34 36 6 27 22 4 4
37 41 7 28 35 15 5
35 33 8 37 40 9 6
33 23 10 38 46 13 7
48 35 8 48 35 7 8
54 45 8 50 40 8 9
62 66 9 59 57 9 10
57 69 7 33 56 20 11
end
rename cases_1 a0
rename cases_2 b0
rename cases_3 c0
rename controls_1 a1
rename controls_2 b1
rename controls_3 c1
reshape long a b c, i(strata) j(status)
gen id = _n
rename a count2
rename b count1
rename c count0
reshape long count, i(id) j(exposure)
replace status= cond(status==1,0,1)
*----------- end data --------------
--
Tiago V. Pereira
Heart Institute - InCor
Laboratory of Genetics and Molecular Cardiology,
Department of Biochemistry and Molecular Biology,
Federal University of S�o Paulo.
Av. Dr. Eneas de Carvalho Aguiar, 44;
Cerqueira C�sar - CEP 05403-000
Sao Paulo, SP Brazil.
Tel./fax: +55 11 3069 5068
email: [email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/