st: RE: risk ratio
From: [email protected]
To: [email protected]
Subject: st: RE: risk ratio
Date: Sun, 21 Mar 2010 10:51:47 -0400

In response to the StataLister asking about creating a synthetic binary
response model that can be used to estimate a relative risk ratio:

I have an article coming out in the next Stata Journal that details how
to create synthetic models for a wide variety of discrete response
regression models. For your problem, though, I think the best approach
is to create a synthetic binary logistic model with a single predictor,
as you specified, and then model the otherwise logistic data as Poisson
with a robust variance estimator. The coefficient must be exponentiated;
it can then be interpreted as a relative risk ratio.

Below is code to create a simple binary logistic model and then fit it
as described above. You asked for a continuous pseudo-random variate, so
I generated it from a normal distribution. I normally prefer
pseudo-random uniform variates to normal variates when creating these
types of models, but it usually makes little difference.

Recall that without a seed the model results will differ each time the
code is run. If you want the same results, pick a seed. I used my
birthday.

I hope that this is what you were looking for.

Joseph Hilbe
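
* Synthetic binary logistic data: intercept = 2; beta for x1 = 0.75
clear
set obs 50000
set seed 1230
* Standard normal predictor
gen x1 = invnorm(runiform())
* Linear predictor
gen xb = 2 + 0.75*x1
* Inverse logit: exb holds the probability that by = 1
gen exb = 1/(1+exp(-xb))
* Bernoulli response drawn with that probability
gen by = rbinomial(1, exb)
* Logistic fit: should recover the intercept and slope
glm by x1, nolog fam(bin 1)
* Poisson fit with robust standard errors; eform exponentiates the coefficient
glm by x1, nolog fam(poi) eform robust
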
. glm by x1, nolog fam(bin 1)

Generalized linear models                          No. of obs      =     50000
Optimization     : ML                              Residual df     =     49998
                                                   Scale parameter =         1
Deviance         =  37672.75548                    (1/df) Deviance =  .7534852
Pearson          =  49970.46961                    (1/df) Pearson  =  .9994494

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]

                                                   AIC             =  .7535351
Log likelihood   = -18836.37774                    BIC             = -503294.5

------------------------------------------------------------------------------
             |                 OIM
          by |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7534291   .0143134    52.64   0.000     .7253754    .7814828
       _cons |   1.993125   .0149177   133.61   0.000     1.963887    2.022363
------------------------------------------------------------------------------

. glm by x1, nolog fam(poi) eform robust

Generalized linear models                          No. of obs      =     50000
Optimization     : ML                              Residual df     =     49998
                                                   Scale parameter =         1
Deviance         =  12673.60491                    (1/df) Deviance =  .2534822
Pearson          =   7059.65518                    (1/df) Pearson  =  .1411988

Variance function: V(u) = u                        [Poisson]
Link function    : g(u) = ln(u)                    [Log]

                                                   AIC             =  1.970592
Log pseudolikelihood = -49262.80246                BIC             = -528293.7

------------------------------------------------------------------------------
             |               Robust
          by |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.104476   .0021613    50.78   0.000     1.100248     1.10872
------------------------------------------------------------------------------

Tomas Lind wrote:
Does anyone know how to generate fake data for a dichotomous outcome
(0, 1) that is dependent on a continuous exposure variable in an
epidemiological relative risk context? I know how to use the logit
transformation, but in that case exposure is proportional to log(odds)
and not to risk.
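
The question above asks for data in which the exposure acts on risk
rather than on odds. One possible way to get that, offered only as a
sketch and not as the method used in the reply, is to make the log of the
risk linear in the exposure. The coefficients (-2 and 0.5), the uniform
exposure, and the variable names x1, p, and y are all illustrative
assumptions; they are chosen so that every probability stays below 1.

```stata
* Hypothetical sketch: log(risk), not log(odds), is linear in the exposure.
* The values -2 and 0.5 and the names x1, p, y are assumptions.
clear
set obs 50000
set seed 1230
* Exposure on (0,1), so the largest risk is exp(-1.5), well below 1
gen x1 = runiform()
* Risk under a log link: P(y = 1) = exp(-2 + 0.5*x1)
gen p = exp(-2 + 0.5*x1)
* Bernoulli outcome drawn with that risk
gen y = rbinomial(1, p)
* Poisson fit with robust standard errors; eform reports exp(coefficient)
glm y x1, nolog fam(poi) link(log) eform robust
```

With data built this way the risk ratio per unit of x1 is constant by
construction and equals exp(0.5), so the exponentiated coefficient can be
compared against that value directly.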