st: RE: risk ratio

From	Joseph Hilbe <[email protected]>
To	[email protected]
Subject	st: RE: risk ratio
Date	Sat, 20 Mar 2010 13:01:47 -0700

I have an article coming out in the next Stata Journal that details how
to create synthetic models for a wide variety of discrete response
regression models. For your problem, though, I think the best approach
is to create a synthetic binary logistic model with a single predictor,
as you specified, and then model the otherwise logistic data as Poisson
with a robust variance estimator. The coefficient must be exponentiated;
it can then be interpreted as a risk ratio (relative risk).
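
To see why the exponentiated coefficient reads as a risk ratio, here is a short derivation under the log-link working model. The symbols \beta_0, \beta_1 and x are notation introduced here for illustration and do not come from the post.

```latex
\log \Pr(y = 1 \mid x) = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\frac{\Pr(y = 1 \mid x + 1)}{\Pr(y = 1 \mid x)}
  = \frac{e^{\beta_0 + \beta_1 (x + 1)}}{e^{\beta_0 + \beta_1 x}}
  = e^{\beta_1}
```

Because the data below are generated from a logistic model, the log-linear form holds only approximately, so exp(\beta_1) is probably best read as an average risk ratio per unit increase in x1.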

Below is code to create a simple binary logistic model, which is then
modeled as described above. You asked for a continuous pseudo-random
variate, so I generated it from a normal distribution. I normally prefer
pseudo-random uniform variates to normal variates when creating these
types of models, but it usually makes little difference. Recall that
without a seed the results will differ each time the code is run. If you
want the same results each time, set a seed. I used my birthday.

I hope that this is what you were looking for.

Joseph Hilbe

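clear
set obs 50000
set seed 1230
* continuous predictor: standard normal variate
gen x1 = invnorm(runiform())
* linear predictor on the logit scale
gen xb = 2 + 0.75*x1
* inverse logit: probability that by = 1
gen exb = 1/(1+exp(-xb))
* Bernoulli (0/1) response
gen by = rbinomial(1, exb)
* logistic fit; should recover the coefficients 2 and 0.75
glm by x1, nolog fam(bin 1)
* Poisson fit with robust standard errors; eform reports exp(coefficient)
glm by x1, nolog fam(poi) eform robust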


. glm by x1, nolog fam(bin 1)
Generalized linear models                          No. of obs      =     50000
Optimization     : ML                              Residual df     =     49998
                                                   Scale parameter =         1
Deviance         =  37672.75548                    (1/df) Deviance =  .7534852
Pearson          =  49970.46961                    (1/df) Pearson  =  .9994494
Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
                                                   AIC             =  .7535351
Log likelihood   = -18836.37774                    BIC             = -503294.5
------------------------------------------------------------------------------
             |                 OIM
          by |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7534291   .0143134    52.64   0.000     .7253754    .7814828
       _cons |   1.993125   .0149177   133.61   0.000     1.963887    2.022363
------------------------------------------------------------------------------


. glm by x1, nolog fam(poi) eform robust
Generalized linear models                          No. of obs      =     50000
Optimization     : ML                              Residual df     =     49998
                                                   Scale parameter =         1
Deviance         =  12673.60491                    (1/df) Deviance =  .2534822
Pearson          =   7059.65518                    (1/df) Pearson  =  .1411988
Variance function: V(u) = u                        [Poisson]
Link function    : g(u) = ln(u)                    [Log]
                                                   AIC             =  1.970592
Log pseudolikelihood = -49262.80246                BIC             = -528293.7
------------------------------------------------------------------------------
             |               Robust
          by |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.104476   .0021613    50.78   0.000     1.100248     1.10872
------------------------------------------------------------------------------
.
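
The question quoted below asks for data in which exposure is proportional to log(risk) rather than log(odds). As a complement to the approach above, here is a minimal sketch, not part of the original post, that generates the response directly on the log-risk scale. The coefficients -2 and 0.1 and the variable names p and y are illustrative assumptions, chosen so that exp(xb) stays below 1 for a standard normal x1 in a sample of this size.

```stata
clear
set obs 50000
set seed 1230
* continuous exposure: standard normal variate
gen x1 = rnormal()
* linear predictor on the log-risk scale (hypothetical coefficients)
gen xb = -2 + 0.1*x1
* risk = exp(xb); a valid probability only while it stays below 1
gen p = exp(xb)
assert p < 1
* Bernoulli (0/1) outcome with risk p
gen y = rbinomial(1, p)
* Poisson with log link and robust standard errors; eform reports exp(coefficient)
glm y x1, nolog fam(poi) eform robust
```

With data generated this way, the exponentiated coefficient on x1 should estimate exp(0.1), the risk ratio per unit increase in x1. If larger coefficients are wanted, the assert line guards against risks of 1 or more.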

Tomas Lind wrote:
Does anyone know how to generate fake data for a dichotomous outcome (0, 1)
that is dependent on a continuous exposure variable in an epidemiological
relative risk context? I know how to use the logit transformation, but in
that case exposure is proportional to log(odds) and not to risk.
