Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
st: SV: RE: risk ratio

From	"Tomas Lind" <[email protected]>
To	<[email protected]>
Subject	st: SV: RE: risk ratio
Date	Mon, 22 Mar 2010 16:36:16 +0100
Hi Joseph,

Thanks for your kind response. 

I am working with Stata v10 (but we are going to upgrade to v11 soon). You
find my code below to generate data according to a logit model.

In the first example I generate data with an odds-model. The
beta-coefficient used to generate data is 0.49. 

When analyzing these data with logistic regression I get my beta-coefficient
(0.50 in this run). When analyzing data with a Poisson-model, beta is
estimated to 0.069. I suppose this is because a Poisson-model is measuring
RR not OR.

clear *
set obs 100000			
* Expo is logNf  mean=16,2 sd=8,2 
gen pm10 = exp(2.66 + 0.49 * invnorm(uniform()))		
generate z=(-11.2 + (0.5*(pm10) ))   

generate p_case=(1/(1+exp(-z)))		// p_case=0.2

generate case=0		
replace case=1 if(uniform()<p_case & p_case !=.)

glm fall pm10 , link(logit) fam(bin)		// beta = 0.50
glm fall pm10 , link(log)   fam(poi)  robust	// beta = 0,069



In example 2 I generate data with the -genbinomial- with a log link to
generate data where exposure is proportional to risk. In this case Poisson
regression gives me the correct beta but the logistic regression does not. 

clear *
set obs 200000
gen x1 = invnorm(uniform())
gen x2 = invnorm(uniform())
gen xb = -1 + 0.5*x1 + 1.5*x2
genbinomial y, xbeta(xb) n(1)  link(log)		// link(LOG)

rename y  case
// genbinomial might give values outside 0, 1. 

drop  if case==.				// p(case)=0.3

glm case x1 x2 , link(log) fam(po) vce(robust) // OK beta1=0,50   beta2=1,51
glm case x1 x2 , link(logit) fam(bin) 	// Wrong beta1=0,82   beta2=2,44



Yours

Tomas





-----Ursprungligt meddelande-----
Från: [email protected]
[mailto:[email protected]] För [email protected]
Skickat: den 21 mars 2010 15:52
Till: [email protected]
Ämne: st: RE: risk ratio

In response to the StataLister asking about creating a synthetic binary 
response model that
can be used to estimate a relative risk ratio:

I have an article coming out in the next Stata Journal that details how 
to create synthetic models for a wide  variety
of discrete response regression models. For your problem though, I 
think that the best approach is to create a synthetic
binary logistic model with a single predictor - as you specified. Then 
model the otherwise logistic data as
Poisson with a robust variance estimator. And the coefficient must be 
exponentiated. It can be interpreted as a relative
risk ratio.

Below is code to create a simple binary logistic model. Then model as 
mentioned above. You asked for a
continuous pseudo-random variate, so I generated it from a normal 
distribution. I normally like to use pseudo-random
uniform variates rather normal variates when creating these types of 
models, but it usually makes little difference.
Recall that without a seed the model results will differ each time run. 
If you want the same results, pick a seed. I used my birthday.

I hope that this is what you were looking for.

Joseph Hilbe

*  intercept = 2;  Beta for X1=0.75
clear
set obs 50000
set seed 1230
gen x1 = invnorm(runiform())
gen xb = 2 + 0.75*x1
gen exb = 1/(1+exp(-xb))
gen by = rbinomial(1, exb)
glm by x1, nolog fam(bin 1)
glm by x1, nolog fam(poi) eform robust


. glm by x1, nolog fam(bin 1)
Generalized linear models                          No. of obs      = 
50000
Optimization     : ML                              Residual df      = 
49998
                                                  Scale parameter = 1
Deviance         =  37672.75548                    (1/df) Deviance = 
.7534852
Pearson          =  49970.46961                    (1/df) Pearson  = 
.9994494
Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
                                                   AIC             = 
.7535351
Log likelihood   = -18836.37774                    BIC             = 
-503294.5
-------------------------------------------------------------------------
-----
            |                 OIM
          by |      Coef.   Std. Err.      z    P>|z|     [95% Conf.  
Interval]
-------------+-----------------------------------------------------------
-----
          x1 |   .7534291   .0143134    52.64   0.000     .7253754  
.7814828
       _cons |   1.993125   .0149177   133.61   0.000     1.963887   
2.022363
-------------------------------------------------------------------------
-----


. glm by x1, nolog fam(poi) eform robust
Generalized linear models                          No. of obs      
=50000
Optimization     : ML                              Residual df     = 
49998
                                                  Scale parameter = 1
Deviance         =  12673.60491                    (1/df) Deviance = 
.2534822
Pearson          =   7059.65518                    (1/df) Pearson  = 
.1411988
Variance function: V(u) = u                        [Poisson]
Link function    : g(u) = ln(u)                    [Log]
                                                   AIC             = 
1.970592
Log pseudolikelihood = -49262.80246                BIC             = 
-528293.7
-------------------------------------------------------------------------
-----
            |               Robust
          by |        IRR   Std. Err.      z    P>|z|     [95% Conf. 
Interval]
-------------+-----------------------------------------------------------
-----
          x1 |   1.104476   .0021613    50.78   0.000     1.100248 
1.10872
-------------------------------------------------------------------------
-----
.
Tomas Lind wrote:
Does anyone know how to generate fake data for a dichotomous outcome 
(0, 1)
that is dependent on a continuous exposure variable in an 
epidemiological
relative risk context. I know how to use the logit transformation but in
that case exposure is proportional to log(ods) and not to risk.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
References:
- st: RE: risk ratio
  - From: [email protected]
Prev by Date: st: AW: RE: re: question on XTOVERID
Next by Date: st: RE: triprobit convergence problem
Previous by thread: st: RE: risk ratio
Next by thread: st: mlogit iterations not concave
Index(es):
- Date
- Thread