Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Tomas Lind" <tomas.lind@ki.se> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: SV: RE: risk ratio |
Date | Mon, 22 Mar 2010 16:36:16 +0100 |
Hi Joseph, Thanks for your kind response. I am working with Stata v10 (but we are going to upgrade to v11 soon). You find my code below to generate data according to a logit model. In the first example I generate data with an odds-model. The beta-coefficient used to generate data is 0.49. When analyzing these data with logistic regression I get my beta-coefficient (0.50 in this run). When analyzing data with a Poisson-model, beta is estimated to 0.069. I suppose this is because a Poisson-model is measuring RR not OR. clear * set obs 100000 * Expo is logNf mean=16,2 sd=8,2 gen pm10 = exp(2.66 + 0.49 * invnorm(uniform())) generate z=(-11.2 + (0.5*(pm10) )) generate p_case=(1/(1+exp(-z))) // p_case=0.2 generate case=0 replace case=1 if(uniform()<p_case & p_case !=.) glm fall pm10 , link(logit) fam(bin) // beta = 0.50 glm fall pm10 , link(log) fam(poi) robust // beta = 0,069 In example 2 I generate data with the -genbinomial- with a log link to generate data where exposure is proportional to risk. In this case Poisson regression gives me the correct beta but the logistic regression does not. clear * set obs 200000 gen x1 = invnorm(uniform()) gen x2 = invnorm(uniform()) gen xb = -1 + 0.5*x1 + 1.5*x2 genbinomial y, xbeta(xb) n(1) link(log) // link(LOG) rename y case // genbinomial might give values outside 0, 1. drop if case==. // p(case)=0.3 glm case x1 x2 , link(log) fam(po) vce(robust) // OK beta1=0,50 beta2=1,51 glm case x1 x2 , link(logit) fam(bin) // Wrong beta1=0,82 beta2=2,44 Yours Tomas -----Ursprungligt meddelande----- Från: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] För jhilbe@aol.com Skickat: den 21 mars 2010 15:52 Till: statalist@hsphsun2.harvard.edu Ämne: st: RE: risk ratio In response to the StataLister asking about creating a synthetic binary response model that can be used to estimate a relative risk ratio: I have an article coming out in the next Stata Journal that details how to create synthetic models for a wide variety of discrete response regression models. For your problem though, I think that the best approach is to create a synthetic binary logistic model with a single predictor - as you specified. Then model the otherwise logistic data as Poisson with a robust variance estimator. And the coefficient must be exponentiated. It can be interpreted as a relative risk ratio. Below is code to create a simple binary logistic model. Then model as mentioned above. You asked for a continuous pseudo-random variate, so I generated it from a normal distribution. I normally like to use pseudo-random uniform variates rather normal variates when creating these types of models, but it usually makes little difference. Recall that without a seed the model results will differ each time run. If you want the same results, pick a seed. I used my birthday. I hope that this is what you were looking for. Joseph Hilbe * intercept = 2; Beta for X1=0.75 clear set obs 50000 set seed 1230 gen x1 = invnorm(runiform()) gen xb = 2 + 0.75*x1 gen exb = 1/(1+exp(-xb)) gen by = rbinomial(1, exb) glm by x1, nolog fam(bin 1) glm by x1, nolog fam(poi) eform robust . glm by x1, nolog fam(bin 1) Generalized linear models No. of obs = 50000 Optimization : ML Residual df = 49998 Scale parameter = 1 Deviance = 37672.75548 (1/df) Deviance = .7534852 Pearson = 49970.46961 (1/df) Pearson = .9994494 Variance function: V(u) = u*(1-u) [Bernoulli] Link function : g(u) = ln(u/(1-u)) [Logit] AIC = .7535351 Log likelihood = -18836.37774 BIC = -503294.5 ------------------------------------------------------------------------- ----- | OIM by | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+----------------------------------------------------------- ----- x1 | .7534291 .0143134 52.64 0.000 .7253754 .7814828 _cons | 1.993125 .0149177 133.61 0.000 1.963887 2.022363 ------------------------------------------------------------------------- ----- . glm by x1, nolog fam(poi) eform robust Generalized linear models No. of obs =50000 Optimization : ML Residual df = 49998 Scale parameter = 1 Deviance = 12673.60491 (1/df) Deviance = .2534822 Pearson = 7059.65518 (1/df) Pearson = .1411988 Variance function: V(u) = u [Poisson] Link function : g(u) = ln(u) [Log] AIC = 1.970592 Log pseudolikelihood = -49262.80246 BIC = -528293.7 ------------------------------------------------------------------------- ----- | Robust by | IRR Std. Err. z P>|z| [95% Conf. Interval] -------------+----------------------------------------------------------- ----- x1 | 1.104476 .0021613 50.78 0.000 1.100248 1.10872 ------------------------------------------------------------------------- ----- . Tomas Lind wrote: Does anyone know how to generate fake data for a dichotomous outcome (0, 1) that is dependent on a continuous exposure variable in an epidemiological relative risk context. I know how to use the logit transformation but in that case exposure is proportional to log(ods) and not to risk. * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/