From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: conditional SE of y|X in glm
Date: Tue, 24 Apr 2012 12:29:20 +0100
With this model, as with every other, you have to decide what you mean by "prediction", i.e. on what scale you are predicting.

Also, I did write "I like to have such measures accessible for comparing -glm- results with those of other models in which rmse appears naturally." and I think logit models are stretching the point.

In essence, what -glmcorr- does in your example is either wrong or irrelevant, depending on your point of view.

-glmcorr- can be reconciled with those results by doing instead

. gen fraction = r/n

. glm fraction ldose , link(logit)

Iteration 0:   log likelihood =   3.345982
Iteration 1:   log likelihood =  3.7166249
Iteration 2:   log likelihood =  3.7245648
Iteration 3:   log likelihood =   3.724566
Iteration 4:   log likelihood =   3.724566

Generalized linear models                          No. of obs      =        24
Optimization     : ML                              Residual df     =        22
                                                   Scale parameter =  .0468293
Deviance         =  1.030244611                    (1/df) Deviance =  .0468293
Pearson          =  1.030244611                    (1/df) Pearson  =  .0468293

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = ln(u/(1-u))              [Logit]

                                                   AIC             = -.1437138
Log likelihood   =  3.724566043                    BIC             = -68.88694

------------------------------------------------------------------------------
             |                 OIM
    fraction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ldose |   22.43087   5.627079     3.99   0.000       11.402    33.45974
       _cons |  -40.34087   10.10823    -3.99   0.000    -60.15264   -20.52909
------------------------------------------------------------------------------

. glmcorr

fraction and predicted

    Correlation     0.800
    R-squared       0.640
    Root MSE        0.216

. di sqrt(e(dispers))
.21640079

However, that would lose some of the information in the data.

Otherwise, -glmcorr- uses what -predict- produces by default; if that's wrong for your problem, so will the results be.
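As a check on why the last two numbers agree: with the default Gaussian family the deviance is just the residual sum of squares on the fraction scale, so the dispersion is that sum divided by the residual df. The arithmetic below uses only figures from the output above. The symbols are notation introduced here (y_i for the observed fraction, \hat\mu_i for the fitted fraction, n observations, k estimated coefficients), and the sketch assumes that -glmcorr- divides by the residual df, which the agreement suggests but the output does not state.

```latex
\begin{align*}
\hat\phi &= \frac{D}{n-k}
          = \frac{\sum_{i=1}^{n} (y_i - \hat\mu_i)^2}{n-k}
          = \frac{1.030244611}{22}
          \approx 0.0468293, \\
\text{Root MSE} &= \sqrt{\hat\phi} \approx 0.2164, \\
R^2 &= 0.800^2 = 0.640 .
\end{align*}
```

The same shortcut should not be expected to hold for the binomial fit of r on ldose, where the deviance is not a sum of squared residuals and predictions are on the scale of counts rather than fractions.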
Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Marco Ventura
Sent: 24 April 2012 10:29
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: conditional SE of y|X in glm

Dear Nick,

thank you very much for your quick replies. Unfortunately there is something I still do not understand. If I do

use http://www.stata-press.com/data/r10/beetle
glm r ldose, fam(bin n) link (logit)
di sqrt(e(dispers))
glmcorr

I get two very different values, 4.065 against 13.179. Which of the two is correct?

Thank you again.
Marco

On 24/04/2012 10:57, Nick Cox wrote:
> See -glmcorr- (SSC) for one approach here. That calculates an rmse
> which appears similar, if not identical, to what you want. I like to
> have such measures accessible for comparing -glm- results with those
> of other models in which rmse appears naturally. Perhaps it is a
> comfort blanket, but there you go.
>
> Note that putting a constant into a variable is usually overkill as
>
> di sqrt(e(dispers))
>
> does the calculation. Use a scalar or local macro if you want to store
> the value.
>
> On Tue, Apr 24, 2012 at 9:31 AM, Marco Ventura<mventura@istat.it> wrote:
>
>> from a GLM estimate I want to retrieve the conditional standard error of y
>> given the covariates. If I do
>>
>> gen sigma=sqrt(e(dispers))
>>
>> do I always get the right thing independently of any family and link?
>> Should I correct it by sqrt(e(dispers)* (_N-1)/_N)?
>> And do you think I should instead use the Pearson residuals such as
>>
>> gen sigma=sqrt(e(dispers_p))
>>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/