Title | Stata 5: Pseudo R2 in weibull
Author | William Sribney, StataCorp
The weibull command differs from some other ML commands in that the first iteration does not correspond to the constant-only model. For weibull (unlike ereg), there is no simple, quick way to compute the constant-only model; it has to be fitted iteratively.
In the following sample output, iterations 0–3 compute the constant-only model:
    Iteration 0:  Log Likelihood = -61.34299  Log(sigma)= 0
    Iteration 1:  Log Likelihood = -60.65666  Log(sigma)= -.2060869
    Iteration 2:  Log Likelihood = -60.62434  Log(sigma)= -.1859658
    Iteration 3:  Log Likelihood = -60.62402  Log(sigma)= -.1882137
The remaining iterations compute the full model, starting from the constant-only model. Note that the log likelihood of iteration 4 is the same as that of iteration 3:
    Iteration 4:  Log Likelihood = -60.62402  Log(sigma)= -.1882429
    Iteration 5:  Log Likelihood = -50.56099  Log(sigma)= -.2763068
    Iteration 6:  Log Likelihood = -47.38802  Log(sigma)= -.0988737
    Iteration 7:  Log Likelihood = -44.35852  Log(sigma)= -.6532183
    Iteration 8:  Log Likelihood =  -42.8621  Log(sigma)= -.5385912
    Iteration 9:  Log Likelihood = -42.66388  Log(sigma)= -.5630554
    Iteration 10: Log Likelihood = -42.66284  Log(sigma)= -.5639635

    Weibull regression (log relative hazard form)    Number of obs  =      48
    Sigma           =   0.569                        Model chi2(2)  =  35.922
    Std Err(Sigma)  =   0.074                        Prob > chi2    =  0.0000
    Log Likelihood  = -42.663                        Pseudo R2      =  0.3132
Thus the model chi2 is 2*(-42.66284 - (-60.62402)) = 35.92236, which the output reports as 35.922.
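The arithmetic can be checked outside Stata. The following Python sketch uses only the two log likelihoods shown in the sample output; the variable names are illustrative, and the p-value line relies on the fact that, for 2 degrees of freedom, the upper tail of the chi-squared distribution is exp(-x/2).

```python
import math

# Log likelihoods copied from the sample output.
ll_constant_only = -60.62402   # iteration 3 (constant-only model)
ll_full = -42.66284            # iteration 10 (full model)

# Likelihood-ratio statistic: twice the gain in log likelihood.
model_chi2 = 2 * (ll_full - ll_constant_only)

# Upper-tail probability for 2 degrees of freedom, where the
# chi-squared survival function reduces to exp(-x/2).
p_value = math.exp(-model_chi2 / 2)

print(f"Model chi2(2) = {model_chi2:.3f}")
print(f"Prob > chi2   = {p_value:.4f}")
```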
Concerning the pseudo-R2, for ML models with discrete outcomes, we use the formula
pseudo-R2 = 1 - L1/L0
where L0 and L1 are the constant-only and full model log likelihoods respectively. For discrete outcomes, the log likelihood is the log of a probability, so it is never positive. For continuous outcomes, the log likelihood is the log of a density. Since density functions can be greater than 1 (for example, a normal density with a small standard deviation, evaluated at its mean), the log likelihood can be positive or negative. Thus the formula 1 - L1/L0 could give a value greater than 1! This happens whenever L0 is negative and L1 is positive.
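To make the problem concrete, here is a small Python sketch for a normal (continuous) outcome. It is an illustration only, not part of the weibull calculation: the sample size and the two residual standard deviations are made-up numbers, and normal_max_loglik is a hypothetical helper that evaluates the normal log likelihood at its ML estimates.

```python
import math

def normal_max_loglik(n, sigma_hat):
    """Maximized log likelihood of n normal observations whose ML
    estimate of the standard deviation is sigma_hat."""
    return -0.5 * n * (math.log(2 * math.pi * sigma_hat ** 2) + 1)

n = 10                            # made-up sample size
l0 = normal_max_loglik(n, 1.0)    # constant-only model, residual sd 1.0 (made up)
l1 = normal_max_loglik(n, 0.1)    # full model, residual sd 0.1 (made up)

# L0 is negative and L1 is positive, so the ratio L1/L0 is negative
# and 1 - L1/L0 exceeds 1.
pseudo_r2 = 1 - l1 / l0
print(l0, l1, pseudo_r2)
```

With a residual standard deviation of 0.1, the fitted density at each observation is typically above 1, so the full-model log likelihood turns positive while the constant-only one stays negative.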
Thus there are no standard formulas for pseudo-R2s for continuous ML models. So as a rough guide for model fitting, we have devised ad hoc formulas for some commands like weibull that take advantage of the form of the likelihood. For weibull, the formula is
pseudo R2 = 1 - sigma/sigma0
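Here sigma is the estimate from the full model and sigma0 is the estimate from the constant-only model. The reasoning behind this choice of a pseudo R2 is as follows. The Weibull model can be written as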
y = a + x*beta + sigma*e
where y = log(time) and e is an error term. Hence, the more terms one includes in the model, the smaller the estimate of sigma. Because the full-model estimate of sigma is positive and does not exceed sigma0, it follows that 0 <= pseudo R2 <= 1.
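The reported Pseudo R2 can be reproduced from the iteration log. The Python sketch below assumes that sigma0 is exp(Log(sigma)) at the last constant-only iteration (iteration 3) and that sigma is exp(Log(sigma)) at the final iteration (iteration 10); the variable names are illustrative.

```python
import math

# Log(sigma) values copied from the sample iteration log.
log_sigma0 = -0.1882137   # iteration 3: constant-only model
log_sigma = -0.5639635    # iteration 10: full model

sigma0 = math.exp(log_sigma0)
sigma = math.exp(log_sigma)

# Ad hoc pseudo R2 for weibull: proportional reduction in sigma.
pseudo_r2 = 1 - sigma / sigma0

print(f"Sigma     = {sigma:.3f}")
print(f"Pseudo R2 = {pseudo_r2:.4f}")
```

Under these assumptions, the rounded values agree with the Sigma and Pseudo R2 lines of the sample output.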
We do not intend that our pseudo R2 should be reported in formal write-ups of results. The idea of a pseudo R2 came from economists who wanted some rough measure of explanatory power of the model. So it’s really just a guide for fitting models. A small pseudo R2 should make one humble about the model's explanatory ability, but a big pseudo R2 should not be taken as something necessarily wonderful.
Regarding the log likelihoods from the examples in the manual, they are indeed positive. The maximum of the likelihood is at a point where the density of the joint distribution is > 1.