--Alfonso Miranda<[email protected]> wrote:
> While doing an extension of the code weibhet_glfa I
> found what I believe is a mistake in the
> log-likelihood expression of this code.
[...]
> I have written down the log-likelihood obtaining (in an intermediate
> expression that helps comparison) the expression:
>
> Logl = ln{1+theta*exp[-x'b*p]*t^(p)}^(-(1/theta + d))
> + ln{exp[-x'b*p]*p*t^(p-1)}
> (3)
>
> Where x is the vector of observed characteristics and
> b is its corresponding vector of coefficients. In the
> weibhet_glfa code the log-likelihood is written as
>
> Logl = ln{1+theta*exp[-x'b*p]*t^(p)}^(-(1/theta + d))
> + ln{exp[-x'b*p]*p*t^(p)}
> (4)
[...]
Alfonso noticed the difference between equations (3) and (4) and wondered
whether -weibhet_glfa- contains a mistake.
Short answer:
ML gives the same parameter estimates from either equation (3) or (4). The
program uses equation (4) because, in that form, the value of the
log likelihood stays the same when the scale of survival time changes.
Long answer:
Comparing the two equations, the only difference is a term ln(t). ML finds
the estimates by setting the first derivatives with respect to the parameters
(b, p, and theta) to zero, and ln(t) does not involve any parameter, so it
drops out of those derivatives. The two log-likelihood functions therefore
give the same estimates; only the reported value of the log likelihood
differs.
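Spelled out, using the two expressions exactly as Alfonso quoted them (the first term is common to both and cancels), the difference is a single parameter-free term:

```latex
\begin{aligned}
\text{(4)} - \text{(3)}
  &= \ln\{e^{-x'b\,p}\, p\, t^{p}\} - \ln\{e^{-x'b\,p}\, p\, t^{p-1}\} = \ln t, \\
\frac{\partial \ln t}{\partial b}
  &= \frac{\partial \ln t}{\partial p}
   = \frac{\partial \ln t}{\partial \theta} = 0 .
\end{aligned}
```

Because the score equations are identical for (3) and (4), any maximizer of one is a maximizer of the other.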
In fact, the program is written this way on purpose, so that the value of the
log likelihood is invariant to the scale of survival time. Here is a simple
example with the Weibull model:
. use http://www.stata-press.com/data/r7/cancer, clear
(Patient Survival in Drug Trial)
. stset studytime, fail(died)
(output omitted)
. streg drug age, dist(weibull) nolog
         failure _d:  died
   analysis time _t:  studytime

Weibull regression -- log relative-hazard form

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =          744
                                                   LR chi2(2)      =     35.92
Log likelihood  =   -42.662838                     Prob > chi2     =    0.0000
[...]
Now we divide the time variable _t by 100 and refit the model:
. replace _t = _t/100
_t was byte now float
(48 real changes made)
. streg drug age, dist(weibull) nolog
         failure _d:  died
   analysis time _t:  studytime

Weibull regression -- log relative-hazard form

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =  7.439999977
                                                   LR chi2(2)      =     35.92
Log likelihood  =   -42.662838                     Prob > chi2     =    0.0000
[...]
The value of the log likelihood does not change when the scale of _t changes.
The same trick is used in some other -streg- models. To recover the value of
the log likelihood in the form of equation (3), simply subtract the sum of
ln(t) over the failures from the reported log likelihood:
. gen double lnt = _d*ln(_t)
. summarize lnt if e(sample)
. di e(ll) - r(sum)
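For anyone who wants to check the argument outside Stata, here is a minimal Python sketch (standard library only). It is not the -weibhet_glfa- code. It assumes the failure indicator d multiplies the log-hazard term, in line with the _d*ln(_t) adjustment above, and the survival times, failure indicators, and linear predictors are made-up illustrative values. It checks that (4) minus (3) equals the sum of d*ln(t) at several parameter values, and that dividing t by 100 (with the matching shift of ln(100) in x'b) leaves form (4) unchanged while form (3) moves.

```python
import math

def loglik(times, died, xb, p, theta, form):
    """Sum of per-observation log-likelihood contributions.

    form=3 uses t^(p-1) in the hazard term, form=4 uses t^p.
    Assumption: the failure indicator d multiplies the log-hazard term.
    """
    total = 0.0
    for t, d, eta in zip(times, died, xb):
        cum = math.exp(-eta * p) * t ** p            # exp[-x'b*p] * t^p
        power = p - 1 if form == 3 else p
        log_haz = -eta * p + math.log(p) + power * math.log(t)
        total += -(1 / theta + d) * math.log1p(theta * cum) + d * log_haz
    return total

# Hypothetical data: survival times, failure indicators, linear predictors x'b.
times = [1.0, 4.0, 9.0, 17.0, 25.0, 33.0]
died = [1, 1, 0, 1, 0, 1]
xb = [2.1, 2.4, 2.9, 2.6, 3.2, 3.0]

sum_d_lnt = sum(d * math.log(t) for t, d in zip(times, died))

# The gap between the two forms at three arbitrary parameter settings.
gaps = []
for p, theta, shift in [(0.8, 0.5, 0.0), (1.5, 2.0, 0.3), (2.2, 0.1, -0.7)]:
    eta = [v + shift for v in xb]
    gaps.append(loglik(times, died, eta, p, theta, 4)
                - loglik(times, died, eta, p, theta, 3))

# Rescale time by c; the intercept absorbs ln(c), so x'b shifts by -ln(c).
c = 100.0
times_c = [t / c for t in times]
xb_c = [v - math.log(c) for v in xb]
p, theta = 1.5, 2.0
ll4_orig = loglik(times, died, xb, p, theta, 4)
ll4_resc = loglik(times_c, died, xb_c, p, theta, 4)
ll3_orig = loglik(times, died, xb, p, theta, 3)
ll3_resc = loglik(times_c, died, xb_c, p, theta, 3)
n_fail = sum(died)

print(f"sum of d*ln(t)       : {sum_d_lnt:.6f}")
print(f"(4) - (3) at 3 points: {[round(g, 6) for g in gaps]}")
print(f"form (4), t vs t/100 : {ll4_orig:.6f} vs {ll4_resc:.6f}")
print(f"form (3), t vs t/100 : {ll3_orig:.6f} vs {ll3_resc:.6f}")
```

Under these assumptions, form (3) shifts by the number of failures times ln(100) after rescaling, which is exactly the amount the adjustment above removes or restores.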
Weihua Guan <[email protected]>
Stata Corp.