Re: st: moptimize gf evaluator
From: Klaus Pforr <[email protected]>
To: [email protected]
Subject: Re: st: moptimize gf evaluator
Date: Tue, 16 Nov 2010 16:00:59 +0100
Thank you for the extensive and helpful answer. I'm still waiting for my
copy of the new edition of "ML with Stata".
Am I correct that by adding the derivatives of the panel-level log
likelihood with respect to the beta coefficients, I enable my evaluator
to correctly handle the robust, cluster, and svy options? My understanding
from the 2006 edition of "ML with Stata" and the help files is that I need
the observation-level scores for that.
I want to implement a modified conditional logit estimator, and I want
it to handle the cluster option (robust and weights are not that
important, but they seem to be processable once you specify the scores,
which you need for the cluster option anyway). Therefore it seems
necessary to implement the evaluator as a gf2-type rather than a d2-type
evaluator. I need the gradient and the Hessian to gain speed and
precision, as the likelihood has a computationally complex denominator.
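Concretely, and only as a sketch of what I have in mind (mylogit_gf2() is
the evaluator I still have to write, and y, x1, x2, and cvar are placeholder
names), I read the -moptimize()- help as saying the setup would look roughly
like this:

    // sketch only: mylogit_gf2() is a hypothetical gf2 evaluator
    mata:
    M = moptimize_init()
    moptimize_init_evaluator(M, &mylogit_gf2())
    moptimize_init_evaluatortype(M, "gf2")
    moptimize_init_depvar(M, 1, "y")
    moptimize_init_eq_indepvars(M, 1, "x1 x2")
    // if I read the help right, these two calls turn the scores returned
    // by the gf2 evaluator into a cluster-robust VCE; the cluster variable
    // must not split the groups that define the likelihood contributions
    moptimize_init_cluster(M, "cvar")
    moptimize_init_vcetype(M, "robust")
    moptimize(M)
    moptimize_result_display(M)
    end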
Klaus
On 16 Nov 2010, 15:34, Jeff Pitblado, StataCorp LP, wrote:
Klaus Pforr <[email protected]> is using Mata's -moptimize()- function
for fitting a panel-data model and asks about the -gf2- evaluator type:
I posted this already a few days ago, but apparently phrased my question
in too complicated a way to receive any answers. Is it true that the
gf-type evaluator for -moptimize()- expects a score matrix of dimension
L x K, with L being the number of independent elements and K being the
number of coefficients? I do not understand how I can derive such a
matrix with panel data, for which this evaluator type is recommended. Is
this a matrix of dL[j]/dbeta[i], with L[j] being the log-likelihood
contribution of panel group j, derived with respect to beta coefficient i?
The short answer is yes.
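In symbols, just to restate the question as an answer: with k independent
groups (here, panels) and K coefficients, a gf-type evaluator fills in a
k x K score matrix S whose elements are

\[
S_{ji} \;=\; \frac{\partial \ln L_j}{\partial b_i},
\qquad j = 1,\dots,k, \quad i = 1,\dots,K,
\]

where ln L_j is the log-likelihood contribution of panel j.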
The following discussion shows how to implement a panel-data model using
-moptimize()-.
This kind of thing is difficult to discuss without a model to implement as
an example, so we'll use the random-effects linear regression model.
This model is described in the current -ml- book [see Gould et al. (2010),
chapter 15, section 6, page 268], and the derivatives are included in the
model description. The -ml- book exhibits the Stata code that implements the
-d2- evaluator for this model. We've adapted this code into a Mata function
called -rereg_gf2()- that is suitable as a -gf2- evaluator for -moptimize()-.
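For reference, the panel-level log likelihood that -rereg_gf2()- returns in
-lnfj- is (restating what the code below computes, with z_{jt} = y_{jt} -
x_{jt}b, T_j the number of observations in panel j, and sigma_u and sigma_e
the two scale parameters; cf. Gould et al. 2010, ch. 15.6):

\[
\ln L_j = -\frac{1}{2}\left[
\frac{\sum_t z_{jt}^2 - a_j\bigl(\sum_t z_{jt}\bigr)^2}{\sigma_e^2}
+ \ln\!\left(\frac{T_j\,\sigma_u^2}{\sigma_e^2} + 1\right)
+ T_j \ln\bigl(2\pi\sigma_e^2\bigr)
\right],
\qquad
a_j = \frac{\sigma_u^2}{T_j\,\sigma_u^2 + \sigma_e^2}
\]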
Other than translating the Stata code into Mata, we made the following two key
changes:
1. return the log-likelihood values at the panel level
2. return the derivatives of the panel-level log-likelihood values with
respect to the coefficients (each regression coefficient and the two scale
parameters)
Here is the Mata code for -rereg_gf2()-:
***** BEGIN: rereg_gf2.mata
version 11
mata:
void rereg_gf2( scalar M,
real scalar todo,
real vector b,
real vector lnfj,
real matrix S,
real matrix H)
{
real scalar s2_u
real scalar s2_e
real matrix info
real scalar k
real scalar i
real matrix sub
real vector z
real vector S_z2
real vector Sz_2
real vector T
real vector a
real matrix X
real scalar Xobs
real matrix subX
real matrix part
real vector S_z
real vector d2xb1
real vector d2xb2
real vector d2xb
real vector d2u
real vector d2e
real vector dxbdu
real vector dxbde
real vector dude
real vector lnf
// random effects scale parameter
s2_u = exp(2*moptimize_util_xb(M,b,2)) // scalar
// residual error scale parameter
s2_e = exp(2*moptimize_util_xb(M,b,3)) // scalar
// use the panel identifier to get the information we need to perform
// calculations at the panel level -- the data is assumed already
// sorted
info = panelsetup(moptimize_init_userinfo(M,1), 1)
// k is the number of panels
k = rows(info)
z = moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
S_z2 = J(k,1,.)
Sz_2 = J(k,1,.)
T = J(k,1,.)
a = J(k,1,.)
// compute the log-likelihood values for each panel
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z2[i] = cross(sub,sub)
Sz_2[i] = quadsum(sub)^2
T[i] = info[i,2] - info[i,1] + 1
a[i] = s2_u / (T[i]*s2_u + s2_e)
}
lnfj = -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
ln(T :* s2_u/s2_e :+ 1) :+
T:*ln(2*c("pi")*s2_e))
if (todo == 0 | missing(lnfj)) return
// compute the scores -- derivative of the above log-likelihood values
// with respect to each coefficient in the model
X = moptimize_util_indepvars(M, 1)
Xobs = (rows(X)> 1)
S = J(k,cols(b),.)
S_z = J(k,1,.)
part = moptimize_util_eq_indices(M,1)'
subX = 1
// derivatives with respect to the regression coefficients
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z[i] = quadsum(sub)
if (Xobs) {
subX = panelsubmatrix(X,i,info)
}
part[1,1] = part[2,1] = i
S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
}
// derivatives with respect to ln(s_u)
part = moptimize_util_eq_indices(M,2)[2,2]
S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
// derivatives with respect to ln(s_e)
part++
S[.,part] = S_z2 :/ s2_e :-
a :* Sz_2 :/ s2_e :-
(a:^2) :* Sz_2 :/ s2_u :-
T :+ 1 :- a :* s2_e/s2_u
if (todo == 1 | missing(S)) return
// compute the Hessian
lnf = 1
d2u = colsum( 2*a:^2:*Sz_2:/s2_u :-
4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
2*T:*a:^2*s2_e/s2_u )
d2e = colsum( 2*(S_z2 :- a :* Sz_2)/s2_e :-
2*a:^2:*Sz_2/s2_u :-
4*a:^3:*Sz_2*s2_e/s2_u^2 :+
2*a*s2_e/s2_u :-
2*a:^2*s2_e^2/s2_u^2 )
dude = colsum( 4*a:^3:*Sz_2*s2_e/s2_u^2 :-
2*T:*a:^2*s2_e/s2_u )
dxbdu = J(rows(z),1,.)
dxbde = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
part[1,1] = info[i,1]
part[2,1] = info[i,2]
dxbdu[|part|] = J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
dxbde[|part|] = 2*(sub :- a[i]*S_z[i])/s2_e :-
2*a[i]^2*S_z[i]/s2_u
}
dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)
d2xb2 = moptimize_util_matsum(M, 1, 1, 1, lnf)
sub = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
part[1,1] = info[i,1]
part[2,1] = info[i,2]
sub[|part|] = J(T[i],1,0)
sub[info[i,2]] = a[i]
}
d2xb1 = moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
d2xb = (d2xb2 - d2xb1)/s2_e
H = -(d2xb, dxbdu, dxbde \
dxbdu', d2u, dude \
dxbde', dude, d2e)
}
end
exit
***** END: rereg_gf2.mata
To show that the above evaluator works properly, here is an example that
fits a model using -moptimize()- with -rereg_gf2()- and compares the results
with -xtreg, mle-.
Notice that we -include- the Mata source for 'rereg_gf2.mata' at the top of
our example; we need to define -rereg_gf2()- in Mata before we can start
making references to it. The -clear mata- command allows us to rerun our
example without Mata complaining that -rereg_gf2()- is already defined.
***** BEGIN: example.do
clear mata
include rereg_gf2.mata
use http://www.stata-press.com/data/ml4/tablef7-1
// starting values for the constant-only model
sum lnC, mean
scalar xb = r(mean)
quietly oneway lnC i
local np = r(df_m) + 1
local N = r(N) + 1
local bms = r(mss)/r(df_m) // between mean squares
local wms = r(rss)/r(df_r) // within mean squares
scalar lns_u = log( (`bms'-`wms')*`np'/`N' )/2
scalar lns_e = log(`wms')/2
sort i
mata:
// setup for the constant-only model
M = moptimize_init()
moptimize_init_evaluator(M,&rereg_gf2())
moptimize_init_evaluatortype(M, "gf2")
moptimize_init_valueid(M, "log-likelihood")
// variable that identifies the panels
st_view(panel=., ., "i")
moptimize_init_userinfo(M, 1, panel)
moptimize_init_by(M, panel) // needed by -moptimize_util_matbysum()-
// setup for the equations, including starting values
moptimize_init_eq_n(M, 3)
moptimize_init_eq_name(M, 1, "xb")
moptimize_init_eq_name(M, 2, "lns_u")
moptimize_init_eq_name(M, 3, "lns_e")
moptimize_init_depvar(M, 1, "lnC")
moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))
moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))
moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))
// we've supplied the starting values, so we turn searching off
moptimize_init_search(M, "off")
// fit the constant-only model
moptimize(M)
moptimize_result_display(M)
// now use the resulting fit as starting values for another model that
// contains independent variables
b0 = moptimize_result_coefs(M)
moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")
moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))
moptimize_init_eq_coefs(M, 2, b0[2])
moptimize_init_eq_coefs(M, 3, b0[3])
moptimize(M)
moptimize_result_display(M)
// store the fitted coefficients in a Stata matrix so we can compare with the
// results from -xtreg, mle-
st_matrix("myb", moptimize_result_coefs(M))
end
xtset i
xtreg lnC lnQ lnPF , mle
matrix b = e(b)
local dim = colsof(b)
// transform the scale parameters to compare with -rereg_gf2()-'s results
matrix b[1,`dim'-1] = ln(b[1,`dim'-1])
matrix b[1,`dim'] = ln(b[1,`dim'])
// the maximum relative difference between the two model fits should be small
di mreldif(b, myb)
***** END: example.do
Here is a log of the output from running the above example:
***** BEGIN: example.log
. include rereg_gf2.mata
.
. version 11
.
. mata:
------------------------------------------------- mata (type end to exit) -----
:
: void rereg_gf2( scalar M,
real scalar todo,
real vector b,
real vector lnfj,
real matrix S,
real matrix H)
{
real scalar s2_u
real scalar s2_e
real matrix info
real scalar k
real scalar i
real matrix sub
real vector z
real vector S_z2
real vector Sz_2
real vector T
real vector a
real matrix X
real scalar Xobs
real matrix subX
real matrix part
real vector S_z
real vector d2xb1
real vector d2xb2
real vector d2xb
real vector d2u
real vector d2e
real vector dxbdu
real vector dxbde
real vector dude
real vector lnf
// random effects scale parameter
s2_u = exp(2*moptimize_util_xb(M,b,2)) // scalar
// residual error scale parameter
s2_e = exp(2*moptimize_util_xb(M,b,3)) // scalar
// use the panel identifier to get the information we need to perform
// calculations at the panel level -- the data is assumed already
// sorted
info = panelsetup(moptimize_init_userinfo(M,1), 1)
// k is the number of panels
k = rows(info)
z = moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
S_z2 = J(k,1,.)
Sz_2 = J(k,1,.)
T = J(k,1,.)
a = J(k,1,.)
// compute the log-likelihood values for each panel
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z2[i] = cross(sub,sub)
Sz_2[i] = quadsum(sub)^2
T[i] = info[i,2] - info[i,1] + 1
a[i] = s2_u / (T[i]*s2_u + s2_e)
}
lnfj = -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
ln(T :* s2_u/s2_e :+ 1) :+
T:*ln(2*c("pi")*s2_e))
if (todo == 0 | missing(lnfj)) return
// compute the scores -- derivative of the above log-likelihood values
// with respect to each coefficient in the model
X = moptimize_util_indepvars(M, 1)
Xobs = (rows(X)> 1)
S = J(k,cols(b),.)
S_z = J(k,1,.)
part = moptimize_util_eq_indices(M,1)'
subX = 1
// derivatives with respect to the regression coefficients
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z[i] = quadsum(sub)
if (Xobs) {
subX = panelsubmatrix(X,i,info)
}
part[1,1] = part[2,1] = i
S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
}
// derivatives with respect to ln(s_u)
part = moptimize_util_eq_indices(M,2)[2,2]
S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
// derivatives with respect to ln(s_e)
part++
S[.,part] = S_z2 :/ s2_e :-
a :* Sz_2 :/ s2_e :-
(a:^2) :* Sz_2 :/ s2_u :-
T :+ 1 :- a :* s2_e/s2_u
if (todo == 1 | missing(S)) return
// compute the Hessian
lnf = 1
d2u = colsum( 2*a:^2:*Sz_2:/s2_u :-
4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
2*T:*a:^2*s2_e/s2_u )
d2e = colsum( 2*(S_z2 :- a :* Sz_2)/s2_e :-
2*a:^2:*Sz_2/s2_u :-
4*a:^3:*Sz_2*s2_e/s2_u^2 :+
2*a*s2_e/s2_u :-
2*a:^2*s2_e^2/s2_u^2 )
dude = colsum( 4*a:^3:*Sz_2*s2_e/s2_u^2 :-
2*T:*a:^2*s2_e/s2_u )
dxbdu = J(rows(z),1,.)
dxbde = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
part[1,1] = info[i,1]
part[2,1] = info[i,2]
dxbdu[|part|] = J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
dxbde[|part|] = 2*(sub :- a[i]*S_z[i])/s2_e :-
2*a[i]^2*S_z[i]/s2_u
}
dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)
d2xb2 = moptimize_util_matsum(M, 1, 1, 1, lnf)
sub = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
part[1,1] = info[i,1]
part[2,1] = info[i,2]
sub[|part|] = J(T[i],1,0)
sub[info[i,2]] = a[i]
}
d2xb1 = moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
d2xb = (d2xb2 - d2xb1)/s2_e
H = -(d2xb, dxbdu, dxbde \
dxbdu', d2u, dude \
dxbde', dude, d2e)
}
:
: mata mosave rereg_gf2(), replace
(file rereg_gf2.mo created)
: end
-------------------------------------------------------------------------------
. exit
.
. use http://www.stata-press.com/data/ml4/tablef7-1
.
. // starting values for the constant-only model
. sum lnC, mean
. scalar xb = r(mean)
. quietly oneway lnC i
. local np = r(df_m) + 1
. local N = r(N) + 1
. local bms = r(mss)/r(df_m) // between mean squares
. local wms = r(rss)/r(df_r) // within mean squares
. scalar lns_u = log( (`bms'-`wms')*`np'/`N' )/2
. scalar lns_e = log(`wms')/2
.
. sort i
.
. mata:
------------------------------------------------- mata (type end to exit) -----
:
: // setup for the constant-only model
: M = moptimize_init()
: moptimize_init_evaluator(M,&rereg_gf2())
: moptimize_init_evaluatortype(M, "gf2")
: moptimize_init_valueid(M, "log-likelihood")
:
: // variable that identifies the panels
: st_view(panel=., ., "i")
: moptimize_init_userinfo(M, 1, panel)
: moptimize_init_by(M, panel) // needed by -moptimize_util_matbysum()-
:
: // setup for the equations, including starting values
: moptimize_init_eq_n(M, 3)
: moptimize_init_eq_name(M, 1, "xb")
: moptimize_init_eq_name(M, 2, "lns_u")
: moptimize_init_eq_name(M, 3, "lns_e")
: moptimize_init_depvar(M, 1, "lnC")
: moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))
: moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))
: moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))
:
: // we've supplied the starting values, so we turn searching off
: moptimize_init_search(M, "off")
:
: // fit the constant-only model
: moptimize(M)
Iteration 0: log-likelihood = -103.47286
Iteration 1: log-likelihood = -103.43139
Iteration 2: log-likelihood = -103.4311
Iteration 3: log-likelihood = -103.4311
: moptimize_result_display(M)
Number of obs = 90
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb |
_cons | 13.36561 .3718818 35.94 0.000 12.63673 14.09448
-------------+----------------------------------------------------------------
lns_u |
_cons | -.1124866 .2999833 -0.37 0.708 -.7004432 .4754699
-------------+----------------------------------------------------------------
lns_e |
_cons | -.3790205 .0771517 -4.91 0.000 -.530235 -.227806
------------------------------------------------------------------------------
:
: // now use the resulting fit as starting values for another model that
: // contains independent variables
: b0 = moptimize_result_coefs(M)
: moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")
: moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))
: moptimize_init_eq_coefs(M, 2, b0[2])
: moptimize_init_eq_coefs(M, 3, b0[3])
: moptimize(M)
Iteration 0: log-likelihood = -103.4311 (not concave)
Iteration 1: log-likelihood = -36.516249 (not concave)
Iteration 2: log-likelihood = 25.743727 (not concave)
Iteration 3: log-likelihood = 38.3557 (not concave)
Iteration 4: log-likelihood = 52.292856 (not concave)
Iteration 5: log-likelihood = 62.34649 (not concave)
Iteration 6: log-likelihood = 70.57777 (not concave)
Iteration 7: log-likelihood = 80.016528
Iteration 8: log-likelihood = 98.680215
Iteration 9: log-likelihood = 102.04372
Iteration 10: log-likelihood = 102.05915
Iteration 11: log-likelihood = 102.05915
: moptimize_result_display(M)
Number of obs = 90
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb |
lnQ | .8649719 .0264724 32.67 0.000 .8130869 .9168569
lnPF | .3993113 .0146384 27.28 0.000 .3706205 .4280021
_cons | 9.282005 .2179675 42.58 0.000 8.854796 9.709213
-------------+----------------------------------------------------------------
lns_u |
_cons | -2.131847 .295808 -7.21 0.000 -2.71162 -1.552074
-------------+----------------------------------------------------------------
lns_e |
_cons | -2.680509 .0771648 -34.74 0.000 -2.831749 -2.529268
------------------------------------------------------------------------------
:
: // store the fitted coefficients in a Stata matrix so we can compare with the
: // results from -xtreg, mle-
: st_matrix("myb", moptimize_result_coefs(M))
:
: end
-------------------------------------------------------------------------------
.
. xtset i
panel variable: i (balanced)
. xtreg lnC lnQ lnPF , mle
Fitting constant-only model:
Iteration 0: log likelihood = -3679.9546
Iteration 1: log likelihood = -2042.3896
Iteration 2: log likelihood = -1132.4148
Iteration 3: log likelihood = -631.57621
Iteration 4: log likelihood = -360.64085
Iteration 5: log likelihood = -218.61016
Iteration 6: log likelihood = -148.36245
Iteration 7: log likelihood = -117.25507
Iteration 8: log likelihood = -106.17927
Iteration 9: log likelihood = -103.65887
Iteration 10: log likelihood = -103.43395
Iteration 11: log likelihood = -103.4311
Fitting full model:
Iteration 0: log likelihood = 101.61076
Iteration 1: log likelihood = 102.03853
Iteration 2: log likelihood = 102.05913
Iteration 3: log likelihood = 102.05915
Random-effects ML regression Number of obs = 90
Group variable: i Number of groups = 6
Random effects u_i ~ Gaussian Obs per group: min = 15
avg = 15.0
max = 15
LR chi2(2) = 410.98
Log likelihood = 102.05915 Prob> chi2 = 0.0000
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lnQ | .8649719 .0264724 32.67 0.000 .813087 .9168569
lnPF | .3993113 .0146384 27.28 0.000 .3706205 .4280021
_cons | 9.282005 .2179674 42.58 0.000 8.854796 9.709213
-------------+----------------------------------------------------------------
/sigma_u | .1186181 .0350882 .0664291 .2118083
/sigma_e | .0685283 .005288 .0589098 .0797173
rho | .7497583 .114912 .4861774 .9165317
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 101.26 Prob>=chibar2 = 0.000
. matrix b = e(b)
. local dim = colsof(b)
. // transform the scale parameters to compare with -rereg_gf2()-'s results
. matrix b[1,`dim'-1] = ln(b[1,`dim'-1])
. matrix b[1,`dim'] = ln(b[1,`dim'])
.
. // the maximum relative difference between the two model fits should be small
. di mreldif(b, myb)
5.354e-08
***** END: example.log
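One more tip while developing an evaluator like this: the -moptimize()-
documentation lists debug variants of the evaluator types, which compare the
analytic gradient (and Hessian) with numerically computed ones at each
iteration, much like -ml-'s d1debug/d2debug methods. A minimal sketch,
assuming the gf2 debug type is spelled "gf2debug" as in the evaluatortype
list:

    // temporarily declare the evaluator with the debug type so that
    // moptimize() reports how close the analytic derivatives are to
    // numerical ones; switch back to "gf2" once they agree
    moptimize_init_evaluatortype(M, "gf2debug")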
Reference:
Gould, W., J. Pitblado, and B. Poi. 2010. Maximum Likelihood Estimation with
Stata. 4th ed. College Station, TX: Stata Press.
--Jeff
[email protected]
--
__________________________________
Klaus Pforr
MZES AB - A
Universität Mannheim
D - 68131 Mannheim
Tel: +49-621-181 2801 (afternoons)
fax: +49-621-181 2803
URL: http://www.mzes.uni-mannheim.de
Visitor address: A5, Room A312
__________________________________
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/