Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: moptimize gf evaluator
From
[email protected] (Jeff Pitblado, StataCorp LP)
To
[email protected]
Subject
Re: st: moptimize gf evaluator
Date
Tue, 16 Nov 2010 08:34:31 -0600
Klaus Pforr <[email protected]> is using Mata's -moptimize()- function
for fitting a panel-data model, and asks about the -gf2- evaluator type:
> I posted this already a few days ago, but obviously put my question to
> complicated to receive any answers. Is it true, that the gf type
> evaluator for moptimize expects a score matrix of dimension L x K, with
> L being the the number of independet elements and K being the number of
> coefficients? I do not understand how I can derive such a matrix with
> panel data, for which this evaluator type is recommended. Is this a
> matrix of dL[j]/dbeta[i], with L[j] being the log likelihood
> contribution of panel group j derived with respect to beta coeffient i?
The short answer is yes.
The following discussion show's how to implement a panel-data model using
-moptimize()-.
This kind of thing is difficult to discuss without a model to implment as an
example, so we'll use the random-effects linear regression model.
This model is described in the current -ml- book [see Gould et al. (2010),
chapter 15, section 6, page 268], and the derivatives are included in the
model description. The -ml- book exhibits the Stata code that implements the
-d2- evaluator for this model. We've adapted this code into a Mata function
called -rereg_gf2()- that is suitable as a -gf2- evaluator for -moptimize()-.
Other than translating the Stata code into Mata, we made the following two key
changes:
1. return the log-likelihood values at the panel level
2. return the derivatives of the panel-level log-likelihood values with
respect to the coefficients (each regression coefficient and the two scale
parameters)
Here is the Mata code for -rereg_gf2()-:
***** BEGIN: rereg_gf2.mata
version 11
mata:
void rereg_gf2( scalar M,
real scalar todo,
real vector b,
real vector lnfj,
real matrix S,
real matrix H)
{
real scalar s2_u
real scalar s2_e
real matrix info
real scalar k
real scalar i
real matrix sub
real vector z
real vector S_z2
real vector Sz_2
real vector T
real vector a
real matrix X
real scalar Xobs
real matrix subX
real matrix part
real vector S_z
real vector d2xb1
real vector d2xb2
real vector d2xb
real vector d2u
real vector d2e
real vector dxbdu
real vector dxbde
real vector dude
real vector lnf
// random effects scale parameter
s2_u = exp(2*moptimize_util_xb(M,b,2)) // scalar
// residual error scale parameter
s2_e = exp(2*moptimize_util_xb(M,b,3)) // scalar
// use the panel identifier to get the information we need to perform
// calculations at the panel level -- the data is assumed already
// sorted
info = panelsetup(moptimize_init_userinfo(M,1), 1)
// k is the number of panels
k = rows(info)
z = moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
S_z2 = J(k,1,.)
Sz_2 = J(k,1,.)
T = J(k,1,.)
a = J(k,1,.)
// compute the log-likelihood values for each panel
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z2[i] = cross(sub,sub)
Sz_2[i] = quadsum(sub)^2
T[i] = info[i,2] - info[i,1] + 1
a[i] = s2_u / (T[i]*s2_u + s2_e)
}
lnfj = -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
ln(T :* s2_u/s2_e :+ 1) :+
T:*ln(2*c("pi")*s2_e))
if (todo == 0 | missing(lnfj)) return
// compute the scores -- derivative of the above log-likelihood values
// with respect to each coefficient in the model
X = moptimize_util_indepvars(M, 1)
Xobs = (rows(X) > 1)
S = J(k,cols(b),.)
S_z = J(k,1,.)
part = moptimize_util_eq_indices(M,1)'
subX = 1
// derivatives with respect to the regression coefficients
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
S_z[i] = quadsum(sub)
if (Xobs) {
subX = panelsubmatrix(X,i,info)
}
part[1,1] = part[2,1] = i
S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
}
// derivatives with respect to ln(s_u)
part = moptimize_util_eq_indices(M,2)[2,2]
S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
// derivatives with respect to ln(s_e)
part++
S[.,part] = S_z2 :/ s2_e :-
a :* Sz_2 :/ s2_e :-
(a:^2) :* Sz_2 :/ s2_u :-
T :+ 1 :- a :* s2_e/s2_u
if (todo == 1 | missing(S)) return
// compute the Hessian
lnf = 1
d2u = colsum( 2*a:^2:*Sz_2:/s2_u :-
4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
2*T:*a:^2*s2_e/s2_u )
d2e = colsum( 2*(S_z2 :- a :* Sz_2)/s2_e :-
2*a:^2:*Sz_2/s2_u :-
4*a:^3:*Sz_2*s2_e/s2_u^2 :+
2*a*s2_e/s2_u :-
2*a:^2*s2_e^2/s2_u^2 )
dude = colsum( 4*a:^3:*Sz_2*s2_e/s2_u^2 :-
2*T:*a:^2*s2_e/s2_u )
dxbdu = J(rows(z),1,.)
dxbde = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
sub = panelsubmatrix(z, i, info)
part[1,1] = info[i,1]
part[2,1] = info[i,2]
dxbdu[|part|] = J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
dxbde[|part|] = 2*(sub :- a[i]*S_z[i])/s2_e :-
2*a[i]^2*S_z[i]/s2_u
}
dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)
d2xb2 = moptimize_util_matsum(M, 1, 1, 1, lnf)
sub = J(rows(z),1,.)
part = J(2,2,1)
for (i=1; i<=k; i++) {
part[1,1] = info[i,1]
part[2,1] = info[i,2]
sub[|part|] = J(T[i],1,0)
sub[info[i,2]] = a[i]
}
d2xb1 = moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
d2xb = (d2xb2 - d2xb1)/s2_e
H = -(d2xb, dxbdu, dxbde \
dxbdu', d2u, dude \
dxbde', dude, d2e)
}
end
exit
***** END: rereg_gf2.mata
To show that the above evaluator works properly, here is an example that
fits a model using -moptimize()- with -rereg_gf2()- and compares the results
with -xtreg, mle-.
Notice that we -include- the Mata source for 'rereg_gf2.mata' at the top of
our example, we need to define -rereg_gf2()- for Mata before we can start
making references to it. The -clear mata- allows us to rerun our example
without having Mata complain that -rereg_gf2()- is already defined.
***** BEGIN: example.do
clear mata
include rereg_gf2.mata
use http://www.stata-press.com/data/ml4/tablef7-1
// starting values for the constant-only model
sum lnC, mean
scalar xb = r(mean)
quietly oneway lnC i
local np = r(df_m) + 1
local N = r(N) + 1
local bms = r(mss)/r(df_m) // between mean squares
local wms = r(rss)/r(df_r) // within mean squares
scalar lns_u = log( (`bms'-`wms')*`np'/`N' )/2
scalar lns_e = log(`wms')/2
sort i
mata:
// setup for the constant-only model
M = moptimize_init()
moptimize_init_evaluator(M, &rereg_gf2())
moptimize_init_evaluatortype(M, "gf2")
moptimize_init_valueid(M, "log-likelihood")
// variable that identifies the panels
st_view(panel=., ., "i")
moptimize_init_userinfo(M, 1, panel)
moptimize_init_by(M, panel) // needed by -moptimize_util_matbysum()-
// setup for the equations, including starting values
moptimize_init_eq_n(M, 3)
moptimize_init_eq_name(M, 1, "xb")
moptimize_init_eq_name(M, 2, "lns_u")
moptimize_init_eq_name(M, 3, "lns_e")
moptimize_init_depvar(M, 1, "lnC")
moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))
moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))
moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))
// we've supplied the starting values, so we turn searching off
moptimize_init_search(M, "off")
// fit the constant-only model
moptimize(M)
moptimize_result_display(M)
// now use the resulting fit as starting values for another model that
// contains independent variables
b0 = moptimize_result_coefs(M)
moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")
moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))
moptimize_init_eq_coefs(M, 2, b0[2])
moptimize_init_eq_coefs(M, 3, b0[3])
moptimize(M)
moptimize_result_display(M)
// store the fitted coefficients in a Stata matrix so we can compare with the
// results from -xtreg, mle-
st_matrix("myb", moptimize_result_coefs(M))
end
xtset i
xtreg lnC lnQ lnPF , mle
matrix b = e(b)
local dim = colsof(b)
// transform the scale parameters to compare with -rereg_gf2()-'s results
matrix b[1,`dim'-1] = ln(b[1,`dim'-1])
matrix b[1,`dim'] = ln(b[1,`dim'])
// the maximum relative difference between the two model fits should be small
di mreldif(b, myb)
***** END: example.do
Here is a log of the output from running the above example:
***** BEGIN: example.log
. include rereg_gf2.mata
.
. version 11
.
. mata:
------------------------------------------------- mata (type end to exit) -----
:
: void rereg_gf2( scalar M,
> real scalar todo,
> real vector b,
> real vector lnfj,
> real matrix S,
> real matrix H)
> {
> real scalar s2_u
> real scalar s2_e
>
> real matrix info
> real scalar k
> real scalar i
>
> real matrix sub
> real vector z
> real vector S_z2
> real vector Sz_2
> real vector T
> real vector a
> real matrix X
> real scalar Xobs
> real matrix subX
> real matrix part
> real vector S_z
>
> real vector d2xb1
> real vector d2xb2
> real vector d2xb
> real vector d2u
> real vector d2e
> real vector dxbdu
> real vector dxbde
> real vector dude
> real vector lnf
>
> // random effects scale parameter
> s2_u = exp(2*moptimize_util_xb(M,b,2)) // scalar
> // residual error scale parameter
> s2_e = exp(2*moptimize_util_xb(M,b,3)) // scalar
>
> // use the panel identifier to get the information we need to perform
> // calculations at the panel level -- the data is assumed already
> // sorted
> info = panelsetup(moptimize_init_userinfo(M,1), 1)
> // k is the number of panels
> k = rows(info)
>
> z = moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
> S_z2 = J(k,1,.)
> Sz_2 = J(k,1,.)
> T = J(k,1,.)
> a = J(k,1,.)
>
> // compute the log-likelihood values for each panel
> for (i=1; i<=k; i++) {
> sub = panelsubmatrix(z, i, info)
> S_z2[i] = cross(sub,sub)
> Sz_2[i] = quadsum(sub)^2
> T[i] = info[i,2] - info[i,1] + 1
> a[i] = s2_u / (T[i]*s2_u + s2_e)
> }
> lnfj = -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
> ln(T :* s2_u/s2_e :+ 1) :+
> T:*ln(2*c("pi")*s2_e))
> if (todo == 0 | missing(lnfj)) return
>
> // compute the scores -- derivative of the above log-likelihood values
> // with respect to each coefficient in the model
> X = moptimize_util_indepvars(M, 1)
> Xobs = (rows(X) > 1)
> S = J(k,cols(b),.)
> S_z = J(k,1,.)
> part = moptimize_util_eq_indices(M,1)'
> subX = 1
> // derivatives with respect to the regression coefficients
> for (i=1; i<=k; i++) {
> sub = panelsubmatrix(z, i, info)
> S_z[i] = quadsum(sub)
> if (Xobs) {
> subX = panelsubmatrix(X,i,info)
> }
> part[1,1] = part[2,1] = i
> S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
> }
> // derivatives with respect to ln(s_u)
> part = moptimize_util_eq_indices(M,2)[2,2]
> S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
> // derivatives with respect to ln(s_e)
> part++
> S[.,part] = S_z2 :/ s2_e :-
> a :* Sz_2 :/ s2_e :-
> (a:^2) :* Sz_2 :/ s2_u :-
> T :+ 1 :- a :* s2_e/s2_u
> if (todo == 1 | missing(S)) return
>
> // compute the Hessian
> lnf = 1
> d2u = colsum( 2*a:^2:*Sz_2:/s2_u :-
> 4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
> 2*T:*a:^2*s2_e/s2_u )
> d2e = colsum( 2*(S_z2 :- a :* Sz_2)/s2_e :-
> 2*a:^2:*Sz_2/s2_u :-
> 4*a:^3:*Sz_2*s2_e/s2_u^2 :+
> 2*a*s2_e/s2_u :-
> 2*a:^2*s2_e^2/s2_u^2 )
> dude = colsum( 4*a:^3:*Sz_2*s2_e/s2_u^2 :-
> 2*T:*a:^2*s2_e/s2_u )
>
> dxbdu = J(rows(z),1,.)
> dxbde = J(rows(z),1,.)
> part = J(2,2,1)
> for (i=1; i<=k; i++) {
> sub = panelsubmatrix(z, i, info)
> part[1,1] = info[i,1]
> part[2,1] = info[i,2]
> dxbdu[|part|] = J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
> dxbde[|part|] = 2*(sub :- a[i]*S_z[i])/s2_e :-
> 2*a[i]^2*S_z[i]/s2_u
> }
> dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
> dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)
>
> d2xb2 = moptimize_util_matsum(M, 1, 1, 1, lnf)
> sub = J(rows(z),1,.)
> part = J(2,2,1)
> for (i=1; i<=k; i++) {
> part[1,1] = info[i,1]
> part[2,1] = info[i,2]
> sub[|part|] = J(T[i],1,0)
> sub[info[i,2]] = a[i]
> }
> d2xb1 = moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
> d2xb = (d2xb2 - d2xb1)/s2_e
> H = -(d2xb, dxbdu, dxbde \
> dxbdu', d2u, dude \
> dxbde', dude, d2e)
> }
:
: mata mosave rereg_gf2(), replace
(file rereg_gf2.mo created)
: end
-------------------------------------------------------------------------------
. exit
.
. use http://www.stata-press.com/data/ml4/tablef7-1
.
. // starting values for the constant-only model
. sum lnC, mean
. scalar xb = r(mean)
. quietly oneway lnC i
. local np = r(df_m) + 1
. local N = r(N) + 1
. local bms = r(mss)/r(df_m) // between mean squares
. local wms = r(rss)/r(df_r) // within mean squares
. scalar lns_u = log( (`bms'-`wms')*`np'/`N' )/2
. scalar lns_e = log(`wms')/2
.
. sort i
.
. mata:
------------------------------------------------- mata (type end to exit) -----
:
: // setup for the constant-only model
: M = moptimize_init()
: moptimize_init_evaluator(M, &rereg_gf2())
: moptimize_init_evaluatortype(M, "gf2")
: moptimize_init_valueid(M, "log-likelihood")
:
: // variable that identifies the panels
: st_view(panel=., ., "i")
: moptimize_init_userinfo(M, 1, panel)
: moptimize_init_by(M, panel) // needed by -moptimize_util_matbysum()-
:
: // setup for the equations, including starting values
: moptimize_init_eq_n(M, 3)
: moptimize_init_eq_name(M, 1, "xb")
: moptimize_init_eq_name(M, 2, "lns_u")
: moptimize_init_eq_name(M, 3, "lns_e")
: moptimize_init_depvar(M, 1, "lnC")
: moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))
: moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))
: moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))
:
: // we've supplied the starting values, so we turn searching off
: moptimize_init_search(M, "off")
:
: // fit the constant-only model
: moptimize(M)
Iteration 0: log-likelihood = -103.47286
Iteration 1: log-likelihood = -103.43139
Iteration 2: log-likelihood = -103.4311
Iteration 3: log-likelihood = -103.4311
: moptimize_result_display(M)
Number of obs = 90
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb |
_cons | 13.36561 .3718818 35.94 0.000 12.63673 14.09448
-------------+----------------------------------------------------------------
lns_u |
_cons | -.1124866 .2999833 -0.37 0.708 -.7004432 .4754699
-------------+----------------------------------------------------------------
lns_e |
_cons | -.3790205 .0771517 -4.91 0.000 -.530235 -.227806
------------------------------------------------------------------------------
:
: // now use the resulting fit as starting values for another model that
: // contains independent variables
: b0 = moptimize_result_coefs(M)
: moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")
: moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))
: moptimize_init_eq_coefs(M, 2, b0[2])
: moptimize_init_eq_coefs(M, 3, b0[3])
: moptimize(M)
Iteration 0: log-likelihood = -103.4311 (not concave)
Iteration 1: log-likelihood = -36.516249 (not concave)
Iteration 2: log-likelihood = 25.743727 (not concave)
Iteration 3: log-likelihood = 38.3557 (not concave)
Iteration 4: log-likelihood = 52.292856 (not concave)
Iteration 5: log-likelihood = 62.34649 (not concave)
Iteration 6: log-likelihood = 70.57777 (not concave)
Iteration 7: log-likelihood = 80.016528
Iteration 8: log-likelihood = 98.680215
Iteration 9: log-likelihood = 102.04372
Iteration 10: log-likelihood = 102.05915
Iteration 11: log-likelihood = 102.05915
: moptimize_result_display(M)
Number of obs = 90
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb |
lnQ | .8649719 .0264724 32.67 0.000 .8130869 .9168569
lnPF | .3993113 .0146384 27.28 0.000 .3706205 .4280021
_cons | 9.282005 .2179675 42.58 0.000 8.854796 9.709213
-------------+----------------------------------------------------------------
lns_u |
_cons | -2.131847 .295808 -7.21 0.000 -2.71162 -1.552074
-------------+----------------------------------------------------------------
lns_e |
_cons | -2.680509 .0771648 -34.74 0.000 -2.831749 -2.529268
------------------------------------------------------------------------------
:
: // store the fitted coefficients in a Stata matrix so we can compare with the
: // results from -xtreg, mle-
: st_matrix("myb", moptimize_result_coefs(M))
:
: end
-------------------------------------------------------------------------------
.
. xtset i
panel variable: i (balanced)
. xtreg lnC lnQ lnPF , mle
Fitting constant-only model:
Iteration 0: log likelihood = -3679.9546
Iteration 1: log likelihood = -2042.3896
Iteration 2: log likelihood = -1132.4148
Iteration 3: log likelihood = -631.57621
Iteration 4: log likelihood = -360.64085
Iteration 5: log likelihood = -218.61016
Iteration 6: log likelihood = -148.36245
Iteration 7: log likelihood = -117.25507
Iteration 8: log likelihood = -106.17927
Iteration 9: log likelihood = -103.65887
Iteration 10: log likelihood = -103.43395
Iteration 11: log likelihood = -103.4311
Fitting full model:
Iteration 0: log likelihood = 101.61076
Iteration 1: log likelihood = 102.03853
Iteration 2: log likelihood = 102.05913
Iteration 3: log likelihood = 102.05915
Random-effects ML regression Number of obs = 90
Group variable: i Number of groups = 6
Random effects u_i ~ Gaussian Obs per group: min = 15
avg = 15.0
max = 15
LR chi2(2) = 410.98
Log likelihood = 102.05915 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lnC | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lnQ | .8649719 .0264724 32.67 0.000 .813087 .9168569
lnPF | .3993113 .0146384 27.28 0.000 .3706205 .4280021
_cons | 9.282005 .2179674 42.58 0.000 8.854796 9.709213
-------------+----------------------------------------------------------------
/sigma_u | .1186181 .0350882 .0664291 .2118083
/sigma_e | .0685283 .005288 .0589098 .0797173
rho | .7497583 .114912 .4861774 .9165317
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 101.26 Prob>=chibar2 = 0.000
. matrix b = e(b)
. local dim = colsof(b)
. // transform the scale parameters to compare with -rereg_gf2()-'s results
. matrix b[1,`dim'-1] = ln(b[1,`dim'-1])
. matrix b[1,`dim'] = ln(b[1,`dim'])
.
. // the maximum relative difference between the two model fits should be small
. di mreldif(b, myb)
5.354e-08
***** END: example.log
Reference:
Gould, W., J. Pitblado, and B. Poi. (2010) Maximum likelihood estimation with
Stata. StataPress: CS, TX.
--Jeff
[email protected]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/