Hello again list,
On the off chance that anybody else is at all interested, I've managed to
solve the problem I outlined below, and now have something that allows
-optimize()- to combine analytical and numerical derivatives.
I'm not convinced that the solution I have is the most elegant way of
proceeding, but it appears to work. Outline below:
/* Begin Code */
function my_optimizer()
{
transmorphic scalar R, S
...
// define handle for v0 evaluator, and validate as if called by optimize()
S = optimize_init()
optimize_init_evaluator(S,&v0evaluator())
optimize_init_evaluatortype(S,"v0")
optimize_init_params(S,lambda)
optimize_init_argument(S,1,beta)
(void) opt__validate(S)
// define handle for v1 evaluator, passing R and S as arguments
R = optimize_init()
optimize_init_params(R,theta)
optimize_init_evaluator(R,&v1evaluator())
optimize_init_evaluatortype(R,"v1")
optimize_init_argument(R,1,R)
optimize_init_argument(R,2,S)
(void) optimize(R)
...
}
void v1evaluator(todo, param, transmorphic scalar R, struct __optstruct scalar S, lli, si, H)
{
real scalar i, hi
real rowvector beta, lambda, g0, h
real colvector fp, fm
real matrix H0
// (N, p, q and r are assumed to be visible here, e.g. as externals)
// partition the param vector into beta and lambda
beta = param[1..p]
lambda = param[q..r]
// use a v0 evaluator to obtain lli (passing beta as a separate argument)
v0evaluator(0,lambda,beta,lli,si,H)
if (todo>0) {
si = J(N,r,.)
h = J(1,cols(lambda),.)
// obtain the scores for beta analytically
si[.,1..p] = [analytical expression for beta scores]
// obtain the scores for lambda numerically
// update S elements manually
*S.arglist[1] = beta
S.params = lambda
S.iter = optimize_result_iterations(R)
S.value = quadcolsum(lli)
// calculate scores (copied from internal -optimize()- code)
for (i=q; i<=r; i++) {
(void) opt__v0_h(S,i,hi,fp,fm,g0,H0)
h[i-q+1] = hi                          // h has one column per lambda, so offset the index
si[,i] = (fp - fm)/(2*h[i-q+1])
}
}
}
/* End Code */
Interestingly enough, my first attempt at this managed to fool v1debug into
thinking that the v1 evaluator calculated the same numerical scores as the
v0 evaluator, when actually it didn't. I guess that's what happens when you
start messing around with internal structures you're not supposed to touch.
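For anyone who wants to repeat that check: switching the evaluator type on the
main handle to "v1debug" before the call to -optimize()- makes it compute both
the programmed and the numerical derivatives and report a comparison at each
iteration. A minimal sketch, assuming the same R handle as in my_optimizer()
above:
/* Begin Code */
// sketch only: R is the main handle set up in my_optimizer() above
optimize_init_evaluatortype(R,"v1debug")  // compare programmed and numerical scores
(void) optimize(R)
optimize_init_evaluatortype(R,"v1")       // switch back once the two agree
/* End Code */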
Glenn.
-----Original Message-----
From: Glenn Goldsmith [mailto:[email protected]]
Sent: 21 February 2009 15:10
To: '[email protected]'
Subject: Combining analytical and numerical derivatives in mata -optimize()-
Dear list,
I am trying to code a maximum likelihood estimator using mata -optimize()-.
For most of the model parameters (call them betas) I have analytical
expressions for the gradients/scores. However, for a small subset of the
parameters (call them lambdas), I do not.
My hope was that instead of coding a v0 evaluator, which would use numerical
derivatives for both the betas and the lambdas, I could somehow take
advantage of the fact that I have analytical expressions for the beta scores
(and Hessian components), and only use numerical derivatives for the small
number of lambdas - potentially saving substantially on computation time.
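To make that concrete, the kind of calculation I have in mind for the lambda
scores is just a central difference of the observation-level log likelihoods,
holding beta fixed; a rough sketch is below (the function name, the fixed step
sizes h and the sample size N are placeholders only - -optimize()- chooses its
own step sizes):
/* Begin Code */
// sketch only: central-difference scores for lambda, holding beta fixed;
// h is a placeholder rowvector of step sizes, N the number of observations
real matrix lambda_scores(real rowvector beta, real rowvector lambda, real rowvector h, real scalar N)
{
    real scalar j
    real rowvector lp, lm
    real colvector fp, fm
    real matrix si, S0, H0

    si = J(N, cols(lambda), .)
    fp = .
    fm = .
    S0 = .
    H0 = .
    for (j=1; j<=cols(lambda); j++) {
        lp = lambda
        lm = lambda
        lp[j] = lambda[j] + h[j]
        lm[j] = lambda[j] - h[j]
        v0evaluator(0, lp, beta, fp, S0, H0)    // lli at lambda[j]+h[j]
        v0evaluator(0, lm, beta, fm, S0, H0)    // lli at lambda[j]-h[j]
        si[., j] = (fp - fm)/(2*h[j])
    }
    return(si)
}
/* End Code */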
Unfortunately, while it seems like this should be possible, I can't quite
get it to work. The approach I've tried to take so far is set out below. I'd
be very grateful for:
1. any advice on how to salvage my approach (or on whether it is salvageable
at all); or
2. any suggestions as to whether there might be another way to achieve what
I'm trying to do here.
The thought was to construct a v1 evaluator (and ultimately a v2 evaluator,
but the problems arise at the v1 stage already, so I'll restrict discussion
to that) that uses an internal call to -optimize_evaluate()- to calculate
numerical scores with respect to lambda, but calculates everything else
analytically.
I've used v1debug to check the analytical scores, and they're correct.
However, the numerical scores generated by -optimize_evaluate()- only agree
with the numerical scores that -optimize()- calculates at the 0th iteration
(when they're identical). After that, they diverge. (Sometimes they're still
quite close, sometimes they're not.)
Is this to be expected, or does it mean I must have made a mistake
somewhere? Is the problem that -optimize()- somehow modifies its gradient
calculating routine at each iteration? And if that is the case, is there any
way to take advantage of this in my code?
A rough outline of the structure of my v1 evaluator is below (this is
drastically simpler than what I am actually dealing with, but should
hopefully convey the essence of what I'm trying to do).
/* Begin Code */
void v1evaluator(todo, param, lli, si, H)
{
transmorphic scalar S
real rowvector beta, lambda
// partition the param vector into beta and lambda
beta = param[1..p]
lambda = param[q..r]
// use a v0 evaluator to obtain lli (passing beta as a separate argument)
v0evaluator(0,lambda,beta,lli,si,H)
if (todo>0) {
si = J(N,r,.)
// obtain the scores for beta analytically
si[.,1..p] = [analytical expression for beta scores]
// obtain the scores for lambda numerically, using -optimize_evaluate()-
S = optimize_init()
optimize_init_evaluator(S,&v0evaluator())
optimize_init_evaluatortype(S,"v0")
optimize_init_params(S,lambda)
optimize_init_argument(S,1,beta)
(void) optimize_evaluate(S)
si[.,q..r] = optimize_result_scores(S)
}
}
/* End Code */
Independently of the problems noted above, this is at least slightly
inefficient, in that -optimize_evaluate()- automatically calculates the
Hessian, which is unnecessary here. Specifying
-optimize_init_technique(S,"bhhh")- prevents that. However, it may still be
that the -optimize_evaluate()- call is an inefficient way of doing what I
want, even if I could get it to work.
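Concretely, the set-up of S would then just gain that one extra line; a
sketch, with the same names as in the outline above:
/* Begin Code */
// sketch only: the S set-up from the outline, with "bhhh" added so that no
// numerical Hessian is calculated
S = optimize_init()
optimize_init_evaluator(S,&v0evaluator())
optimize_init_evaluatortype(S,"v0")
optimize_init_technique(S,"bhhh")
optimize_init_params(S,lambda)
optimize_init_argument(S,1,beta)
(void) optimize_evaluate(S)
si[.,q..r] = optimize_result_scores(S)
/* End Code */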
Best wishes,
Glenn