Dear list,
I am trying to code a maximum likelihood estimator in Mata using -optimize()-.
For most of the model parameters (call them betas) I have analytical
expressions for the gradients/scores. However, for a small subset of the
parameters (call them lambdas), I do not.
My hope was that instead of coding a v0 evaluator, which would use numerical
derivatives for both the betas and the lambdas, I could somehow take
advantage of the fact that I have analytical expressions for the beta scores
(and Hessian components), and only use numerical derivatives for the small
number of lambdas - potentially saving substantially on computation time.
Unfortunately, while it seems like this should be possible, I can't quite
get it to work. The approach I've tried to take so far is set out below. I'd
be very grateful for:
1. any advice on how to salvage my approach (or on whether it is salvageable
at all); or
2. any suggestions as to whether there might be another way to achieve what
I am trying to do here.
The thought was to construct a v1 evaluator (and ultimately a v2 evaluator,
but the problems arise at the v1 stage already, so I'll restrict discussion
to that) that uses an internal call to -optimize_evaluate()- to calculate
numerical scores with respect to lambda, but calculates everything else
analytically.
I've used v1debug to check the analytical scores, and they're correct.
However, the numerical scores generated by -optimize_evaluate()- only agree
with the numerical scores that -optimize()- calculates at the 0th iteration
(when they're identical). After that, they diverge. (Sometimes they're still
quite close, sometimes they're not.)
Is this to be expected, or does it mean I must have made a mistake
somewhere? Is the problem that -optimize()- somehow modifies its gradient
calculating routine at each iteration? And if that is the case, is there any
way to take advantage of this in my code?
A rough outline of the structure of my v1 evaluator is below (this is
drastically simpler than what I am actually dealing with, but should
hopefully convey the essence of what I'm trying to do).
/* Begin Code */
void v1evaluator(todo, param, lli, si, H)
{
    // partition the param vector into beta and lambda (where q = p+1)
    beta   = param[1..p]
    lambda = param[q..r]

    // use a v0 evaluator to obtain lli (passing beta as a separate argument)
    v0evaluator(0, lambda, beta, lli, si, H)

    if (todo > 0) {
        si = J(N, r, .)

        // obtain the scores for beta analytically
        si[., 1..p] = [analytical expression for beta scores]

        // obtain the scores for lambda numerically, using -optimize_evaluate()-
        S = optimize_init()
        optimize_init_evaluator(S, &v0evaluator())
        optimize_init_evaluatortype(S, "v0")
        optimize_init_params(S, lambda)
        optimize_init_argument(S, 1, beta)
        (void) optimize_evaluate(S)
        si[., q..r] = optimize_result_scores(S)
    }
}
/* End Code */
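For what it's worth, one way I could sidestep -optimize_evaluate()- entirely would be to hand-roll a central-difference approximation for the lambda scores, reusing the same v0 evaluator. This is only an untested sketch following the call signature in the outline above; the step-size rule is a crude placeholder, not a recommendation:

/* Begin Code */
// Sketch: central-difference scores for lambda only, reusing v0evaluator().
// N is the number of observations; dummy_s and dummy_H just absorb the
// unused score and Hessian arguments.
real matrix lambda_scores_fd(real rowvector beta, real rowvector lambda,
                             real scalar N)
{
    real matrix    si_lambda, dummy_s, dummy_H
    real colvector lli_plus, lli_minus
    real rowvector lp, lm
    real scalar    j, k, h

    k = cols(lambda)
    si_lambda = J(N, k, .)
    for (j=1; j<=k; j++) {
        h  = 1e-6*(1 + abs(lambda[j]))      // crude relative step size
        lp = lm = lambda
        lp[j] = lambda[j] + h
        lm[j] = lambda[j] - h
        v0evaluator(0, lp, beta, lli_plus,  dummy_s, dummy_H)
        v0evaluator(0, lm, beta, lli_minus, dummy_s, dummy_H)
        si_lambda[., j] = (lli_plus :- lli_minus)/(2*h)
    }
    return(si_lambda)
}
/* End Code */

Since only the lambda columns are filled this way, it costs two evaluations of the v0 evaluator per lambda, which is cheap when the lambdas are few.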
Independently of the problems noted above, this is at least slightly
inefficient, in that -optimize_evaluate()- automatically calculates the
Hessian, which is unnecessary here. Specifying
-optimize_init_technique(S,"bhhh")- prevents that. However, it may still be
that the -optimize_evaluate()- call is an inefficient way of doing what I
want, even if I could get it to work.
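Another possibility I have wondered about (again, only an untested sketch) is Mata's -deriv()-, which computes numerical derivatives without any of the optimizer machinery. With evaluator type "t" (a function returning the column vector of lli), -deriv(D,1)- should return the N x (r-q+1) Jacobian, which is exactly the matrix of lambda scores, and no Hessian is computed unless requested:

/* Begin Code */
// Sketch: wrap the v0 evaluator as a "t"-type evaluator for -deriv()-.
// Given lambda (with beta passed as an extra argument), it returns the
// column vector of observation-level log likelihoods.
void lli_of_lambda(real rowvector lambda, real rowvector beta,
                   real colvector lli)
{
    real matrix si, H
    v0evaluator(0, lambda, beta, lli, si, H)
}

// inside the v1 evaluator, in place of the -optimize_evaluate()- call:
D = deriv_init()
deriv_init_evaluator(D, &lli_of_lambda())
deriv_init_evaluatortype(D, "t")
deriv_init_params(D, lambda)
deriv_init_argument(D, 1, beta)
si[., q..r] = deriv(D, 1)    // Jacobian of lli w.r.t. lambda = lambda scores
/* End Code */

Whether -deriv()-'s internal step-size choices would agree any better with what -optimize()- does at each iteration is, of course, part of my question.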
Best wishes,
Glenn
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/