Matt Weinber <[email protected]> has written Mata function -qnewsolve()- and
asked two questions. I have already replied to his first question, which I
mention becomes sometimes replies arrive out of the intended order.
The releavant parts of Matt's program reads,
numeric matrix qnewsolve(pointer matrix f, numeric matrix x, ...)
{
...
return(x\rc)
}
and Matt asks,
[...] is there an easy way to return x and rc separately instead of tacking
rc to the end of x?
Unlike my reply to Matt's other question, this reply is mostly about good
programming style, about which reasonable people can disagree. In addition,
however, I SPOTTED A BUG in Matt's code. Look at the little bit of code I
quoted. Do you spot it? Well, I'll mention it last.
First, Opinion
--------------
Ben Jann <[email protected]> replied, "Use structures." I disagree.
In my opinion, structures are an elegant way of tying things together that
belong together, but in this case, I argue, x and rc do not belong together.
If I had a function that returned a parameter vector b and a variance matrix
V, I would find Ben's structure suggestion appealing (but see "A note on
structures", below). I would find it appealing because b and V belong
together. In calls to other subroutines, I can easily imagine that I will
want to pass along both, and I will want to store both together. In summary,
structures used this way are elegant because the resulting code reflects the
reality of the situation.
In this case, however, rc is dross. It is something the caller of
-qnewsolve()- needs to look at, but if it is as expected, the caller continues
and x stands alone. For the same reason, I do not much like Matt's
-return(x\rc)- solution.
So let's consider the general problem of returning multiple results that
do not belong together. x and rc is one example. Let's use b and V as
another, and ignore structures.
One solution is to put the returned results on equal footing -- pass them
as arguments, not because they contain anything, but with the sole intention
of defining them to contain our results. Our function will return (in
the programming return() sense) nothing:
xmpl1a(b, V, ...) (1)
xmpl1b(rc, x, ...)
or
xmpl2a(..., b, V) (2)
xmpl2b(..., x, rc)
I like this solution when the returned results are of equal importance.
In the b,V case, I find (1) and (2) more appealing than
b = xmpl3a(..., V) (3)
or the other way around, which somehow suggests that b is more important than
V. Moreover, I have another argument to favor (1) and (2) over (3): As a
user, when a function returns something (like b), I assume that it does not
change any of the other arguments that I pass it. I know I should look at the
decumentation, but I don't. When a function returns nothing, I know to be on
the lookout for changed arguments.
Another solution is,
bV = xmpl4(b, V) (4)
b = bV[1, .]
V = bV[| 2,1 \ .,. |]
but I do not like that because of how difficult it is to take back apart
and because V might be a large matrix and so (4) is wasteful of memory.
So we are back to (1) and (2). Between them, I do not have an opinion.
Usually I prefer (1) because I am used to seeing assignment on the left:
b = ...
But then I run into "copy" routines. In C, routines that end in -*cpy()-
always copy from the first argument to the last. Much of my taste has
been formed using C, and so whenever I fall into "copy" mode, I expect
the outputs to be on the right.
Anyway, when I have to write a routine that returns more than one
result, of equal importance, I tend to gather them together and put them
either at the beginning or the end according to which seems most natural.
Now let's consider x and rc. My usual preference is for
rc = xmpl5a(x, ...) (5)
rc = xmpl5b(..., x)
and that is because of how I will use the functions. I need to look
at rc before I look at x, and having looked at rc and found what I
expect, all I care about is x:
if (rc = xmpl45(x, ...)) {
/* there were problems */
}
... x ... /* and I never use rc again */
I told you I did not like (3), but in fact I do like (3) when what is
returned is a return code, error flag, and the like.
So my suggestion to Matt is to adapt (5) in his case.
Next there is the issue of input, output, and input/output variables. Let's
pretend Matt adopts by suggestion,
rc = qnewsolve(x, ...)
Question: Is x an output variable, or an input/output variable? Let's
assume -qnewsolve()- reuires an vector initial values for x. We could
imagine the following designs,
rc = qnewsolve(x, ...) x initial values and x returned
rc = qnewsolve(x, x0, ...) x0 initial values, x0 unchanged,
x final result
If I am required to provide initial values to use -qnewsolve()-, I prefer the
first solution, but do not much object to the second. If -qnewsolve()- can
produce initial values on its own, then I strongly prefer the second. In that
case, argument x0 might be optional, or it might be required but allowed to be
0 x 0.
A note on structures
--------------------
I wrote, "If I had a function that returned a parameter vector b and a
variance matrix V, I would find Ben's structure suggestion appealing."
In fact, I often use Ben's structure suggestion, but only for internal
subroutines that are undocumented and not directly called by users.
Structures scare users.
We at StataCorp will be working on alleviating that fear, but I do not expect
a lot of success. The long and short of it is that understanding structures
takes more time that it is worth for many. That does not mean structures are
not useful. I use them. We at StataCorp use them. Ben uses them. Many
advanced programmers use them, and will be using them more. What I am
suggesting is that users should not be required to understand structures to
use Mata, or to use end-user subroutines.
-qnewsolve()-, I suspect, falls into this category.
In my response to the previous question on -qnewsolve()-, I used structures to
solve the problem of calling user functions and passing additional arguments.
Note, however, the use of structures was all internal. Matt needed to
understand structures, but users of -qnewsolve()- did not, and in fact, will
never even know that structures were used in obtaining the result.
-qnewsolve()-, however, is an end-user function. If -qnewsolve()- returned
results in a structure, then the caller would have to know how to use them.
A bug in -qnewsolve()-
----------------------
Matt's current version of -qnewsolve()- reads,
numeric matrix qnewsolve(pointer matrix f, numeric matrix x, ...)
{
...
return(x\rc)
}
Let's assume that in some other program we used -qnewsolve()-:
...
x0 = ... initial values ...
...
soln = qnewsolve(&f(), x0, ...)
if (soln[length(soln)]) {
... (there were problems) ...
}
x = soln[| 1 \ length(soln)-1 |]
...
Would you expect -x0- to change aftrer the call to -qnewsolve()-? I argue you
would not, else why would -qnewsolve()- be returning x to us?
But, as Matt as outlined -qnewsolve()-, x0 will change. Function -qnewsolve()-
receives arguement x and returns x\rc, so -qnewsolve()- changes x.
We called -qnewsolve()- by coding
soln = qnewsolve(&f(), x0, ...)
so our variable x0 is -qnewsolve()-'s x. -qnewsolve()- changes x, so it
changes x0.
I'm sure that was not Matt's intention. The solution is
numeric matrix qnewsolve(pointer matrix f, numeric matrix userx, ...)
{
x = userx
...
return(x\rc)
}
Now the program receives a variable it calls userx, and x is its own,
private variable. The x returned by -qnewsolve()- is unrelated to
x0.
-- Bill
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/