Uli and Magdalena Luniak <[email protected]> asked about structures
in Mata.
They ran into a bug. Had they coded a little more efficiently, they never
would have run into it, but that does not excuse the bug.
Here's what they did: They had a vector of structures: v[1] was the
first struct, v[2] was the second. They filled in a third structure, mypoint,
and then stored mypoint in v[1]:
        v[1] = mypoint
All went well. They then filled in mypoint with a different set of values
and coded
        v[2] = mypoint
That worked well, too, except that v[1] also changed, and it changed to be
the same as v[2], namely, mypoint!
Uli and Magdalena made no errors; Mata did. Rather than storing a copy of
mypoint in v[1], and then later, a copy of mypoint in v[2], Mata mistakenly
stored mypoint itself in v[1] and v[2]. v[1], v[2], and mypoint all became
the same object.
I have just examined this bug in detail. It occurs when the RHS is a
structure and the LHS is an element of a structure vector or matrix, i.e.,
statements of the form,
        v[i] = mypoint
        v[i,j] = mypoint
It does *NOT* occur when the LHS is a scalar,
        v = mypoint
Until the bug is fixed, the workaround is to make the copy that Mata forgot
to make:
Rather than code
        v[i] = mypoint
code
        v[i] = copyof(mypoint)
and rather than code
        v[i,j] = mypoint
code
        v[i,j] = copyof(mypoint)
where function copyof() is coded
        transmorphic copyof(transmorphic original)
        {
                transmorphic copy
                copy = original
                return(copy)
        }
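
To make the workaround concrete, here is a minimal sketch of it applied to the two-element case described above. It assumes a struct point with two real scalar members, a and b; the definition was not shown, so the member types are an assumption. The function names two_points() and a_of() are made up for this illustration. Enter it within Mata.

```mata
// Assumed definition: members a and b are taken to be real scalars.
struct point {
        real scalar a
        real scalar b
}

// copyof() exactly as given above: forces a copy to be made.
transmorphic copyof(transmorphic original)
{
        transmorphic copy
        copy = original
        return(copy)
}

// Hypothetical helper: fills v[1] and v[2] from one reused scalar struct.
struct point vector function two_points()
{
        struct point vector v
        struct point scalar mypoint

        v = point(2)
        mypoint.a = 1
        mypoint.b = 1
        v[1] = copyof(mypoint)     // store a copy, not mypoint itself
        mypoint.a = 2
        mypoint.b = 2
        v[2] = copyof(mypoint)     // v[1] should be unaffected by this
        return(v)
}

// Hypothetical helper: returns member a of the i-th element.
real scalar function a_of(real scalar i)
{
        struct point vector v

        v = two_points()
        return(v[i].a)
}
```

With the copies in place, a_of(1) and a_of(2) should differ, which is the behavior Uli and Magdalena expected in the first place.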
In Uli's and Magdalena's case, they have a second alternative. They can make
their code more efficient and not provoke the bug. Their original code reads,
        struct point vector function help(real vector seq)
        {
                real scalar length
                length = length(seq)
                struct point vector v
                v = point(length)
                real scalar i
                struct point scalar mypoint
                for (i=1; i<=length; i++) {
                        mypoint.a=seq[i]
                        mypoint.b=seq[i]
                        v[i] = mypoint
                }
                return(v)
        }
I prefer all the declarations up top. It is just a matter of style, and not
even good style vs. bad style, but indulge me, and let me change their code to
my preferred style before getting to my point:
        struct point vector function help(real vector seq)
        {
                real scalar i
                real scalar length
                struct point vector v
                struct point scalar mypoint
                length = length(seq)
                v = point(length)
                for (i=1; i<=length; i++) {
                        mypoint.a=seq[i]
                        mypoint.b=seq[i]
                        v[i] = mypoint
                }
                return(v)
        }
Style aside, a more efficient version of their code reads,
        struct point vector function help(real vector seq)
        {
                real scalar i
                real scalar length
                struct point vector v
                length = length(seq)
                v = point(length)
                for (i=1; i<=length; i++) {
                        v[i].a = seq[i]
                        v[i].b = seq[i]
                }
                return(v)
        }
Did you know you could do that? Refer to v[i].a and v[i].b? On the
left or on the right?
Pretend v[i] had a third element, a vector c. Then you could refer to
v[i].c[j] and v[j].c[i] (which would be different things).
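
As a small illustration of that last point, here is a sketch with i=1 and j=2. The struct name point3 and the function name c_demo() are invented for this example, and declaring c as a real vector is an assumption about how such a member would be defined.

```mata
// Hypothetical structure: like point, plus a vector member c.
struct point3 {
        real scalar a
        real scalar b
        real vector c
}

// Returns (v[1].c[2], v[2].c[1]) to show they are different elements.
real rowvector function c_demo()
{
        struct point3 vector v

        v = point3(2)
        v[1].c = (10, 20)          // member reference on the left
        v[2].c = (30, 40)
        return((v[1].c[2], v[2].c[1]))   // member references on the right
}
```

Here v[1].c[2] is the second element of the first structure's c, and v[2].c[1] is the first element of the second structure's c.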
I know, I'm changing the subject. We will fix the bug, but it will not be in
the next executable update. It will be in the one after that.
-- Bill
[email protected]