Jesper Kjær Hansen <[email protected]> is using Mata and wants to
efficiently select rows or columns of a matrix according to a vector
containing 1s and 0s.
In the particular example he cites, he is dropping missing values,
but he is interested in the general solution as well. The example he offers
is
> : x = (1, ., 0, ., 1, 0)'
>
> : (x , x :!= .)
>        1   2
>     +---------+
>   1 |  1   1  |
>   2 |  .   0  |
>   3 |  0   1  |
>   4 |  .   0  |
>   5 |  1   1  |
>   6 |  0   1  |
>     +---------+
and he asks,
> what is the most efficient way of getting:
>
>        1
>     +-----+
>   1 |  1  |
>   2 |  0  |
>   3 |  1  |
>   4 |  0  |
>     +-----+
There is no good answer. Mata very much needs a select() function, and we will
commit right now to adding that to Mata. The select() function will work as
follows:
        result = select(A, v)
where
             A:  r1 x c1
             v:  r1 x 1 or 1 x c1 containing zero and nonzero.
        result:  r2 x c1 or r1 x c2, r2<=r1, c2<=c1
Thus, with select(), Jesper will be able to obtain his desired result
by coding select(x, x :!=.).
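As a rough illustration of that specification, the sketch below shows both the row-selecting and the column-selecting forms. It assumes a select() matching the specification is already available, whether the interim version written to these specifications or the built-in one once it is released. The matrix A and the selection vectors are made up for the example.

```mata
mata:
// Assumes a select() that follows the specification above is defined.
A = (1, 2, 3 \ 4, 5, 6 \ 7, 8, 9)

// v is a column vector (r1 x 1): rows 1 and 3 are kept
select(A, (1 \ 0 \ 1))

// v is a row vector (1 x c1): columns 2 and 3 are kept
select(A, (0, 1, 1))

// Jesper's case: keep only the nonmissing elements of x
x = (1, ., 0, ., 1, 0)'
select(x, x :!= .)
end
```

The first call should return the 2 x 3 matrix (1, 2, 3 \ 7, 8, 9), the second the 3 x 2 matrix (2, 3 \ 5, 6 \ 8, 9), and the third the column vector (1 \ 0 \ 1 \ 0) that Jesper asked for.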
Actually, the problem of excluding rows with missing values occurs so often
that we will also include the function
        X = excludemissing(A)
where
        A:  r1 x c1
        X:  r2 x c1, r2<=r1
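To make the intended behavior concrete, here is a small sketch of the row filtering that excludemissing() is specified to perform, written directly in terms of select() and rowmissing(). It again assumes a select() matching the specification above is available. The matrix A is invented for the example.

```mata
mata:
// Assumes a select() that follows the specification above is defined.
A = (1, 2 \ ., 4 \ 5, . \ 7, 8)

// rowmissing(A) counts the missing values in each row of A
rowmissing(A)

// Keep only the rows with no missing values; per the specification,
// excludemissing(A) is meant to return this same result
X = select(A, rowmissing(A) :== 0)
X
end
```

Here rowmissing(A) is (0 \ 1 \ 1 \ 0), so only rows 1 and 4 survive and X is the 2 x 2 matrix (1, 2 \ 7, 8).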
So what should Jesper do in the meantime?
Jesper should write his own -select()- and -excludemissing()- to these
specifications. Actually, we have included them below. Jesper's code will
then work, but it will not be especially fast.
Later (soon), when we release the built-in select() and excludemissing(),
Jesper can remove his versions and recompile.
-- Bill David
[email protected] [email protected]
The following -select()- and -excludemissing()- functions are *NOT* fast,
but they match the definition of the official -select()- and -excludemissing()-
functions which will be added to Mata and will be fast.
transmorphic matrix select(transmorphic matrix A, real vector v)
{
        real scalar             r1, c1, i1, i2
        transmorphic matrix     B

        r1 = rows(A)
        c1 = cols(A)
        if (cols(v)==1) {
                // v is a column vector: select rows of A
                if (r1 != rows(v)) _error(3200)         // conformability error
                B = J(colsum(v:!=0), cols(A), missingof(A))
                for (i1=i2=1; i1<=r1; i1++) {
                        if (v[i1]) B[i2++,.] = A[i1,.]
                }
        }
        else {
                // v is a row vector: select columns of A
                if (c1 != cols(v)) _error(3200)         // conformability error
                B = J(rows(A), rowsum(v:!=0), missingof(A))
                for (i1=i2=1; i1<=c1; i1++) {
                        if (v[i1]) B[.,i2++] = A[.,i1]
                }
        }
        return(B)
}

numeric matrix excludemissing(numeric matrix A)
{
        // keep the rows of A that contain no missing values
        return(select(A, rowmissing(A):==0))
}
<end>