Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: mata function for "lookup" or find rank if observation not in the ranked sample
From
László Sándor <[email protected]>
To
[email protected]
Subject
Re: st: mata function for "lookup" or find rank if observation not in the ranked sample
Date
Mon, 14 May 2012 22:31:02 -0400
To follow-up on this:
I figured out a way with permutation functions, though it is messy
(FWIW, I paste it below).
More importantly, I was surprised to see that even in samples of 50K,
I suffered significant performance losses compared to minindexing over
entire N-vectors. I know minindex is built-in, pure C etc., but is it
reasonable that it can beat simple (well, simplish) lookup and some
logical checks, and a minindex over a much shorter vector?
I would very much appreciate some StataCorp comments on this. I
invested a bit into this new approach (more complicated from
cross-treatment matching) for coding (one-dimensional) matching, and
I'm surprised by the result.
Thanks,
Laszlo
***** Code with some explanation:
instead of simply using this to match treated observations close to
the treated in question:
minindex(abs(psc1:-psc1[i]),L,yki,ykw)
I do some work first out of the loop:
psc1order1 = order(psc1,1)
psc1invorder1 = invorder(psc1order1)
and then guess that the closest L (plus ties) might be in the 2L
closest observation below or above the rank of the current observation
in the ranking:
for ( lcl = 2*L; lcl<= 2*n1;lcl=2* lcl) {
if (lcl > n1) {
minindex(abs(psc1:-psc1[i]),L,yki,ykw)
break
}
else {
matchcandidateindices = psc1order1[|
max((psc1invorder1[i]-lcl,1)),1 \ min((psc1invorder1[i]+lcl,n1)),1|]
minindex(abs(psc1[matchcandidateindices]:-psc1[i]),L,yki,ykw)
if ( anyof(yki,1) | anyof(yki,rows(matchcandidateindices))) {
continue
}
else {
yki = matchcandidateindices[yki]
break
}
}
}
On Mon, May 14, 2012 at 11:04 AM, László Sándor <[email protected]> wrote:
> Hi,
> I have another puzzle in Mata for Stata 10 and above. My previous
> thread on this list help me match, say, treated observations to
> treated observations, as I look up observations with propensity score
> close to the observation's in a (selected) vector that the observation
> itself is part of. The use of permutation vectors can be a dramatic
> improvement compared to many-many runs of -mindex- on the full
> propensity score vector.
>
> However, just as important would be to find observations with similar
> propensity scores in the subsample with opposite treatment. The
> observation itself is not part of that subsample, so I cannot simply
> look up its own rank there in a permutation vector I generate only
> once before I loop through all observations. Somehow I would need to
> know the rank of, say, a treated observation in the subsample of the
> control group. Actually, to really spare me costly runs on the entire
> control-prop.score-vector for each (treated) iteration, this lookup
> would better use some information I can generate for the whole
> population and the ranks there and/or in the two subsamples.
>
> Please let me know if you have thoughts about this.
>
> Thanks!
>
> Laszlo
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/