I'm not sure how fast -sign()- is, but it's one
step away from C code and it's not clear to me that
replacing the function call by a bunch of Stata statements that
all have to be interpreted many times over will be much of a gain.
Also, and more crucially, the result of -sign()- _could_ be
different in different observations each time round the loop.
The result of your scalar can _only_ be constant each time
round.
In fact, I can't see why
scalar pairprod = (`x' - `x'[`k'])*(`y' - `y'[`k']) in 1/`kk'
would work at all.
At a quick glance, which may do your coding an injustice,
that looks like a bug.
Nick
[email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Mike Lacy
> Sent: 04 February 2004 00:44
> To: [email protected]
> Subject: st: Problem in modifying built-in ktau.ado code (for
> Kendall's
> tau)
>
>
> Greetings,
>
> I need to calculate Kendall's tau as efficiently as possible on two
> continuous variables for each repetition in a large set of bootstrap
> experiments. Calculating Kendall's tau is slow, being of
> order (N^2)/2. My
> estimate is that using the built-in ktau.ado command, which
> is the fastest
> command to obtain Kendall's tau in Stata, the bootstrap
> experiments will
> take at least 10-15 days of dedicated time on a fast Wintel machine.
>
> Consequently, I am trying to modify the code of the ktau.ado
> into my own
> version. In the middle of the loop in ktau.ado, I discovered
> there is a
> call to the sign() function, which accounts for about 70% of
> the execution
> time of ktau.ado. My idea was to avoid the overhead of a
> function call by
> replacing the sign() function with an equivalent set of If
> statements. So,
> I calculate my own "mysign" function value using Ifs and use
> it in place of
> sign() in the original code. However, I am not getting the
> right results,
> as I explain below after presenting the two relevant code fragments:
>
> * Built-in ktau.ado code goes like this:
> * View with a about a 10 pt nonproportional font
> * `x' and `y' reference the two variables
> * passed to the ktau command
> gen double `work' = 0
> scalar `k' = 2
> while (`k' <= `N') {
> local kk = `k' - 1
> #delimit ;
> replace `work' = `work'
> + sign((`x' - `x'[`k'])*(`y' - `y'[`k'])) /* slow */
> in 1/`kk' ;
> #delimit cr
> scalar `k' = `k' + 1
> }
>
> Below is what I tried to do. And yes, I know that
> if/else would be faster but I was trying to simplify
> my code as much as possible while trying to track down
> my error.
>
> * My code, with differences indicated
> gen double `work' = 0
> scalar `k' = 2
> while (`k' <= `N') {
> local kk = `k' - 1
> * Calc "mysign" as a substitute for the sign() function
> * The rest of the code is as untouched as possible
> scalar pairprod = (`x' - `x'[`k'])*(`y' - `y'[`k']) in 1/`kk'
> if (pairprod < 0) {scalar mysign = -1}
> if (pairprod > 0) {scalar mysign = 1}
> if (pairprod == 0) {scalar mysign = 0}
> if (pairprod == .) {scalar mysign = .}
> *
> #delimit ;
> replace `work' = `work'
> + mysign
> /* + sign((`x' - `x'[`k'])*(`y' - `y'[`k'])) Removed */
> in 1/`kk' ;
> #delimit cr
> scalar `k' = `k' + 1
> }
>
> Problem: I have checked the values of "mysign" vs. sign{) and
> they are the
> same for each iteration of the loop. However, at the end of
> the while loop,
> the values in the variable `work' are not what they should
> be, so I presume
> the problem is with what I have done with the "replace"
> command. Could
> someone offer some suggestion?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/