Re: implementation of boschloo's test: very slow execution
...
I have a few ideas for possible speed improvements (besides moving it to
Mata):
1) Do you really need to get the p-value to 4 decimals? If you changed
-qui gen double p = (_n-1)/10001- to -qui gen double p = (_n-1)/1001-
(presumably cutting the number of observations to match), you would have a
90% reduction in some of the calculations.
2) I don't know much about this test, but wouldn't the quantity you are
maximizing be a smooth function of p? If so, you may want to use an iterative
approach to narrowing the range of p. Start with p in steps of .01. Then keep
just the interval on either side of the optimum and reduce the step to .001.
That may reduce the calculations substantially.
3) Don't use tabi. -tabi- requires a preserve and has a lot of ado machinery to
set up the desired table. It ends up creating a dataset with RxC observations
and a frequency weight variable that contains the counts. You could instead
hardwire a 2x2 table with values for row, col, and fw. A loop would then just
change the fw values to cycle through all of the n1 and n2 values. You could
then use -post- to post the values of p_exact, xx1 and xx2. You may then be
able to do the binomial calculation and summation just once on the resulting
dataset.
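A rough sketch of idea 3, written in Python only to keep it short and checkable. It is not the original poster's Stata code, and the names fisher_two_sided and boschloo_on_grid are made up for illustration. It assumes that the statistic used to order the tables is the two-sided Fisher exact p-value and that both groups share one success probability p under the null. The point is that the exact p-value of every possible table is computed once, up front; only the binomial summation depends on p.

```python
from math import comb

def fisher_two_sided(a, b, n1, n2):
    """Two-sided Fisher exact p-value: a successes of n1 vs b successes of n2."""
    m = a + b                              # total successes (conditioning margin)
    denom = comb(n1 + n2, m)
    lo, hi = max(0, m - n2), min(n1, m)    # feasible values of a given m
    probs = [comb(n1, k) * comb(n2, m - k) / denom for k in range(lo, hi + 1)]
    p_obs = probs[a - lo]
    # Add up all tables that are no more likely than the observed one.
    return min(1.0, sum(q for q in probs if q <= p_obs * (1 + 1e-7)))

def boschloo_on_grid(x1, x2, n1, n2, grid):
    """Maximize the rejection-region probability over a grid of p values."""
    # Step 1 (done once): exact p-value for every possible (i, j) table.
    pvals = {(i, j): fisher_two_sided(i, j, n1, n2)
             for i in range(n1 + 1) for j in range(n2 + 1)}
    # Step 2 (done once): tables at least as extreme as the observed one,
    # with their binomial coefficients, which do not depend on p either.
    cutoff = pvals[(x1, x2)] * (1 + 1e-7)
    coef = {(i, j): comb(n1, i) * comb(n2, j)
            for (i, j), pv in pvals.items() if pv <= cutoff}

    # Step 3: only this summation is repeated for each value of p.
    def size(p):
        return sum(c * p ** (i + j) * (1 - p) ** (n1 + n2 - i - j)
                   for (i, j), c in coef.items())

    best_p = max(grid, key=size)
    return size(best_p), best_p
```

As a small check that can be done by hand: with 2 of 2 successes versus 0 of 2 and a grid in steps of .01, boschloo_on_grid(2, 0, 2, 2, [k / 100 for k in range(101)]) gives 0.125 at p = 0.5.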
I'm not sure I understand the algorithm fully enough to be sure how these
changes would work or could be optimized and combined, but I wouldn't be
surprised by a large speed improvement.
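Idea 2 above could be sketched roughly as follows, again in Python just to keep it self-contained. The function name refine_max and the toy function in the usage note are hypothetical, not part of the original program. It starts with steps of .01, keeps one step on either side of the best point, and divides the step by 10 each round. One caveat: the function being maximized here is a polynomial in p and may have more than one local maximum, so a coarse first pass could in principle lock onto the wrong peak.

```python
def refine_max(f, step=0.01, rounds=3):
    """Coarse-to-fine grid search for the maximum of f(p) on [0, 1].

    Returns (best p, f at best p, number of evaluation points used).
    """
    lo, hi = 0.0, 1.0
    best = lo
    evaluations = 0
    for _ in range(rounds):
        n = round((hi - lo) / step)               # grid intervals this round
        grid = [lo + k * step for k in range(n + 1)]
        best = max(grid, key=f)
        evaluations += len(grid)
        # Keep one interval on either side of the current optimum ...
        lo, hi = max(0.0, best - step), min(1.0, best + step)
        # ... and search it with a step ten times finer next round.
        step /= 10
    return best, f(best), evaluations
```

For example, refine_max(lambda p: p ** 2 * (1 - p) ** 5) locates the peak near p = 2/7 to about 4 decimals using well under 200 grid points, compared with roughly 10,000 for a single pass at the finest step.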
Michael Blasnik