I know that the following was a breakthrough paper speeding up
hypergeometric computations quite a bit:
http://math.mit.edu/~plamen/files/hyper.pdf. Marcello, the author is in
your geographic area, so you could take him out for a coffee or something
and ask whether there are any fast algorithms worth implementing in [St|M]ata.

On 8/23/07, Mike Lacy <[email protected]> wrote:
> >Date: Wed, 22 Aug 2007 15:32:38 +0100
> >From: "Newson, Roger B" <[email protected]>
> >Subject: st: RE: Hypergeometric Distribution
> >
> >Thanks to Marcello for telling us all about this recently-published
> >algorithm, which looks very useful. A search on
> >
> >findit hypergeometric
> >
> >in Stata finds a single reference (to an SSC package), which was
> >distributed as long ago as 1999. This suggests that the new algorithm
> >might be a good candidate for implementation in Mata by Marcello, or by
> >anybody else with the time and inclination to do so.
>
> I have something similar and could use a collaborator:
>
> I have a Stata program to calculate hypergeometric probabilities
> using the algorithm of:
>
> Berry, K. J., & Mielke, P. W. (1983). A rapid FORTRAN
> subroutine for the Fisher exact probability test. Educational and
> Psychological Measurement, 43, 167-171.
>
> Their algorithm exploits a recursion, and so avoids calculating
> any factorials or log factorials when computing the
> hypergeometric. I suspect it is faster than even the newly published
> algorithm, although I don't know for certain. It is particularly suited
> to applications in which the entire vector of probabilities across the
> range of the variable is needed (e.g., Fisher's Exact), since it has
> to calculate all the probabilities to get just one of them. However,
> it can compute all the probabilities for a variable with what I
> believe is lower O() complexity than a conventional algorithm needs
> to calculate any single one.
>
> I am not a "production quality" Stata programmer, and don't want to
> take the time to be one, so if anyone else is interested, I'd be
> happy to send them my code to be dressed up for public use. I
> considered posting the program to the list (only about 40 lines), but
> didn't know if that was quite appropriate.
>
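
For anyone who wants to see the recursion idea from Mike's message in
concrete form, here is a minimal sketch. It is written in Python purely for
illustration; it is not Mike's Stata program, and it may differ in detail
from the Berry and Mielke routine. It assumes only the standard ratio of
consecutive hypergeometric probabilities, builds unnormalized weights from
the smallest feasible value upward, and divides by their sum, so no
factorials or log factorials are needed. The function name
hypergeom_pmf_vector and the argument names N, K, n are my own.

```python
def hypergeom_pmf_vector(N, K, n):
    """Whole hypergeometric pmf by recursion (illustrative sketch).

    N: population size, K: number of "successes" in the population,
    n: number of draws. These names are assumptions for this example.
    Returns (x_min, probs) with probs[i] = P(X = x_min + i).
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")

    x_min = max(0, n - (N - K))  # smallest feasible count
    x_max = min(n, K)            # largest feasible count

    # Unnormalized weights: start at 1 for x_min and apply
    # P(x+1)/P(x) = (K - x)(n - x) / ((x + 1)(N - K - n + x + 1)).
    weights = [1.0]
    for x in range(x_min, x_max):
        ratio = (K - x) * (n - x) / ((x + 1) * (N - K - n + x + 1))
        weights.append(weights[-1] * ratio)

    # Normalizing by the sum turns the weights into probabilities.
    total = sum(weights)
    return x_min, [w / total for w in weights]


x_min, probs = hypergeom_pmf_vector(10, 4, 3)
print(x_min, probs)
```

With N = 10, K = 4, n = 3 the feasible values are 0 to 3 and the
probabilities work out to 1/6, 1/2, 3/10 and 1/30. The cost is one multiply
and one divide per feasible value, which matches the point that the whole
vector comes out in a single pass. For very large tables the unnormalized
weights can overflow, so a production version would probably need to
rescale or start the recursion near the mode.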
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check
it regularly.