Statalist The Stata Listserver



Re: st: Stata MP


From   Hua Peng <[email protected]>
To   [email protected]
Subject   Re: st: Stata MP
Date   Tue, 20 Feb 2007 14:25:11 -0600

Fred Wolfe wrote:
Stata documents the increase in processing speed for estimation commands, but only for a few non-estimation commands (generate, replace). Does StataCorp have data on the performance of functions, foreach, forvalues, merge, egen, etc.? We reserve one computer for data management and for the interface between SQL and Stata, and we are wondering how much of a performance increase we would see for data management tasks. We are thinking of 64-bit Vista.

Thanks.


Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel +1 316 263 2125
[email protected]

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

We concentrated our efforts on estimation commands because they benefit the most from parallel processing. For non-estimation commands, we parallelized -generate-, -replace-, -by generate-, and -by replace-. We did not parallelize commands like -foreach- and -forvalues- because they can be inherently sequential, i.e., one iteration can depend on the results of previous iterations. Any attempt to parallelize -foreach- and -forvalues- would require a new directive in the do/ado code to mark whether a loop can be parallelized. (Our design explicitly ruled out such markers. We imposed the restriction that the do/ado code be identical for Stata/MP and Stata/SE.)
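
A minimal sketch of a loop that is sequential in this sense, with a hypothetical local macro `total` chosen only for illustration: each pass reads the value the previous pass stored, so the iterations cannot be reordered or run at the same time.

```stata
* Each iteration reads the value of `total' written by the previous one,
* so the loop has to run in order.
local total = 0
forvalues i = 1/5 {
    local total = `total' + `i'
}
display `total'
```

Nothing in the loop syntax itself says whether such a dependency exists, which is why a marker in the do/ado code would be needed.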

Unfortunately, if your task spends most of its time moving data back and
forth between an SQL server and Stata, it might not benefit much from Stata/MP.

Some parts of -merge- and some -egen- commands can be parallelized. But we feel the bottleneck of -merge- is disk I/O rather than Stata's internal processing. To the extent that an -egen- command uses -generate- and -replace-, it is already parallelized. For some group-level -egen- commands, a higher degree of parallelization can be obtained by using the equivalent -by group: generate...- commands.
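
As an illustration of that last point, here is a sketch of a group total computed both ways. It assumes the auto dataset shipped with Stata, and the variable names `total_egen` and `total_gen` are made up for the example. Whether the second form actually runs faster under Stata/MP would need to be timed on your own data.

```stata
sysuse auto, clear

* Group total via -egen-
bysort foreign: egen double total_egen = total(price)

* Same group total via -by: generate- and -by: replace-:
* a running sum within each group, then copy the group's last value
bysort foreign: generate double total_gen = sum(price)
by foreign: replace total_gen = total_gen[_N]

* The two variables should agree in every observation
assert total_egen == total_gen
```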

--Hua
[email protected]




