Re: st: re: building a 'dream' stata desktop setup
Thank you for the very interesting edification. I did not realize how
large the gap is between what Intel's recently released software and
roadmap deliver and what it really takes to take advantage of multiple
cores efficiently. Just for the record, I happily paid for Stata/MP for
2 cores, and I will have the same attitude if more cores land in front
of me. One more question: will you be taking advantage of graphics
processors? I've read that they are another way to speed up computation
on most computers these days.
-Dave
On Jul 8, 2008, at 1:41 PM, William Gould, StataCorp LP wrote:
David Airey <[email protected]> wrote,
[...] Intel has now recommended programmers prepare their code for more
cores than currently on the market or imaginable (i.e., 100s to 1000s).
What are we going to pay for Stata then? Clearly, Stata is charging
more because they can and those who buy 8 core machines have money in
their pockets. When it is the norm to have a larger number of cores,
prices will not be by the core, or no one will buy Stata.
I suspect David is imagining that all that was required to produce
Stata/MP was recompiling Stata by specifying a compiler option and then
selling the product. If that were the case, I would agree with David.
That is not what we did. Stata/MP was a major rewrite of Stata, the
purpose of which was to work directly with the multiple cores. This
involved not just parallelizing code, but deciding where and how deeply
to parallelize, and rewriting computation algorithms to be amenable to
parallelization. Stata/MP was a major effort and it still is. Multiple
developers work full time parallelizing more and more of Stata.
In fact, nowadays one could produce a multiprocessor product simply by
compiling single-processor code using one of the sophisticated compilers
released in the last few months. The latest Intel compiler has just such
a feature, and as a result, we may be about to see programs, including
statistical packages, that run on "all the cores".
The problem is, such automatic techniques for producing parallel
software do not work nearly as well as custom coding efforts such as
those performed for Stata/MP.
Here's a table:
              ----------------- run time ----------------
                         - Stata/MP -    Automatic method
Processors    Perfect    MP-A    MP-E    Alt. 1    Alt. 2
---------------------------------------------------------
         1       1.00    1.00    1.00      1.00      1.00
         2        .50     .72     .57       .94       .87
         4        .25     .50     .35       .90       .81
         8       .125     .42     .24       .89       .77
        40       .025     .35     .15       .87       .75
       400       .003     .33     .13       .87       .74
     4,000      .0003     .33     .13       .87       .74
---------------------------------------------------------
Note: Parallelizable regions are 100% for Perfect, 66.6% for MP-A,
87% for MP-E, 13% for Alt. 1, and 26% for Alt. 2.
Numbers for Stata/MP are based on actual measurement. MP-A
reports results for all Stata commands. MP-E reports
results for all estimation commands.
Alt. 1 is a generous estimate of what can be achieved by
automatic compiler methods today.
Alt. 2 is a generous estimate of what may be achievable by
automatic compiler methods in the future.
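The Perfect, Alt. 1, and Alt. 2 columns appear consistent with Amdahl's
law, in which relative run time on n processors is (1 - p) + p/n for a
parallelizable fraction p. The post does not name that law, so treat the
following as a sketch under that assumption; the function and variable
names are invented for illustration. The Stata/MP columns are measured
values and need not match the formula exactly.

```python
def amdahl_run_time(parallel_fraction, processors):
    """Relative run time (1 processor = 1.00), assuming Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction / processors


# Parallelizable fractions taken from the table note.
FRACTIONS = {"Perfect": 1.00, "Alt. 1": 0.13, "Alt. 2": 0.26}
PROCESSORS = [1, 2, 4, 8, 40, 400, 4000]


def print_table():
    """Print the model's run times, one row per processor count."""
    print("Processors  " + "  ".join(f"{name:>8}" for name in FRACTIONS))
    for n in PROCESSORS:
        cells = "  ".join(
            f"{amdahl_run_time(p, n):8.4f}" for p in FRACTIONS.values()
        )
        print(f"{n:>10}  {cells}")


if __name__ == "__main__":
    print_table()
```

Under this model run time can never fall below the serial fraction
1 - p, however many processors are added, which is why the Alt. 1 and
Alt. 2 columns flatten out at .87 and .74.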
Alternatives 1 and 2 above are admittedly made up, but they have been
made up generously. Alternative 1, for instance, is supposed to be what
is achievable by today's compilers, yet using the current Intel
compiler, we cannot achieve such results. The results reported in the
Alternative 2 column are about twice as good as we think are
theoretically possible with automated methods.
The numbers in the Stata/MP column are overall observed averages with
an extrapolation to 400 and 4,000 processors.
I admit I am in the process of setting up a straw man and knocking him
over. I am setting up the straw man because I suspect the "specify the
option and recompile" model is, unconsciously, the underlying assumption
in everyone's mind when first thinking about this issue.
So let's understand the implications of the table. Stata/MP running
on two cores (run time .72 or less) produces better performance than
either automatic alternative running even on 4,000 cores (.87 and .74).
Stata/MP on four cores does even better, and indeed we are charging you
for that.
David is right when he states, "Stata is charging more because they
can and those who buy 8 core machines have money in their pockets". I
would say it differently, of course. I would say that Stata with 4
cores produces a lot more performance than Stata with 1 or 2 cores, and
so the price is justified.
In part, the price is justified because making parallel algorithms work
efficiently on more than two cores requires a surprising amount of
extra work. The problem is, you don't necessarily want to run on all
of them because the setup costs could be too great. Instead, you must
develop a subsystem that decides problem-by-problem, based on current
conditions, exactly how many processors should be used for each little
piece of the calculation.
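The post does not say how Stata/MP's subsystem makes that decision. As
a toy illustration only, the sketch below assumes a made-up cost model
in which each additional core adds a fixed setup cost while the
parallel work divides evenly; the function names and the cost model are
hypothetical, not StataCorp's method.

```python
def estimated_time(parallel_work, setup_cost_per_core, cores):
    """Hypothetical cost model: work splits evenly across cores, and
    each core beyond the first adds a fixed setup cost."""
    return parallel_work / cores + setup_cost_per_core * (cores - 1)


def choose_cores(parallel_work, setup_cost_per_core, available_cores):
    """Pick the core count with the lowest estimated time.

    Ties go to the smaller core count because min() returns the first
    minimum it encounters.
    """
    return min(
        range(1, available_cores + 1),
        key=lambda n: estimated_time(parallel_work, setup_cost_per_core, n),
    )


if __name__ == "__main__":
    # Small, medium, and large pieces of work on an 8-core machine.
    for work in (0.01, 1.0, 100.0):
        print(work, choose_cores(work, 0.04, 8))
```

Under this toy model a tiny calculation stays on one core, a large one
uses every core, and intermediate ones land in between, which is the
kind of problem-by-problem decision described above.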
Nonetheless, David would be absolutely correct to say that StataCorp
chose to charge more for 4-core Stata, relative to 2-core Stata, than
costs could justify. That's always the case with software: the cost of
development is an up-front cost and afterwards, prices are set to
spread those costs (and profits) in ways that seem equitable.
-- Bill
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/