Re: st: how to parallelize Mata (or steal the performance of built-in -tab, summarize-)
From: Nick Cox <[email protected]>
To: [email protected]
Date: Tue, 3 Apr 2012 10:01:48 +0100

Overnight I remembered -binsm-

SJ-6-1  gr26_1  . . . . . . . . . . . . . . . . . .  Software update for binsm
        (help binsm if installed) . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/06   SJ 6(1):151
        rewritten to support modern Stata graphics

STB-37  gr26  . . . . . . . . . . .  Bin smoothing and summary on scatter plots
        (help binsm if installed) . . . . . . . . . . . . . . . . .  N. J. Cox
        5/97    pp.9--12; STB Reprints Vol 7, pp.59--63
        alternative to graph, twoway bands(); produces a scatterplot
        of yvar against xvar with one or more summaries of yvar for bins
        of xvar

and -twoway__histogram_gen-

SJ-5-2  gr0014  . . . . . . .  Stata tip 20: Generating histogram bin variables
        . . . . . . . . . . . . . . . . . . . . . . . . . . . .  D. A. Harrison
        Q2/05   SJ 5(2):280--281                                 (no commands)
        tip illustrating the use of twoway__histogram_gen for
        creation of complex histograms and other graphs or tables

My strategic advice is this. You want a reduced dataset for graphing,
so -drop- aggressively. Once you have identified observations "to
use", go

    keep if `touse'
    drop `touse'

Once the mean is in the last observation of every block of
observations, -drop- all the others.
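
A minimal sketch of that strategy, for illustration only. The simulated
data and the variable names (x, y, bin, touse, xmean, ymean) are
assumptions and do not come from the code in this thread; in an ado-file
the marker would be a temporary variable such as `touse'.

```stata
* Sketch only: simulated data and all variable names are assumptions.
clear
set seed 12345
set obs 1000
generate double x = runiform()
generate double y = 2*x + rnormal()
generate byte bin = ceil(10*x)            // ten equal-width bins of x

* Identify the observations to use, then drop everything else at once.
generate byte touse = !missing(x, y)
keep if touse
drop touse

* The running sum divided by _N equals the bin mean in the last
* observation of each block (and only there).
bysort bin: generate double xmean = sum(x) / _N
by bin: generate double ymean = sum(y) / _N

* Keep one observation per bin: the one that holds the means.
by bin: keep if _n == _N

list bin xmean ymean
```

With weights, the same idea should carry over by dividing the running
sum of weight times value by the running sum of the weights. What is
left is one observation per bin, ready for -scatter ymean xmean-, with
no -preserve- and -restore- cycle.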
2012/4/3 László Sándor <[email protected]>:
> Thanks for this, Nick.
>
> I found the (many and embarrassing) mistakes in my code; below is a
> neater version that also actually does what it should, or so it seems.
>
> That said, it is still rarely faster than logging -tab, sum()- though
> with many millions of observations, running on many (>4) cores, it at
> least has a little advantage. (But both beat my bare bones Mata
> attempts.)
>
> I would still be curious how secret StataCorp's secret sauce is here,
> as this kind of "collapsing" is commonplace for many descriptives
> (also bar graphs, line graphs, etc.). While they are rightly proud of
> having tuned -tabulate- to run this fast, they could perhaps let us
> (and themselves) work towards making other similar code run faster
> too. Though, of course, there must be a reason (general purpose, etc.)
> why this is harder elsewhere.
>
> Thanks again,
>
> Laszlo
>
>     tempvar wsum tag
>
>     if ("`y2_var'"!="") local y2 y2
>     else local y2 ""
>
>     sort `x_q' `touse'
>     by `x_q' `touse': g byte `tag' = _n == _N
>     if ("`weight1'"!="") by `x_q' `touse': g `wsum' = sum(`weight1')
>     else by `x_q' `touse': g `wsum' = _N
>
>     foreach v in x y `y2' {
>         if ("`weight1'"!="") by `x_q' `touse': g ``v'_mean' = sum(``v'_r'*`weight1')
>         else by `x_q' `touse': g ``v'_mean' = sum(``v'_r')
>
>         quietly replace ``v'_mean' = cond(`tag' & `touse',``v'_mean'/`wsum',.)
>     }
>
> On Mon, Apr 2, 2012 at 6:11 PM, Nick Cox <[email protected]> wrote:
>>
>> I will look at it tomorrow.
>>
>> 2012/4/2 László Sándor <[email protected]>:
>> > Nick,
>> >
>> > Thanks, I did follow up on your post. Sadly, I could not easily get
>> > -by- working or, to be precise, use the variables that it
>> > generated. Below is an attempt. If I can take the liberty of asking
>> > you to parse it, I would be grateful for comments to get it
>> > working -- the indexing must be off. It tries to average two (x_r and
>> > y_r) or three (y2_r extra) variables. It generates values that are
>> > too large for some bins (i.e. from U[0,1] variables some averages
>> > become larger than 20).
>> >
>> > I am happy if someone from StataCorp follows up too! :)
>> >
>> > Thanks,
>> >
>> > László
>> >
>> >     tempvar wsum tag ones
>> >     g byte `ones' = 1
>> >
>> >     if ("`y2_var'"!="") local y2 y2
>> >     else local y2 ""
>> >
>> >     if ("`weight1'"!="") g `wsum' = sum(`weight1') if `touse'
>> >     else g `wsum' = sum(`ones') if `touse'
>> >
>> >     sort `x_q'
>> >     by `x_q': g byte `tag' = _N if `touse'
>> >
>> >     foreach v in x y `y2' {
>> >         if "`weight1'"!=""{
>> >             by `x_q': g ``v'_mean' = sum(``v'_r'*`weight1') if `touse'
>> >             by `x_q': replace ``v'_mean' = ``v'_mean'/`wsum' if `tag' & `touse'
>> >         }
>> >
>> >         else {
>> >             by `x_q': g ``v'_mean' = sum(``v'_r') if `touse'
>> >             by `x_q': replace ``v'_mean' = ``v'_mean'/`wsum' if `tag' & `touse'
>> >         }
>> >     }
>> >
>> >
>> >
>> > On Mon, Apr 2, 2012 at 3:36 PM, Nick Cox <[email protected]> wrote:
>> >>
>> >> We are back to the questions you asked a week ago. Mostly this is for
>> >> StataCorp. Otherwise please see again my answers at
>> >>
>> >> http://www.stata.com/statalist/archive/2012-03/msg01144.html
>> >>
>> >> I've had dramatic speed-ups with Mata -- my record is reducing
>> >> execution time from 5 days to 2 minutes, but that was partly because
>> >> my original code was so dumb -- but I've not tried anything like the
>> >> stuff you were using.
>> >>
>> >> -tabulate, summarize- is compiled C code. I think the nearest you can
>> >> get is by using -by:- as explained in the post just quoted.
>> >>
>> >> Nick
>> >>
>> >> 2012/4/2 László Sándor <[email protected]>:
>> >> > Hi all,
>> >> >
>> >> > I had several questions recently on this list about compiling Mata
>> >> > code. I still could not generate the compile-time locals with
>> >> > loops, but I typed them out and compiled. I have now done my test
>> >> > runs, and the results are surprising. Let me ask you why.
>> >> >
>> >> > My basic problem was to do a fast "collapse" to make binned scatter
>> >> > plots. Collapse was unacceptably slow, probably because of the
>> >> > necessary preserve-restore cycles, or inefficient coding of collapse
>> >> > (for its general purpose).
>> >> >
>> >> > I already had a version that parsed a log of -tabulate, summarize-.
>> >> > Yes, it is as much of a hack as it sounds. I was not expecting
>> >> > this to be fast, if only because of the file I/O and the parsing.
>> >> >
>> >> > Now I have built a Mata function that "collapses" into new
>> >> > variables while leaving the data otherwise intact. For this I used
>> >> > Ben Jann's -mf_mm_collapse-, and compiled all the necessary
>> >> > functions myself in the ado file.
>> >> >
>> >> > And the test run with 100 million observations told me it was slower
>> >> > than the hack. Before I give up and declare the hack unbeatable, I
>> >> > have one suspicion. I ran the test on Stata 12 MP on a cluster, with
>> >> > 12 cores. Perhaps -tabulate- used all of them, and my code did not.
>> >> >
>> >> > Are there guidelines on how to speed up Mata in this situation (if
>> >> > it is not MP-aware to begin with)?
>> >> >
>> >> > Or, tentatively, can I ask for some guidance about the magic of
>> >> > -tabulate, summarize-? Is that magic accessible/reproducible without
>> >> > just logging its output?
>> >> >