Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: faster xtiling
From: Maarten Buis <[email protected]>
To: [email protected]
Subject: Re: st: faster xtiling
Date: Fri, 7 Sep 2012 17:50:15 +0200
On Fri, Sep 7, 2012 at 5:04 PM, László Sándor wrote:
> I am trying to speed up -xtile- for Stata 11 and above for all
> platforms (for internal use) used with tens of millions of
> observations.
>
> I checked the source of -xtile-, and I am not sure I understand the
> purpose of all of it. Most importantly, it does sort the data (a no-no with
> data the size of mine), even though the crucial step of _pctile does
> not need presorted data.
The sorting only happens if you ask for more than 1,001 quantiles,
which suggests to me that some limitation in _pctile makes it
necessary. If it were just laziness or sloppiness, then it is
extremely unlikely that the code would have been written that way.
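
To make the no-sort path concrete, here is a minimal sketch of how quantile categories can be assigned from -_pctile- cutpoints alone when the number of quantiles is small. It is an illustration under stated assumptions, not the actual -xtile- source: the auto dataset, the variable names decile and decile_check, and the scalar names cut1, cut2, ... are all made up for the example.

```stata
* Sketch only: quantile categories from _pctile cutpoints, without sorting.
* Assumes far fewer than 1,001 quantiles; all names here are illustrative.
sysuse auto, clear
local nq 10
local nqm1 = `nq' - 1

* _pctile leaves the cutpoints in r(r1), r(r2), ...
_pctile price, nquantiles(`nq')

* Copy the cutpoints to scalars before running any other command.
forvalues i = 1/`nqm1' {
    scalar cut`i' = r(r`i')
}

* Category = 1 + number of cutpoints strictly below the value.
generate byte decile = 1 if !missing(price)
forvalues i = 1/`nqm1' {
    quietly replace decile = `i' + 1 if price > scalar(cut`i') & !missing(price)
}

* Cross-tabulate against -xtile- to compare the two assignments.
xtile decile_check = price, nquantiles(`nq')
tabulate decile decile_check
```

Each -replace- is one pass over the data, so this costs one pass per cutpoint rather than a sort. Whether that is faster on tens of millions of observations would have to be timed on the data in question.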
> And while I am at it, I am also happy to hear comments about the
> prospects of using Mata for any of this. _pctile is built-in,
> optimized, tailored, tweaked, polished C code, so there is little hope
> that Mata might improve the crucial steps, right?
As to the properties of -_pctile-, only StataCorp can say anything about
that, as we cannot see its source any more than you can.
-- Maarten
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------