



st: still for Stata 14: a cache of sorted orderings for big data


From   László Sándor <[email protected]>
To   [email protected]
Subject   st: still for Stata 14: a cache of sorted orderings for big data
Date   Thu, 12 Sep 2013 17:38:47 -0400

Following up on the previous note, I think -sort- is just as bad an idea
for big data as a preserve-and-restore cycle. I could imagine an
option that lets Stata keep the last few sort orderings/ranks, even
though that takes some memory. Stata would then check whether the
sorting variables have changed, or whether a different sample
restriction requires a new sort, and otherwise quickly restore the
cached ordering.
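
To make the idea concrete, here is a rough sketch in Python rather than Stata. It is only an illustration of the caching logic, not a description of how Stata works or could be made to work. The class name SortOrderCache, its methods, and the toy data are all hypothetical. The cache keeps the last few permutations, keyed by the sort variables, and re-sorts only when the contents of those variables (or the set of rows, e.g. after a sample restriction) have changed.

```python
from collections import OrderedDict


class SortOrderCache:
    """Hypothetical cache of recent sort orderings (illustration only)."""

    def __init__(self, max_entries=3):
        self.max_entries = max_entries   # how many orderings to keep in memory
        self.sorts_done = 0              # counts real sorts, for demonstration
        self._entries = OrderedDict()    # keys -> (fingerprint, permutation)

    def order(self, data, keys):
        """Return the row permutation that sorts `data` by `keys`."""
        keys = tuple(keys)
        # Fingerprint of the sort variables: it changes if any value changes
        # or if rows are added or dropped. Computing it is one linear pass,
        # cheaper than an n log n sort. (A hash collision is possible in
        # principle; a real implementation would need a stronger check.)
        fingerprint = hash(tuple(tuple(data[k]) for k in keys))
        cached = self._entries.get(keys)
        if cached is not None and cached[0] == fingerprint:
            self._entries.move_to_end(keys)      # mark as recently used
            return cached[1]                     # restore the cached ordering
        n = len(data[keys[0]])
        perm = sorted(range(n), key=lambda i: tuple(data[k][i] for k in keys))
        self.sorts_done += 1
        self._entries[keys] = (fingerprint, perm)
        self._entries.move_to_end(keys)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)    # evict least recently used
        return perm


# Toy data standing in for a dataset with two variables.
data = {"id": [3, 1, 2], "x": [0.5, 0.1, 0.9]}
cache = SortOrderCache()
p1 = cache.order(data, ["id"])   # first request: a real sort
p2 = cache.order(data, ["id"])   # same variables, unchanged: served from cache
data["id"][0] = 0                # a sorting variable changes
p3 = cache.order(data, ["id"])   # fingerprint differs: a new sort is needed
```

In this sketch the second request costs one pass over the sort variables instead of a full sort, at the price of holding one permutation per cached ordering in memory.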

I see why sort helps -by- (or even -tab-, perhaps), and is essential
for other tools like -xtile- or -mkspline-, but it is still a drag on
tens of millions of observations.

Thanks,

Laszlo

