Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: approximate quantiles in Stata

From	László Sándor <sandorl@gmail.com>
To	statalist@hsphsun2.harvard.edu
Subject	Re: st: approximate quantiles in Stata
Date	Sat, 24 Aug 2013 07:42:16 -0400

Thanks, David,

My typical use case involves tens of millions of observations. (And
since I think it matters for precision: the typical case needs about
20 bins, i.e., vingtiles.)

FWIW, I also tried to profile -xtile- with the maximum number of
observations possible. With Stata 13 running on 64 cores, it took 7
hours to generate vingtiles.

Laszlo
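
The random-downsampling idea from the original question (quoted below) can be sketched in a few lines. This is an illustration only, written in Python rather than Stata, and not something proposed in the thread: the function names (dkw_sample_size, approx_cutpoints, assign_bin) and the default tolerances are hypothetical. The sample size comes from the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, a technique swapped in here as one possible answer to "how heavily to downsample": it bounds the error in the rank (cumulative fraction) of every cut point by eps with probability at least 1 - delta, and it does not depend on the population size.

```python
import bisect
import math
import random


def dkw_sample_size(eps, delta):
    """Smallest n with 2*exp(-2*n*eps**2) <= delta (DKW bound).

    eps   : tolerated error in the rank (cumulative fraction) of a cut point
    delta : tolerated probability that any rank error exceeds eps
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def approx_cutpoints(values, nbins=20, eps=0.005, delta=0.01, seed=12345):
    """Approximate the nbins-1 quantile cut points from a random subsample.

    Assumption: `values` is an in-memory sequence of comparable numbers.
    The subsample is drawn without replacement and only it gets sorted.
    """
    rng = random.Random(seed)
    n = min(len(values), dkw_sample_size(eps, delta))
    sample = sorted(rng.sample(values, n))
    # Cut point k is the sample value at rank ceil(k * n / nbins).
    return [sample[max(0, math.ceil(k * n / nbins) - 1)]
            for k in range(1, nbins)]


def assign_bin(x, cutpoints):
    """Return a bin number in 1..len(cutpoints)+1.

    A value equal to a cut point goes to the lower bin.
    """
    return bisect.bisect_left(cutpoints, x) + 1
```

With eps = 0.005 and delta = 0.01 the bound asks for roughly 106,000 sampled observations, whether the full data hold ten million rows or a billion. Tightening eps is what costs: halving it quadruples the sample. Note that the guarantee is on ranks, not on the values of the cut points, so heavily tied or lumpy data can still move a cut point noticeably.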

On Fri, Aug 23, 2013 at 9:54 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Hi, Laszlo.
>
> How large are your samples, and which quantiles do you need?
>
> I think I saw some relevant work a number of years ago, and I will
> have to look for it.
>
> David Hoaglin
>
> On Fri, Aug 23, 2013 at 9:01 PM, László Sándor <sandorl@gmail.com> wrote:
>> Hi,
>> My work is slowed down by Stata's precise but computationally
>> intensive quantile calculation. I am curious whether any
>> approximation algorithms have been implemented out there, something
>> along these lines: http://www.prelert.com/blog/q-digest-an-algorithm-for-computing-approximate-quantiles-on-a-collection-of-integers/
>>
>> So this is not about estimating population quantiles from a small
>> sample (see Nick's hdquantile on SSC, e.g.). This is about finding
>> approximate quantiles in large data.
>>
>> If the answer is simply random downsampling before taking quantiles, I
>> would still appreciate some guidance on how heavily to downsample as a
>> function of population size.
>>
>> Thanks!
>>
>> Laszlo
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
