Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: uniform distribution
From: "PAPANIKOLAOU P." <[email protected]>
To: <[email protected]>
Subject: RE: st: uniform distribution
Date: Sat, 9 Nov 2013 14:31:15 -0000
Dear Niko,
Could you please be kind enough to enlighten me on the issue of
time-series (my data) and its impact on testing that the distribution is
uniform?
I would appreciate your input.
Kind regards
Panos
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nikos
Kakouros
Sent: 09 November 2013 14:15
To: [email protected]
Subject: Re: st: uniform distribution
David,
Thanks! That is a very neat property.
Of course, I had to see it in action... ;-)

set obs 50000
gen nnorm=rnormal(0,1)
gen n_nnorm=normal(nnorm)
histogram n_nnorm

n_nnorm looks pretty uniform ;-)
So if it starts non-uniform, it will end up not quite normal going the
other way around. I wonder, however, whether a test for a departure from
normality for Finv(U) can really accurately test for U's departure
from uniformity. Will the p-values be accurate?
Nick Cox has, of course, in the meantime questioned the entire
applicability of uniform distribution testing given the nature of the
originally presented data (time series).
Many thanks for explaining this nice property!
Nikos
On Sat, Nov 9, 2013 at 8:43 AM, David Hoaglin <[email protected]>
wrote:
> Nikos,
>
> No approximation to the binomial distribution is involved.
>
> The approach uses a basic property of (continuous) probability
> distributions. If X is an observation from a distribution whose
> cumulative distribution function (c.d.f.) is F, then U = F(X) has a
> uniform(0,1) distribution. That is, I am transforming X by using the
> c.d.f. of its own distribution. This holds for any continuous
> distribution, not just the normal distribution.
>
> The reverse of the above process starts with an observation U from
> uniform(0,1) and transforms it by the inverse of the c.d.f. of the
> particular distribution (call it Finv). Then X = Finv(U) is an
> observation from the particular distribution. This is what Fernando
> suggested. Of course, he did not assume that, when compressed onto
> the interval [0,1], mpg would have a uniform distribution. The idea
> is that a departure from uniformity will show up as a departure from
> normality after transforming the uniformized data by invnorm. A
> little problem may arise at the ends of the interval, though:
> theoretically, invnorm(0) = minus infinity and invnorm(1) = infinity.
>
> People often make "probability plots" and handle that problem by using
> "plotting positions" that do not go quite as low as 0 or as high as 1.
> In making a probability plot (or "quantile-quantile plot") for a
> sample of n observations vs. the uniform distribution, I would do the
> following:
> 1. Sort the observations from smallest to largest, index those with i
> = 1 through i = n, and denote them by x(1), ..., x(n).
> 2. Calculate the corresponding plotting positions from the formula
> pp(i) = (i - (1/3))/(n + (1/3)).
> 3. Make a scatterplot of the points (pp(i), x(i)).
> 4. Assess departures from uniformity by comparing the pattern in that
> plot against a straight line.
> 5. To get a feel for how such plots look when the data are actually
> uniform, simulate a number of samples of n from the uniform(0,1)
> distribution and make that plot for each sample.
> (Quantile-quantile plots for non-uniform distributions use the same
> approach. They use Finv(pp(i)) as horizontal coordinate of the plot.)
>
> David Hoaglin
>
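The probability-plot steps David lists above could be sketched in a Stata do-file roughly as follows. This is only an illustration: the variable names x and pp, the seed, and the sample size of 200 are assumptions, and x is simulated stand-in data rather than anything from the thread. The reference line y = x is appropriate only when the data are expected to be uniform on (0,1).

```stata
* Sketch of a uniform probability plot using the plotting positions
* pp(i) = (i - 1/3)/(n + 1/3). Names, seed, and n are illustrative.
clear
set seed 12345
set obs 200
generate x = runiform()               // stand-in data; replace with your own variable

sort x                                // step 1: order the observations
generate pp = (_n - 1/3)/(_N + 1/3)   // step 2: plotting positions, strictly inside (0,1)

* steps 3-4: scatterplot of (pp(i), x(i)) with a straight reference line
twoway (scatter x pp) (function y = x, range(0 1)), ///
    xtitle("Plotting position") ytitle("Ordered data") legend(off)
```

Rerunning the block with different seeds gives a feel for step 5, that is, how much such plots wander around the line when the data really are uniform.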
> On Sat, Nov 9, 2013 at 7:58 AM, Nikos Kakouros <[email protected]>
wrote:
>> Fernando,
>>
>> That seems to work pretty well (did a run below).
>> I'm not entirely sure why it should work though.
>>
>> Is it because the normal distribution in this case works as an
>> approximation to the binomial distribution?
>>
>> Nikos
>>
>>
>>
>> set obs 50000
>> gen test=runiform()
>> sort test
>> histogram test
>> gen n_test=invnormal(test)
>> histogram n_test, normal
>> swilk n_test
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/