Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[no subject]

For one fund, suppose we take 200506 Week 1 through 200512 Week 26 as
Variable 1 (with 52 observations) and 200606 Week 1 through 200612
Week 26 as Variable 2.  We would obtain the sample quantiles for
Variable 1 by putting its 52 values in numerical order from smallest
(min1) to largest (max1).  Separately putting the 52 values of
Variable 2 in order would produce min2, ..., max2.  The empirical
quantile-quantile plot (Q-Q plot) is based on the 52 points (min1,
min2), ..., (max1, max2).  If the two samples come from distributions
that have the same shape, the pattern in the plot will resemble a
straight line.  (By "shape" I mean a family of distributions whose
members differ only in location and scale.  For example, the normal
distributions are a single shape.)
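
The straight-line claim can be checked with a small sketch. It is written in Python rather than Stata purely for illustration, and the data are simulated, not the fund returns from the question. The names returns_2005, returns_2006, shift, and scale are assumptions made up for this example. The second sample is built as an exact location-scale transform of the first, so the two samples have the same shape by construction.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical data: 52 weekly returns for one fund in the first year.
returns_2005 = [random.gauss(0.0, 0.02) for _ in range(52)]

# Second year: same shape, different location and scale (assumed values),
# with the weeks shuffled so that collection order carries no information.
shift, scale = 0.01, 1.5
returns_2006 = [shift + scale * r for r in random.sample(returns_2005, 52)]

# Put each sample in numerical order separately, then pair smallest with
# smallest, second smallest with second smallest, and so on.
q1 = sorted(returns_2005)
q2 = sorted(returns_2006)
qq_points = list(zip(q1, q2))

# Largest vertical distance of any point from the line q2 = shift + scale * q1.
max_dev = max(abs(b - (shift + scale * a)) for a, b in qq_points)
```

Here the points fall on the line because the second sample was constructed from the first. With two real samples from distributions of the same shape, the points would only scatter around a straight line.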

In statistical language, we are working with the order statistics of
the two samples.  To simplify the notation, call the variables x and
y, with values x1, x2, ..., xn and y1, y2, ..., yn in the order in
which they were collected.  The order statistics are denoted by x(1)
<= x(2) <= ... <= x(n) and y(1) <= y(2) <= ... <= y(n).  Then the
points in the Q-Q plot are (x(1), y(1)), (x(2), y(2)), ..., (x(n),
y(n)).
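
As a minimal sketch of that pairing (in Python, for illustration only; the function name qq_points and the four values in each sample are made up, and the sketch assumes both samples have the same size n):

```python
def qq_points(x, y):
    """Pair the order statistics of two equal-sized samples."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes both samples have the same size n")
    # sorted(x) gives x(1) <= ... <= x(n); sorted(y) gives y(1) <= ... <= y(n).
    return list(zip(sorted(x), sorted(y)))


# Made-up values, listed in the order in which they were collected.
x = [0.03, -0.01, 0.02, -0.04]
y = [-0.02, 0.05, 0.01, 0.00]

points = qq_points(x, y)
# points == [(-0.04, -0.02), (-0.01, 0.0), (0.02, 0.01), (0.03, 0.05)]
```

Note that x1 = 0.03 was collected alongside y1 = -0.02, but in the plot it is paired with 0.05, because both are the largest values in their samples. The collection order plays no role.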

The basic reference for Q-Q plots is the paper by Wilk and Gnanadesikan (1968).

David Hoaglin

Wilk MB, Gnanadesikan R (1968). Probability plotting methods for the
analysis of data.  Biometrika 55:1-17.

On Sun, Apr 14, 2013 at 10:40 PM, Mengjia <[email protected]> wrote:
> Dear David,
>
> Thank you very much for the advice. I spent some time understanding the terms and interpreting the normal probability plot, and I also tried the "distribution dot plot", which seems more familiar to me. Both show the skewness and kurtosis in the different tails more directly. So thank you again!
>
> With regard to the "quantile-quantile plot", I'm unsure what should be entered as "Quantiles variable 1" and "Quantiles variable 2", since my dataset looks like the following. There are 299 funds, and the data run from 2005 to 2010 on a semi-annual basis. How should I understand the "pairs" you mentioned in your previous response?
>
> Fund        Year    Week    Return      female
> -----------------------------------------------
> 000011.OF   200506  Week1   -.01595214  0
> ...
> 000011.OF   200506  Week26  -.02965235  0
> 000011.OF   200512  Week1   -.01595214  0
> ...
> 000011.OF   201012  Week26   .00202634  0
> 000021.OF   200506  Week1    .03485255  1
> ...
> 690003.OF   201012  Week26   .02142162  0
>
> Thank you for your time and precious advice,
>
> Mengjia
