Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[no subject]
For one fund, suppose we take 200506 Week 1 through 200512 Week 26 as
Variable 1 (with 52 observations) and 200606 Week 1 through 200612
Week 26 as Variable 2. We would obtain the sample quantiles for
Variable 1 by putting its 52 values in numerical order from smallest
(min1) to largest (max1). Separately putting the 52 values of
Variable 2 in order would produce min2, ..., max2. The empirical
quantile-quantile plot (Q-Q plot) is based on the 52 points (min1,
min2), ..., (max1, max2). If the two samples come from distributions
that have the same shape, the pattern in the plot will resemble a
straight line. (By "shape" I mean a family of distributions whose
members differ only in location and scale. For example, the normal
distributions are a single shape.)
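
As a rough illustration of the straight-line claim, here is a minimal Python sketch. Python is not part of the original discussion; the function name sorted_pairs and all the numbers are assumptions made up for illustration. It builds a second sample as a location-scale transformation of the first (in shuffled order) and checks that the sorted pairs fall on a line.

```python
import random

def sorted_pairs(sample1, sample2):
    """Pair the sorted values of two equal-sized samples: (min1, min2), ..., (max1, max2)."""
    if len(sample1) != len(sample2):
        raise ValueError("this simple sketch assumes equal sample sizes")
    return list(zip(sorted(sample1), sorted(sample2)))

# Hypothetical data: 52 made-up weekly returns, not taken from the thread.
rng = random.Random(0)
var1 = [rng.gauss(0.0, 0.02) for _ in range(52)]

# Same shape, different location and scale: var2 = 0.01 + 1.5 * var1, in shuffled order.
var2 = [0.01 + 1.5 * v for v in var1]
rng.shuffle(var2)

pairs = sorted_pairs(var1, var2)

# Every point should satisfy y = 0.01 + 1.5 * x up to rounding error.
max_dev = max(abs(y - (0.01 + 1.5 * x)) for x, y in pairs)
print(len(pairs), max_dev < 1e-12)
```

With real returns the points will only approximate a line, and systematic curvature would suggest that the two periods differ in shape.
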
In statistical language, we are working with the order statistics of
the two samples. To simplify the notation, call the variables x and
y, with values x1, x2, ..., xn and y1, y2, ..., yn in the order in
which they were collected. The order statistics are denoted by x(1)
<= x(2) <= ... <= x(n) and y(1) <= y(2) <= ... <= y(n). Then the
points in the Q-Q plot are (x(1), y(1)), (x(2), y(2)), ..., (x(n),
y(n)).
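
The notation can be made concrete with a tiny worked example, sketched here in Python under the assumption of equal sample sizes. The helper names order_statistics and qq_points and the four values per sample are hypothetical, not from the thread.

```python
def order_statistics(values):
    """Return x(1) <= x(2) <= ... <= x(n) for values given in collection order."""
    return sorted(values)

def qq_points(x, y):
    """Return the Q-Q plot points (x(i), y(i)), i = 1..n, for samples of equal size n."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes both samples have the same size n")
    return list(zip(order_statistics(x), order_statistics(y)))

# Made-up values in the order collected (n = 4); not data from the thread.
x = [0.03, -0.02, 0.01, -0.01]
y = [-0.04, 0.02, 0.06, 0.00]

for i, (xi, yi) in enumerate(qq_points(x, y), start=1):
    print(f"i={i}: (x({i}), y({i})) = ({xi}, {yi})")
```

Note that the pairing is by rank, not by collection order: x1 = 0.03 is the largest x, so it is plotted against the largest y (0.06), not against y1.
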
The basic reference for Q-Q plots is the paper by Wilk and Gnanadesikan (1968).
David Hoaglin
Wilk MB, Gnanadesikan R (1968). Probability plotting methods for the
analysis of data. Biometrika 55:1-17.
On Sun, Apr 14, 2013 at 10:40 PM, 梦佳 <[email protected]> wrote:
> Dear David,
>
> Thank you very much for the advice. I spent some time understanding the terms and interpreting the normal probability plot, and I also tried the "distribution dot plot", which seems more familiar to me. These make it much easier to see the skewness and kurtosis in the different tails. Thank you again!
>
> With regard to the "quantile-quantile plot", I'm confused about what should be entered as "Quantiles variable 1" and "Quantiles variable 2", since my dataset looks like the following. There are 299 funds, and the years run from 2005 to 2010 on a semi-annual basis. How should I understand the "pairs" you mentioned in your previous response?
>
> Fund      | Year    Week    Return      female
> ----------+------------------------------------
> 000011.OF   200506  Week1   -.01595214  0
> ...
> 000011.OF   200506  Week26  -.02965235  0
> 000011.OF   200512  Week1   -.01595214  0
> ...
> 000011.OF   201012  Week26   .00202634  0
> 000021.OF   200506  Week1    .03485255  1
> ...
> 690003.OF   201012  Week26   .02142162  0
>
> Thank you for your time and precious advice,
>
> Mengjia