On 3/6/07, Austin Nichols <[email protected]> wrote:
> The Pareto distribution is typically defined by the cdf F(x;a) = 1 -
> x^(-a) where a>0 for x>=1 and zero elsewhere, and the pdf f(x;a) =
> ax^(-a-1) for x>=1 and zero elsewhere. A version with two parameters
> is given, for x>=k with k>0, by F(x;a,k) = 1-(x/k)^(-a) and
> f(x;a,k) = (a/k)(x/k)^(-a-1) = a(k)^(a)(x)^(-a-1).
>
> On a log-log plot, the density function for the Pareto distribution is
> a straight line:
> ln f(x) = (−a − 1) ln x + a ln k + ln a.
>
> This suggests a means for estimating parameters a and k by
> constructing kernel density estimates of f(x), and regressing
> ln(\hat{f}(x)) on ln(x). Standard errors could presumably be obtained
> via bootstrap.
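Well, I guess it would be easier to use ln(1-F(x)) = -a ln x + a ln k,
which is directly estimable by standard linear regression once F is
replaced by the empirical cdf... possibly with heteroskedasticity-robust
standard errors if one wished :)). The slope estimates -a, and k can be
backed out of the intercept as exp(intercept/a). Note that the
regularity conditions are not satisfied for k (the support of the
distribution depends on it), so its estimate is likely to be quirky.
To test for Pareto-ness of the distribution, add (ln x)^2 to that
regression and check whether its coefficient differs from zero.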
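
A rough sketch of that regression, in plain Python rather than Stata,
run on simulated data. Everything named here is my own illustration:
the function names, the seed, the true values a=2 and k=1.5, and the
(i+0.5)/n plotting position used for the empirical cdf so that
ln(1-F) stays finite at the largest observation. It gives point
estimates only; no standard errors are computed.

```python
import math
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented p x (p+1) matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for j in range(p):
        piv = max(range(j, p), key=lambda r: abs(A[r][j]))
        A[j], A[piv] = A[piv], A[j]
        for r in range(p):
            if r != j:
                f = A[r][j] / A[j][j]
                A[r] = [A[r][c] - f * A[j][c] for c in range(p + 1)]
    return [A[r][p] / A[r][r] for r in range(p)]


def pareto_loglog_fit(x):
    """Regress ln(1 - F_hat) on ln x; also return the (ln x)^2 coefficient."""
    xs = sorted(x)
    n = len(xs)
    lnx = [math.log(v) for v in xs]
    # Empirical survival function at the i-th order statistic.
    # The (i + 0.5)/n plotting position is an assumption of this sketch.
    lns = [math.log(1.0 - (i + 0.5) / n) for i in range(n)]
    b0, b1 = ols([lnx], lns)              # ln(1-F) = a ln k - a ln x
    a_hat = -b1
    k_hat = math.exp(b0 / a_hat)
    # Augmented regression: a nonzero coefficient on (ln x)^2 speaks against Pareto.
    _, _, curvature = ols([lnx, [t * t for t in lnx]], lns)
    return a_hat, k_hat, curvature


random.seed(12345)
a_true, k_true = 2.0, 1.5
# Inverse-cdf draw: x = k * u^(-1/a), with u in (0, 1].
sample = [k_true * (1.0 - random.random()) ** (-1.0 / a_true) for _ in range(5000)]
a_hat, k_hat, curvature = pareto_loglog_fit(sample)
print("a_hat =", round(a_hat, 3), " k_hat =", round(k_hat, 3),
      " coef on (ln x)^2 =", round(curvature, 4))
```

On Pareto data the coefficient on (ln x)^2 should come out close to
zero. Deciding whether it is significantly different from zero needs
standard errors that this sketch does not produce, and the ordered
observations are not independent, so the usual OLS ones would need
care anyway.
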
To test for log-normality, you can construct a Q-Q plot of ln x against
normal quantiles and see whether the points fall on a straight line.
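
The Q-Q check in logs can be sketched the same way, again in plain
Python on simulated data. The function names, the seed, the sample
sizes and the (i+0.5)/n plotting position are my assumptions, and the
correlation between the two sets of quantiles is used here only as a
crude numerical stand-in for eyeballing the plot, not as a formal
test.

```python
import math
import random
from statistics import NormalDist


def log_qq_points(x):
    """Return (theoretical normal quantiles, sorted ln x) for a Q-Q plot in logs."""
    logs = sorted(math.log(v) for v in x)
    n = len(logs)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    # Plotting positions (i + 0.5)/n are an assumption of this sketch.
    theo = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theo, logs


def correlation(u, v):
    """Pearson correlation; close to 1 when the Q-Q points lie on a line."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


random.seed(2007)
n = 2000
lognormal_sample = [random.lognormvariate(0.0, 1.0) for _ in range(n)]
# Pareto(a=2, k=1) draws by inverse cdf, for contrast.
pareto_sample = [(1.0 - random.random()) ** (-1.0 / 2.0) for _ in range(n)]

r_lognormal = correlation(*log_qq_points(lognormal_sample))
r_pareto = correlation(*log_qq_points(pareto_sample))
print("Q-Q correlation, lognormal data:", round(r_lognormal, 4))
print("Q-Q correlation, Pareto data:   ", round(r_pareto, 4))
```

For lognormal data the points should hug a straight line, so the
correlation sits very near 1. For Pareto data ln x is exponentially
distributed, the plot bends, and the correlation is visibly lower.
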
--
Stas Kolenikov
http://stas.kolenikov.name