Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: distribution test
From
Maarten Buis <[email protected]>
To
[email protected]
Subject
Re: st: distribution test
Date
Tue, 30 Aug 2011 12:19:55 +0200
On Tue, Aug 30, 2011 at 11:13 AM, Nick Cox wrote:
> There is a direct method to check for fit to an exponential
> distribution: a quantile-quantile plot. See -qexp- (SSC) and/or
>
> SJ-7-2 gr0027 . . Stata tip 47: Quantile-quantile plots without programming
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> Q2/07 SJ 7(2):275--279 (no commands)
> tip on producing various quantile-quantile (Q-Q) plots
>
> on how to do it for yourself. The latter is directly accessible at
I like to use estimated coefficients for computing the values on the
x-axis, as that way the graph should be square and the reference line
is the 45 degree line. Moreover,I like to supplement such a
quantile-quantile plot with several random draws from the assumed
distribution. This gives me an idea of how much deviation from the
theoretical distribution I might reasonably expect. Below is an
example of how I would do that.
*--------------------- begin example --------------------
// create some expnential data
local lambda = 2
drop _all
set obs 500
gen y = -1/`lambda'*ln(1-runiform())
label var y "observed"
// estimate parameter
sum y, meanonly
local lambdahat = 1/r(mean)
di as txt "ML estimate of lambda is: " ///
as result `lambdahat'
// will use that later for computing the range of the graph
local max = r(max)
// As discussed in: Nicholas J. Cox (2007) Stata tip 47:
// Quantile-quantile plots without programming. The Stata
// Journal, 7(2): 275--279.
egen rank = rank(y)
egen n = count(y)
gen pp = (rank - 0.5) / n
gen exponential = -1/`lambdahat'*ln(1 - pp)
// will use that to compute the range of the reference line
sum exponential, meanonly
local reflinerange "range(0 `r(max)')"
// create 20 random variables assuming the model is correct
forvalues i = 1/20 {
gen y`i' = -1/`lambdahat'*ln(1-runiform())
egen rank`i' = rank(y`i')
egen n`i' = count(y`i')
gen pp`i' = (rank`i' - 0.5) / n`i'
gen exponential`i' = -1/`lambdahat'*ln(1 - pp`i')
drop rank`i' n`i' pp`i'
#delim ;
local gr `"`gr' || line y`i' exponential`i',
sort lstyle(solid) lcolor(gs12)"' ;
#delim cr
sum y`i', meanonly
local max = max(`r(max)', `max')
}
// compute nice axis labels
_natscale 0 `max' 5
local lab "lab(`r(min)'(`r(delta)')`r(max)')"
// make sure the graph is square
local range "scale(range(0 `max'))"
// use var label for y-axis title when present
if `"`: var label y'"' != "" {
local ytitle `"ytitle(`"`: var label y'"')"'
}
else {
local ytitle `"ytitle(y)"'
}
// create the graph
twoway `gr' || ///
scatter y exponential, ///
y`lab' x`lab' y`range' x`range' ///
aspect(1) msymbol(oh) || ///
function reference = x, ///
`reflinerange' lstyle(solid) ///
legend(order( 1 "samples" 21 22 )) ///
xtitle(exponential distribution) ///
`ytitle'
*---------------------- end example ---------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/