Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: st: qnorm
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: qnorm
Date
Mon, 5 Mar 2012 16:59:40 +0000
I pushed this a bit further. This is just a helper program that calculates variables defining an envelope; it deliberately stops short of the graphics, which -qplot- (SJ) can do for you. A help file will follow in due course.
*! 1.0.0 NJC 5 March 2012
program qnormenv
version 9
syntax varname(numeric) [if] [in], GENerate(str) [ reps(int 100) level(int 95) ]
marksample touse
qui count if `touse'
if r(N) == 0 error 2000
tokenize "`generate'"
if "`2'" == "" | "`3'" != "" {
di as err "two names required in generate()"
exit 198
}
confirm new var `generate'
mata : _qnormenv("`varlist'", "`touse'", "`generate'", `reps', `level')
end
mata:
void _qnormenv(
string scalar varname,
string scalar tousename,
string rowvector newnames,
real scalar reps,
real scalar level)
{
real matrix compare
real colvector y
real scalar mean, sd, n, l1, l2, u1, u2
y = st_view(., varname, tousename)
mean = mean(y)
sd = sqrt(variance(y))
n = rows(y)
compare = J(n, 0, .)
for (j = 1; j <= reps; j++) {
compare = compare, sort(rnormal(n, 1, mean, sd), 1)
}
level1 = (100 - level) / 200
level2 = (100 + level) / 200
l1 = floor(reps * level1)
l2 = ceil(reps * level1)
u1 = floor(reps * level2)
u2 = ceil(reps * level2)
envelope = J(n, 2, .)
for (i = 1; i <= n; i++) {
x = sort(compare[i,]', 1)
envelope[i,] = ((x[l1] + x[l2])/2, (x[u1] + x[u2])/2)
}
newnames = tokens(newnames)
(void) st_addvar("float", newnames)
st_store(., newnames, tousename, envelope)
}
end
Nick
[email protected]
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: 05 March 2012 15:24
To: '[email protected]'
Subject: RE: st: qnorm
The approach in Maarten's program is to generate a number of random samples and show the lot as replicates.
Here as an alternative is some example code for individual 95% confidence intervals for each plotted point. -qplot- used at the end is from SJ. The code isn't smart about missing values, but it could easily be made smarter. I also guess the code could be shortened in the middle.
sysuse auto, clear
mata
y = sort(st_data(., "mpg"), 1)
mean = mean(y)
sd = sqrt(variance(y))
n = rows(y)
compare = J(n, 0, .)
for (j = 1; j <= 100; j++) {
compare = compare, sort(rnormal(n, 1, mean, sd), 1)
}
envelope = J(n, 2, .)
for (i = 1; i <= n; i++) {
x = sort(compare[i,]', 1)
envelope[i,] = ((x[2] + x[3])/2, (x[97] + x[98])/2)
}
names = tokens("_lower _upper")
(void) st_addvar("float", names)
st_store(., names, envelope)
end
qplot mpg _lower _upper, ms(O i i) c(. J J) legend(off) ytitle("`: var label mpg'")
Nick
[email protected]
Maarten Buis
On Mon, Mar 5, 2012 at 10:03 AM, Nick Cox wrote:
> 1. Simulate several samples from a distribution with the same mean and
> standard deviation (or more generally an appropriate mean and standard
> deviation) and use the resulting portfolio of plots in assessing what
> kind of variability is to be expected.
An easy way to do so is to use the -margdistfit- package, which you
can install by typing in Stata -ssc install margdistfit-. The default
is actually to first sample the mean and the standard deviation from
its sampling distribution and than sample a new variable with those
sampled means and standard deviation. I suspect that this makes sense
in most cases, though I also suspect that it won't matter much. If you
want to do exactly what Nick proposes you can add the -noparsamp-
option.
Here is an example of what such a graph would look like:
*-------- begin example -----------
sysuse auto, clear
reg mpg
margdistfit, qq name(qq)
margdistfit, pp name(pp)
margdistfit, hangroot name(hangr)
margdistfit, cumul name(cumul)
*--------- end example ------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/