Re: st: Construct Null Datasets through Bootstrap Resampling
From: "Michael Blasnik" <[email protected]>
To: <[email protected]>
Subject: Re: st: Construct Null Datasets through Bootstrap Resampling
Date: Fri, 01 Dec 2006 17:52:08 -0500
Sure, here's a quick way to scramble the observations while keeping groups
of variables together. It works with up to four groups. It's not the
most elegant code, but it works (I think). To specify a group, enclose its
variable list in double quotes.
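program define scramblegrp
    version 9.2
    * Takes up to four quoted variable lists; each list is permuted as a unit.
    * Numeric variables only: strings cannot be copied through hold.
    args grp1 grp2 grp3 grp4
    tempvar hold order rand
    * double, so long and double variables are copied without rounding
    qui gen double `hold'=.
    * record each observation's original row number
    gen long `order'=_n
    foreach grp in "`grp1'" "`grp2'" "`grp3'" "`grp4'" {
        * put the whole dataset in a fresh random order
        qui gen `rand'=uniform()
        sort `rand'
        foreach var of local grp {
            * each row takes the value held in the row that order points to;
            * order is now shuffled, so the group moves relative to the rest
            qui replace `hold'=`var'
            qui replace `var'=`hold'[`order']
        }
        drop `rand'
    }
end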
Here's a quick test using two groups:
sysuse auto
scramblegrp "price mpg" "weight length turn"
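One way to check that a group really stayed together is to compare the sorted price-mpg pairs before and after scrambling. This is only a sketch: it assumes scramblegrp has already been defined as above, the seed is arbitrary, and the tempfile name before is just an illustration. cf exits with an error if any pair has changed.

```stata
sysuse auto, clear
set seed 12345

* save the original price-mpg pairs, sorted, for later comparison
preserve
keep price mpg
sort price mpg
tempfile before
save `before'
restore

scramblegrp "price mpg" "weight length turn"

* after scrambling, the same pairs should still exist, just on different rows
keep price mpg
sort price mpg
cf price mpg using `before'
```

Run it from a do-file so the tempfile macro stays defined. Note that the final keep discards the other variables, so reload the data before doing anything else.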
Michael Blasnik
----- Original Message -----
From: "Erik Ingelsson" <[email protected]>
To: <[email protected]>
Sent: Friday, December 01, 2006 5:17 PM
Subject: Re: st: Construct Null Datasets through Bootstrap Resampling
Thanks Michael,
After discussing this with our senior statistician today, I will probably
go with sampling with replacement, since this is how they have done it before in
SAS. However, could you explain to me how to
keep groups together in the scramble code as well, in case I need to do
it that way on a later occasion?
Best,
Erik Ingelsson
Quoting Michael Blasnik <[email protected]>:
One more difference between the approaches that you should recognize is
that bsample samples with replacement (a value from a given observation
can appear more than once) while the scramble program I wrote does not
-- it simply re-arranges the order. I'm not sure which is preferred
for your application.
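A small sketch of that difference, using the auto data: the variable names id and u are hypothetical helpers added only for this illustration, and the seed is arbitrary. Re-ordering leaves every observation present exactly once, while bsample will usually draw some observations more than once and omit others.

```stata
sysuse auto, clear
set seed 2006

* tag each original observation so repeats can be detected
gen long id = _n

* permutation: rows are re-ordered, every id still appears exactly once
gen double u = uniform()
sort u
isid id

* bootstrap: bsample replaces the data with _N rows drawn with replacement
bsample
duplicates report id
```

isid exits with an error if any id is repeated, so it passes after the sort; duplicates report then tabulates how many times each id was drawn in the bootstrap sample.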
If you end up wanting to sample without replacement (as scramble does)
but want to keep groups of variables together, a relatively modest
change to the scramble code would do the trick.
Michael Blasnik