Jeff, Thanks for the reply. Your suggestion worked and I added the
strata option to the bootstrap command. I am assuming that this was
your intention.
Thanks again,
Michael
>>> Jeff Pitblado 7/15/2004 1:24:58 PM >>>
Sorry for the repost; I forgot the subject line in the previous post.
Michael Malette <[email protected]> asks how -bootstrap- resamples
the data:
> I have a question about the bootstrapping command. I'm trying to
> compare 2 means from populations with different sample sizes (n=400,
> n=3000) using a t-test. The bootstrapping command should shuffle data
> and generate t-values after each shuffle. This leads me to my
> question, how does stata shuffle the data.
>
> Does it take an equal subsample from each population (n=200 of the
> 400 and n=200 of the 3000) and calculate t or does it take an overall
> subsample with an unequal number from each population (n=100 of 400
> and n=300 of 3000)?
>
> This is the syntax that we are using:
> program define TTestBoot
> version 8.2
> args AHI FWIN
> ttest AHI == FWIN, unpaired
> end
>
> use "H:\ahifwin.dta", clear
> bootstrap "TTestBoot AHI FWIN" T=r(t), reps(1000) saving("H:\work\test.dta")
To get -bootstrap- to sample independently between two (or more)
groups, use the -strata()- option. With -strata()-, each bootstrap
sample is drawn with replacement within each group, so the group sizes
stay the same as in the original data. It seems that Michael's dataset
is in wide format, so Michael will have to use -reshape- and a little
data management to get a dataset that will work with -bootstrap- and
the -strata()- option. Here is a short example:
***** BEGIN: example.do
clear
// some example data in wide format; notice that the groups are
// unbalanced
input x y
3 6
13 3
12 9
13 15
14 18
3 10
13 9
2 18
12 2
18 14
. 15
. 14
. 5
. 2
. 12
. 8
. 17
. 3
. 12
. 3
end
// call to -ttest- using originally shaped data
ttest x == y, unpaired
// rename the variables so they can be used with -reshape-
rename x x1
rename y x2
gen obsid = _n
// use reshape to stack the values in x1 and x2 into a new variable x
reshape long x, i(obsid) j(group)
// drop the missing values that made the original data unbalanced
drop if missing(x)
// two sample -ttest- using the stacked data, verify that the results
// are the same as the above -ttest- results
ttest x, by(group)
// bootstrap the t-statistic from the two sample -ttest-, resampling
// within each group via -strata()-
bootstrap "ttest x, by(group)" T=r(t), reps(100) strata(group) ///
	sav(test.dta) replace dots
exit
***** END: example.do
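A quick way to see what -strata()- changes is to draw a single
bootstrap sample by hand with -bsample- and compare the group counts
before and after. This is only a minimal sketch on made-up data: the
file name, the seed, and the group sizes (10 and 20) are hypothetical
and simply stand in for a stacked dataset like the one above.

***** BEGIN: strata_check.do
clear
set seed 12345
// hypothetical stacked data: 10 observations in group 1, 20 in group 2
set obs 30
gen group = cond(_n <= 10, 1, 2)
// uniform() is the older name for what newer Stata calls runiform()
gen x = uniform()
// group sizes before resampling
tabulate group
// keep a copy, then draw one resample that ignores the groups; the
// group counts are free to drift away from 10 and 20
preserve
bsample
tabulate group
restore
// draw one resample within each group; the counts stay at 10 and 20
bsample, strata(group)
tabulate group
exit
***** END: strata_check.do

With -strata(group)-, the resample has the same number of observations
per group as the original data, which is the behavior requested from
-bootstrap- through its -strata()- option.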
--Jeff
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/