Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: on bootstrap
From: "Rubil Ivica" <[email protected]>
To: <[email protected]>
Subject: st: RE: on bootstrap
Date: Thu, 27 Dec 2012 10:48:10 +0100
One more thing: should I use the option strata(dummy1) in the second option (see below)?
--
Ivica Rubil
Ekonomski institut || The Institute of Economics, Zagreb
Trg J. F. Kennedyja 7, 10 000 Zagreb, Croatia
tel. +385-1-2362-269 || fax. +385-1-2335-165
[email protected] || www.eizg.hr
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Rubil Ivica
Sent: 27 December 2012 9:59
To: [email protected]
Subject: st: on bootstrap
Dear Statalisters,
I have a question on bootstrap. I have cross-sectional data on incomes for two countries, y0 and y1. I would like to obtain bootstrap standard errors for the ratio of medians of these two income distributions. Two ways come to my mind but I am not sure which one would be more appropriate:
Option 1:
I create a dataset with variables y0 and y1. Since the two distributions come from different countries, the order of the values within each variable is arbitrary, and any pairing is possible. However, the sort order matters for bootstrapping, because the bootstrap resamples whole observations and each sorting implies different pairs (y0, y1) in each "observation". As a result, I get different bootstrap results for different sortings.
The code I use is the following:
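cap prog drop medratio
prog medratio, rclass
    * -detail- is needed: plain -summarize- does not store r(p50)
    qui sum y0, detail
    scalar med0 = r(p50)
    qui sum y1, detail
    scalar med1 = r(p50)
    * ratio of the country-1 median to the country-0 median
    return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio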
Option 2:
I append y1 to y0 and get one income variable, y. I also create a dummy variable, dummy1, equal to 1 for incomes from country 1 and 0 for incomes from country 0. Then I do the bootstrap using the following code:
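cap prog drop medratio1
prog medratio1, rclass
    * -detail- is needed: plain -summarize- does not store r(p50)
    qui sum y if dummy1 == 1, detail
    scalar med1 = r(p50)
    qui sum y if dummy1 == 0, detail
    scalar med0 = r(p50)
    * ratio of the country-1 median to the country-0 median
    return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio1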
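For comparison, here is a minimal sketch of the stratified variant of Option 2 raised in the follow-up question above. It assumes the appended layout (y, dummy1). The toy data and the label "ratio" are hypothetical and only there to make the example runnable. strata() is a standard -bootstrap- option that draws the resamples independently within each level of the named variable, so every replicate keeps the original number of observations per country. The -detail- option is included because -summarize- stores r(p50) only when it is specified.

```stata
* Toy data (hypothetical, for illustration only): 200 incomes per country
clear
set seed 1234
set obs 400
generate byte dummy1 = _n > 200
generate double y = exp(rnormal(10, 0.5)) * cond(dummy1, 1.2, 1)

capture program drop medratio1
program define medratio1, rclass
    * -detail- makes -summarize- store the median in r(p50)
    quietly summarize y if dummy1 == 1, detail
    scalar med1 = r(p50)
    quietly summarize y if dummy1 == 0, detail
    scalar med0 = r(p50)
    return scalar med_ratio = med1 / med0
end

* Resample within each country separately (200 + 200 in every replicate)
bootstrap ratio = r(med_ratio), seed(1234) reps(500) strata(dummy1): medratio1
```

Without strata(dummy1), the number of observations drawn from each country varies from replicate to replicate. With it, the two samples are resampled separately, which matches the description of the data as two independent cross-sections.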
So, which of these two options seems more appropriate?
Thanks.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/