Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: on bootstrap


From   "Rubil Ivica" <[email protected]>
To   <[email protected]>
Subject   st: RE: on bootstrap
Date   Thu, 27 Dec 2012 10:48:10 +0100

One more thing: in Option 2 (see below), should I add the option strata(dummy1) to the bootstrap command?

--
Ivica Rubil
Ekonomski institut || The Institute of Economics, Zagreb
Trg J. F. Kennedyja 7, 10 000 Zagreb, Croatia
tel. +385-1-2362-269 || fax. +385-1-2335-165
[email protected] || www.eizg.hr


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Rubil Ivica
Sent: 27 December 2012 9:59
To: [email protected]
Subject: st: on bootstrap

Dear Statalisters,

I have a question on bootstrap. I have cross-sectional data on incomes for two countries; the income variables are y0 and y1. I would like to obtain bootstrap standard errors for the ratio of the medians of these two income distributions. Two approaches come to mind, but I am not sure which one is more appropriate:

Option 1:
I create a dataset with variables y0 and y1. Since the two distributions are for different countries, the order in which each variable is sorted is arbitrary: any pairing of a y0 value with a y1 value is possible. However, the sorting matters for bootstrapping, because bootstrap resamples whole observations and different sortings imply different pairs (y0, y1) for each "observation". As a result, I get different bootstrap results for different sortings.
The code I use is the following:

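cap prog drop medratio
prog medratio, rclass
	* detail is required: without it, summarize does not return r(p50)
	qui sum y0, detail
	scalar med0 = r(p50)
	qui sum y1, detail
	scalar med1 = r(p50)
	* ratio of the country 1 median to the country 0 median
	return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio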



Option 2:
I append y1 to y0 and get one income variable, y. In addition, I create an indicator, dummy1, equal to 1 for incomes from country 1 and 0 for incomes from country 0. Then I do the bootstrap using the following code:

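cap prog drop medratio1
prog medratio1, rclass
	* detail is required: without it, summarize does not return r(p50)
	qui sum y if dummy1 == 1, detail
	scalar med1 = r(p50)
	qui sum y if dummy1 == 0, detail
	scalar med0 = r(p50)
	* ratio of the country 1 median to the country 0 median
	return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio1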


So, which of these two options seems more appropriate?
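
On the strata(dummy1) question for Option 2: without strata(), bootstrap resamples from the pooled observations, so the number of draws from each country varies across replications; with strata(dummy1), each country is resampled separately and its sample size stays fixed, which is closer to the idea of two independently drawn samples. A minimal sketch of that variant follows, assuming the stacked variable y and the indicator dummy1 described above are in memory. The program name medratio_s and the label ratio are hypothetical names used only for illustration.

```stata
* Sketch only: stratified bootstrap of the ratio of medians.
* Assumes y (stacked incomes) and dummy1 (1 = country 1, 0 = country 0) exist.
* medratio_s and ratio are illustrative names, not from the original post.
capture program drop medratio_s
program define medratio_s, rclass
	* temporary scalar names avoid clashes with variables or other scalars
	tempname m0 m1
	* detail is needed so that summarize returns r(p50)
	quietly summarize y if dummy1 == 1, detail
	scalar `m1' = r(p50)
	quietly summarize y if dummy1 == 0, detail
	scalar `m0' = r(p50)
	return scalar med_ratio = `m1' / `m0'
end

* strata(dummy1) resamples within each country separately
bootstrap ratio = r(med_ratio), strata(dummy1) seed(1234) reps(500): medratio_s
```

This is a sketch under the stated assumptions; whether the stratified standard error is the appropriate one still depends on how the two samples were actually drawn.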

Thanks.



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
