Subject: st: Speed of bsample and nested loops
From: "Poliquin, Christopher" <[email protected]>
To: "[email protected]" <[email protected]>
Date: Wed, 5 Oct 2011 15:51:17 -0400
I am trying to speed up my bootstrapping code. It is currently very slow, and I suspect there are significant gains to be made.
I am trying to draw samples of size 1-3 with replacement from a file with about 300,000 rows. It is a panel dataset of companies and their daily stock returns for two years.
I have written a little program to loop over groups of companies and draw samples of size 1-3 from 5 different variables with returns data. The mean of the sample is then written to a file.
Could someone please look at this code and suggest areas that could be modified to make this run at a reasonable speed? I have omitted the beginning because the real issue is probably the nested loops.
program bootstrapping
    // Bootstrapping mean abnormal returns
    // Pass sample name as first argument for saving output
    // Pass replication number as second argument
    egen boot_grp = group(id cl)
    *[Some omitted stuff that is fast already]
    // Open a file to hold the bootstrapped results.
    file open boot using `1'_boots.txt, write text replace
    file write boot "id" _tab "cl" _tab "sampsize" _tab "ar" _tab "mean" _n
    forvalues k=1/`2' {
        * This is the number of draws to make for each sample size
        set seed `k'
        forvalues j=1/3 {
            *Draws of size 1-3
            capture drop w
            quietly gen w = .
            // Sample with replacement, fweight in w
            bsample `j', strata(id cl permno) weight(w)
            foreach b of local boots {
                // Mean abnormal return for the sample
                // within id and cl grouping.
                forvalues x = 1/5 {
                    // Within each abnormal return measure...
                    quietly summarize ar`x' if boot_grp == `b' [fweight=w]
                    loc mu = r(mean) * 100
                    // Write bootstrapped means to the output file
                    file write boot "`idb`b''" _tab "`gb`b''" _tab
                    file write boot "`j'" _tab "`x'" _tab
                    file write boot "`mu'" _n
                }
            }
        }
    }
    file close boot
end
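
For comparison, here is an untested sketch of one possible restructuring. It is a guess at where the time goes, not a measured fix. The two inner loops call summarize once per group and per return variable, and each call scans all 300,000 rows for its if condition. A single collapse per draw would compute every group mean in one pass. The program name boot_collapse and the variables rep and sampsize are hypothetical; id, cl, permno, ar1-ar5 and the two arguments are as in the code above. The output here is wide (one column per ar variable) rather than one row per mean.

```stata
program define boot_collapse
    // Hypothetical sketch: `sample' = sample name, `reps' = number of replications
    args sample reps
    tempfile results
    local first = 1
    forvalues k = 1/`reps' {
        set seed `k'
        forvalues j = 1/3 {
            capture drop w
            quietly gen w = .
            // Sample with replacement, frequency stored in w
            bsample `j', strata(id cl permno) weight(w)
            preserve
            // Keep only the sampled rows, then take all weighted
            // group means in one pass instead of one summarize each
            quietly keep if w > 0
            collapse (mean) ar1 ar2 ar3 ar4 ar5 [fweight=w], by(id cl)
            gen rep = `k'
            gen sampsize = `j'
            // Accumulate the results across draws
            if !`first' {
                append using `results'
            }
            quietly save `results', replace
            local first = 0
            restore
        }
    }
    // Note: this replaces the data in memory with the results
    use `results', clear
    foreach v of varlist ar1 ar2 ar3 ar4 ar5 {
        quietly replace `v' = `v' * 100
    }
    // outsheet writes a tab-delimited text file by default
    outsheet using `sample'_boots.txt, replace
end
```

The preserve and restore pair has its own cost on a file this size, since the full dataset is written and re-read on every draw, so whether this is actually faster would need to be timed.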
Best wishes,
