Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: Speed of bsample and nested loops

From	philippe van kerm <[email protected]>
To	"[email protected]" <[email protected]>
Subject	RE: st: Speed of bsample and nested loops
Date	Fri, 7 Oct 2011 15:07:16 +0000

I would suggest -post- instead of -file- for that sort of work. I am not sure you would observe significant speed improvements, however.
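
For illustration, here is a minimal sketch of what the -post- approach might look like. It is not the original program: it uses the auto dataset shipped with Stata, and the handle name (results), the output file name (group_means.dta) and the posted variable names are all made up for the example. The idea is simply that each set of results is posted as one observation to a Stata dataset instead of being written as a line of text.

```stata
* Sketch only: dataset, handle name, file name and variable names are assumptions
sysuse auto, clear

* Declare the results dataset and the variables it will hold
tempname results
postfile `results' byte foreign str10 varname double mean ///
    using group_means.dta, replace

foreach v of varlist price mpg weight {
    forvalues f = 0/1 {
        quietly summarize `v' if foreign == `f'
        * One -post- per result row; each expression goes in parentheses
        post `results' (`f') ("`v'") (r(mean))
    }
}

* Close the handle so the dataset is written to disk
postclose `results'

use group_means, clear
list
```

In the bootstrapping program, the three -file write- calls in the innermost loop would presumably collapse into a single -post- call, with -postfile- replacing -file open- and -postclose- replacing -file close-.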

Philippe


> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Richard Herron
> Sent: Friday, October 07, 2011 4:26 PM
> To: [email protected]
> Subject: Re: st: Speed of bsample and nested loops
> 
> I don't know the inner workings of -file write-, but would you have
> any gain from replacing three calls with one?
> 
>     file write boot "`idb`b''" _tab "`gb`b''" _tab
>     file write boot "`j'" _tab "`x'" _tab
>     file write boot "`mu'" _n
> 
> becomes
> 
>     file write boot "`idb`b''" _tab "`gb`b''" _tab ///
>                     "`j'" _tab "`x'" _tab ///
>                     "`mu'" _n
> 
> It isn't clear to me from the help file whether -file open- leaves the
> text connection open, or whether it just performs some checks and assigns a
> handle.
> 
> On Wed, Oct 5, 2011 at 15:51, Poliquin, Christopher <[email protected]>
> wrote:
> > Hi,
> >
> > I am trying to speed up my code for bootstrapping and suspect there
> > are significant gains to be made because right now it is super slow.
> >
> > I am trying to draw samples of size 1-3 with replacement from a file
> > with about 300,000 rows.  It is a panel dataset of companies and their
> > daily stock returns for two years.
> >
> > I have written a little program to loop over groups of companies and
> > draw samples of size 1-3 from 5 different variables with returns data.
> > The mean of the sample is then written to a file.
> >
> > Could someone please look at this code and suggest areas that could
> > be modified to make this run at a reasonable speed?  I have omitted the
> > beginning because the real issue is probably the nested loops.
> >
> > program bootstrapping
> >        // Bootstrapping mean abnormal returns
> >        // Pass sample name as first argument for saving output
> >        // Pass replication number as second argument
> >
> >        egen boot_grp = group(id cl)
> >        *[Some omitted stuff that is fast already]
> >
> >        // Open a file to hold the bootstrapped results.
> >        file open boot using `1'_boots.txt, write text replace
> >        file write boot "id" _tab "cl" _tab "sampsize" _tab "ar" _tab "mean" _n
> >        forvalues k=1/`2' {
> >                * This is the number of draws to make for each sample size
> >                set seed `k'
> >                forvalues j=1/3 {
> >                        *Draws of size 1-3
> >                        capture drop w
> >                        quietly gen w = .
> >                        // Sample with replacement, fweight in w
> >                        bsample `j', strata(id cl permno) weight(w)
> >                        foreach b of local boots {
> >                                // Mean abnormal return for the sample
> >                                // within id and cl grouping.
> >                                forvalues x = 1/5 {
> >                                        // Within each abnormal return measure...
> >                                        quietly summarize ar`x' if boot_grp == `b' [fweight=w]
> >                                        loc mu = r(mean) * 100
> >                                        // Write bootstrapped means to the output file
> >                                        file write boot "`idb`b''" _tab "`gb`b''" _tab
> >                                        file write boot "`j'" _tab "`x'" _tab
> >                                        file write boot "`mu'" _n
> >                                }
> >                        }
> >                }
> >        }
> >        file close boot
> > end
> >
> >
> > Best wishes,
> > Chris
> >
> >
> >
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> >
> 


