Re: st: bs problems

From   [email protected] (Jeff Pitblado, Stata Corp.)
To   [email protected]
Subject   Re: st: bs problems
Date   Tue, 22 Jul 2003 19:29:38 -0500

Jun Xu <[email protected]> asks why the observed number of replications is
sometimes less than the number requested:

> Thanks for Stata Corp people (Jeff) responding to my problem. I would be 
> interested in knowing when the update will be realized.

The fix to -bootstrap- is in the wings for the next ado-update.

> One more problem, why the reps(30) I specified is not consistent with the
> number under the Reps column (22) (as well as the matrix e(reps)? Also,
> though the ereturn can be modified to have the e(size) something like that,
> is that possible you could add the size return matrix too for next update?
> Really appreciate.

I'll answer the second question first.  Jun Xu can save the estimation sample
size by including it on the list of expressions to bootstrap.  For example,

	. bootstrap "logit ..." _b size=e(N), ...

There are two reasons why the observed number of replications for a given
bootstrapped statistic may be less than the number of replications requested:
	1.  The expression cannot be calculated after the command is executed
	    using some of the bootstrap data sets.

	2.  The command failed for some of the bootstrap data sets.

I believe it is the second that Jun Xu is observing.  Some of the logistic
regressions are failing because -bootstrap- is supplying -logit- with a
dependent variable that is either all 0's or all 1's.
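
As a rough illustration of this kind of failure outside of -bootstrap-, the
sketch below hands -logit- an outcome that does not vary by keeping only the
Foreign cars in the auto data.  It is an assumed stand-in for what happens
inside a bad bootstrap sample, not the internal code of -bootstrap-.
-capture noisily- shows the error message without stopping, and _rc holds the
return code.

	. sysuse auto, clear
	. * keep only Foreign cars so the outcome is constant (all 1's)
	. keep if foreign == 1
	. * -capture noisily- displays the error but lets execution continue
	. capture noisily logit foreign mpg
	. * a nonzero return code means -logit- failed
	. display _rc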

The following code will reproduce this behavior:

	. sysuse auto, clear
	. keep in -25/l
	. tabulate for
	. set seed 1234
	. bootstrap "logit for mpg" _b size=e(N), reps(30) size(20)

Using the auto data, I keep only the last 25 observations.  The -tabulate-
command shows that only 3 of the 25 cars are Domestic, so a bootstrap sample
from this data set can easily consist of only Foreign cars, even more so when
we randomly sample just 20 of the 25 cars.  The -noisily- option will display
the output from -logit- for each bootstrap sample.  When this option is given,
-bootstrap- will also output a message indicating that it will be posting
missing values when either (1) or (2) above occurs.
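
To put a rough number on the chance of an all-Foreign bootstrap sample, assume
each of the 20 draws is made independently and with replacement, so each draw
is Foreign with probability 22/25.  The calculation below is only a sketch
under that assumption, and it covers just the all-Foreign case, not every way
-logit- could fail.

	. * chance that one bootstrap sample of size 20 has no Domestic cars
	. display (22/25)^20
	. * chance that at least one of 30 such samples has no Domestic cars
	. display 1 - (1 - (22/25)^20)^30

The first value is about 0.08 and the second about 0.91, so a few failed
replications out of 30 should not be surprising.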

A log from the above commands (with -quietly- added to -logit- and the
-noisily- option added to -bootstrap-) follows:

***** BEGIN: log
. sysuse auto, clear
(1978 Automobile Data)

. keep in -25/l
(49 observations deleted)

. tab for

   Car type |      Freq.     Percent        Cum.
   Domestic |          3       12.00       12.00
    Foreign |         22       88.00      100.00
      Total |         25      100.00

. set seed 1234

. bootstrap "qui logit for mpg" _b size=e(N), reps(30) size(20) noi

bootstrap: First call to (qui logit for mpg) with data as is:

. qui logit for mpg

bootstrap header:

command:      qui logit for mpg
statistics:   b_mpg      = _b[mpg]
              b_cons     = _b[_cons]
              size       = e(N)

30 calls to (qui logit for mpg) with bootstrap samples:

. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg

Bootstrap statistics                              Number of obs    =        25
                                                  Replications     =        30

Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
       b_mpg |    25  .1438087  .1078638   .198525  -.2659268   .5535442   (N)
             |                                       .0174609    .890745   (P)
             |                                       .0174609   .5159226  (BC)
      b_cons |    25 -1.241594  -2.21652  4.356838  -10.23366   7.750477   (N)
             |                                      -18.76268   1.817488   (P)
             |                                      -8.075786   1.817488  (BC)
        size |    25        25        -5         0         25         25   (N)
             |                                             20         20   (P)
             |                                              .          .  (BC)
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

***** END: log
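
One way to count the failed replications afterwards is to save the
replication results and count the missing values.  This is a sketch only: it
uses the same command syntax as above, bsout is a hypothetical file name, and
it assumes the saved variables carry the statistic names shown in the
bootstrap header (b_mpg, b_cons, size), which may differ in other versions of
-bootstrap-.

	. sysuse auto, clear
	. keep in -25/l
	. set seed 1234
	. * save the replication results in bsout.dta (hypothetical file name)
	. bootstrap "logit for mpg" _b size=e(N), reps(30) size(20) saving(bsout) replace
	. use bsout, clear
	. * replications for which missing values were posted
	. count if missing(b_mpg)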

[email protected]
