Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Bootstrapping question
From: Maarten Buis <[email protected]>
To: [email protected]
Date: Fri, 8 Feb 2013 09:48:48 +0100
On Thu, Feb 7, 2013 at 10:28 PM, Ilian, Henry (ACS) wrote:
> I looked at the table of contents. The book is clearly worth having, but it doesn't seem to cover the sample-size problem--which actually may not be a problem, since the sample size is what it is, and there isn't a way to make it any larger. By improved, I meant narrower, although that's such an obvious answer I don't think it was what you were asking me. If bootstrapping won't result in narrower confidence intervals, then I'll have to live with the confidence intervals as they are.
It is not obvious that narrower confidence intervals represent an
improvement. A confidence interval is based on a thought experiment:
what if I could draw many new samples of the same size from my
population and compute my statistic in each of these samples? Each of
these statistics would be slightly different, as they are based on a
different random sample from the population. The 95% confidence
interval is an estimate of the interval within which 95% of these
hypothetical statistics will be. This is an estimate of the
uncertainty you have about your estimate, and the source of that
uncertainty is the fact that you don't have the entire population but
only a sample from that population. If you are unhappy about the width
of that interval, then the obvious way to reduce it is to increase
the sample size. There are other cute ways of improving the precision
of your estimate, e.g. stratified sampling, but don't expect too much
from those: there is no way around the fact that a sample size of 27 is
small and any estimate based on that sample size will be uncertain.
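
The thought experiment above can be sketched as a small simulation.
This is only an illustration in Python, not something from the
original question: the standard normal population, the use of the
mean as the statistic, and the comparison sample size of 108 (four
times 27) are all assumptions made for the example.

```python
import random
import statistics

def sampling_interval(n, reps=5000, seed=1):
    """Draw `reps` samples of size n from an assumed N(0, 1) population
    and return the range holding the central 95% of the sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical population
        means.append(statistics.fmean(sample))
    means.sort()
    lower = means[round(0.025 * reps)]
    upper = means[round(0.975 * reps) - 1]
    return lower, upper

# Compare the observed sample size with one four times as large.
for n in (27, 108):
    lower, upper = sampling_interval(n)
    print(f"n={n}: [{lower:.2f}, {upper:.2f}], width {upper - lower:.2f}")
```

Under these assumptions the spread of the hypothetical means shrinks
roughly with the square root of the sample size, so quadrupling the
sample about halves the width.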
If you say "improving the confidence interval", then that would mean
to me making sure that the probability that the statistic computed on
a random draw from the population falls within the 95% confidence
interval is indeed 95%. This may seem trivial, but for many estimates
of the confidence intervals this is not strictly true. Some confidence
intervals are based on a computation that assumes an infinitely large
sample, and then the question becomes how large the sample has to
be before this approximation becomes reasonable. Improving the
confidence interval would in that case mean some sort of adjustment
that takes into account that you have a sample of finite size (which
would typically widen the confidence interval rather than narrow
it). For other problems, computing confidence intervals
is just very hard and all existing estimates are approximate. The
estimate of a proportion is a good example of that: if we have N
observations, then our estimate of the proportion can only take one of
N+1 possible values: 0/N, 1/N, 2/N, ..., or N/N. This discreteness
makes the computation of an interval with exactly 95% coverage very
hard. Paradoxically, the estimates of that interval that are called
"exact" often have coverage further from the nominal 95% than many
approximate methods.
Bootstrap confidence intervals can be said to be better at dealing
with small samples in the sense that they tend to make fewer
assumptions. They are not better in the sense that they will lead to
narrower confidence intervals: that might or might not be the case,
depending on which assumptions are violated in the method with
which you compare the bootstrap estimate.
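
A minimal sketch of a percentile bootstrap interval next to a
conventional t-based interval, again in Python and under stated
assumptions: the data are made up (27 draws from a skewed
distribution), the statistic is the mean, and the function name
bootstrap_percentile_ci is hypothetical. Which of the two intervals
comes out narrower depends on the data.

```python
import math
import random
import statistics

def bootstrap_percentile_ci(data, stat=statistics.fmean, reps=2000,
                            level=0.95, seed=1):
    """Resample the data with replacement `reps` times and return the
    central `level` range of the resampled statistics."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(reps))
    tail = (1 - level) / 2
    lower = estimates[round(tail * reps)]
    upper = estimates[round((1 - tail) * reps) - 1]
    return lower, upper

# Made-up skewed sample of size 27 (an assumption, not real data).
rng = random.Random(42)
sample = [rng.expovariate(1.0) for _ in range(27)]

lower, upper = bootstrap_percentile_ci(sample)

# Conventional interval for comparison; 2.056 is the 97.5% quantile of
# the t distribution with 26 degrees of freedom, so it only fits n = 27.
mean = statistics.fmean(sample)
half = 2.056 * statistics.stdev(sample) / math.sqrt(len(sample))
t_lower, t_upper = mean - half, mean + half

print(f"bootstrap: [{lower:.2f}, {upper:.2f}]")
print(f"t-based:   [{t_lower:.2f}, {t_upper:.2f}]")
```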
Hope this helps,
Maarten
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------