Christopher W. Ryan wrote:
>
> Error bars on bar charts I understand. And maybe "histogram" is being
> used loosely here. But why would you want an error bar on a histogram?
> Doesn't a histogram simply show how many observations, in the sample,
> are in each of several contiguous range categories of a continuous
> variable? There's no "error" to that, is there?
>
> Just trying to understand these concepts myself.
It's a perfectly sensible idea, although perhaps
there are better ones.
Imagine lots of samples divided into lots of bins:
the bin counts (or fractions, or densities) will typically
change from sample to sample. Each is a sample statistic
just like any other.
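
As a rough illustration of that point, here is a minimal Python sketch
(not Stata, and not part of the original argument). The standard normal
population, the bin edges, the number of samples and the sample size are
all assumptions chosen only for illustration.

```python
import random
import statistics

def bin_counts(sample, edges):
    """Count observations in each half-open bin [edges[i], edges[i+1]).

    Values falling outside every bin are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for x in sample:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Assumed setup: 200 samples of 100 draws each from a standard normal,
# binned on a fixed grid from -3 to 3 in steps of 0.5.
rng = random.Random(12345)
edges = [-3 + 0.5 * i for i in range(13)]
all_counts = [
    bin_counts([rng.gauss(0, 1) for _ in range(100)], edges)
    for _ in range(200)
]

# For each bin, summarize how its count varies from sample to sample.
bin_summary = []
for i in range(len(edges) - 1):
    column = [counts[i] for counts in all_counts]
    bin_summary.append(
        (edges[i], statistics.mean(column), statistics.stdev(column))
    )

for left, mean_count, sd_count in bin_summary:
    print(f"[{left:5.1f}, {left + 0.5:5.1f})  "
          f"mean {mean_count:6.2f}  sd {sd_count:5.2f}")
```

Each bin has its own mean count and its own standard deviation across
samples, which is the sense in which a bin count has an "error".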
A very crude first stab is to guess at a Poisson model
for each bin count. That leads to an idea that errors are
approximately constant on a square root scale, which is
why Tukey proposed his "rootogram" circa 1965. Despite his
typically energetic emphasis on similar ideas in his
"Exploratory data analysis" (1977) they never really caught
on. However, an implementation in Stata -- showing the root of the
count, not error bars -- is available through the root option of
-spikeplot-.
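
The square root argument can be checked numerically. The sketch below,
in Python rather than Stata, is only an illustration under the Poisson
assumption named above: the means (10, 40, 100), the number of draws and
the helper poisson_draw are all hypothetical choices. For a Poisson
count with a large mean, the delta method gives a variance of about 1/4
for the root of the count, so its standard deviation should sit near
0.5 whatever the mean.

```python
import math
import random
import statistics

def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method.

    Adequate for moderate means; not meant for very large ones.
    """
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

rng = random.Random(2024)
results = {}
for mean in (10, 40, 100):  # assumed bin means, for illustration only
    draws = [poisson_draw(rng, mean) for _ in range(4000)]
    sd_raw = statistics.stdev(draws)
    sd_root = statistics.stdev([math.sqrt(d) for d in draws])
    results[mean] = (sd_raw, sd_root)
    print(f"mean {mean:3d}: sd of count {sd_raw:5.2f}, "
          f"sd of root count {sd_root:4.2f}")
```

The standard deviation of the raw count grows like the root of the
mean, while that of the root count stays roughly constant. On that
basis a crude band for a bin would be root count plus or minus 1 (about
two standard deviations), although the approximation is poor for very
small counts.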
In practice, however,
1. Part of the art of histogram design is to choose
a bin width at which what may be sampling variation is suppressed,
so that this is not an issue. Usually this is done informally
rather than formally.
2. If variables are continuous, then histograms are at best
a means to an end and it is a better idea to get an estimate
of the density function.
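
On point 2, one common way to estimate a density is a kernel estimate.
The Python sketch below is a minimal illustration only: the Gaussian
kernel, the rule-of-thumb bandwidth, the simulated standard normal data
and the function names are all assumptions, not anything prescribed
above. In Stata itself, -kdensity- is the command to look at.

```python
import math
import random
import statistics

def rule_of_thumb_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)

def gaussian_kde(data, bandwidth):
    """Return a function giving the Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Assumed data: 200 draws from a standard normal.
rng = random.Random(7)
data = [rng.gauss(0, 1) for _ in range(200)]

h = rule_of_thumb_bandwidth(data)
density = gaussian_kde(data, h)

# Evaluate on a grid from -6 to 6 and check that the estimate
# integrates to roughly 1 (simple Riemann sum).
step = 0.05
grid = [-6 + step * i for i in range(241)]
total = sum(density(x) for x in grid) * step
print(f"bandwidth {h:.3f}, integral over grid {total:.3f}")
```

The bandwidth plays the role that bin width plays for a histogram: too
small and sampling variation shows through, too large and real
structure is smoothed away.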
Nick