
st: RE: Histograms (was: Multiple (overlaid) Histogram)


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Histograms (was: Multiple (overlaid) Histogram)
Date   Thu, 29 May 2003 12:29:51 +0100

Ulrich Kohler

> Nick Cox wrote:
> > -histogram- needs no improvement. It is perfect. (No, I didn't write
> > it.)
> >
> > More seriously, this touches upon some issues flagged on Statalist
> > earlier this year.
> >
> > Part of the issue may be terminological, as in a concurrent thread.
> >
> > 1. I take a histogram, strict sense, to refer to a display of
> > frequencies (fractions, densities) of a continuous variable divided
> > into classes (bins). The hallmark of a histogram as produced by
> > proper statistical software is that adjacent bars touch. (If this
> > isn't true, you haven't got a proper histogram, or you haven't got
> > proper statistical software.)
>
> I second Nick's strict definition of histograms. However, I think that
> Stata's implementation of histograms is slightly too "strict".
> Histograms refer to a display of densities of a continuous variable
> divided into classes. But there is no reason that all the classes
> should have equal width. One can draw histograms with bins of different
> widths and heights proportional to the densities. The areas of these
> bins are then proportional to the fractions.
>
> To my knowledge this is the original definition of histograms. Stata
> histograms are special cases. The more general case seems to be less
> vulnerable to the decisions about the number of bins and/or the origin
> than the Stata implementation. But this is just my very personal
> impression.
>
> Anyway, there is no reason not to allow bins with different widths.
> -histogram- needs improvement. It is not perfect.
>
> (Unfortunately, the implementation of histograms with different widths
> in hist3.ado is far from being perfect ...)

The "perfect" was tongue in cheek.

Histograms include those with unequal intervals, and Ulrich
is quite correct in pointing out that -histogram- offers
no direct support for those. For that you have to resort
to user-written programs, none of which, I believe, has been
translated to Stata 8.
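
To make the arithmetic behind unequal intervals concrete, here is a
minimal sketch in Python (not Stata, and not how -histogram- or any
user-written program does it). The function name, the data and the
bin edges are all made up for illustration. Each bar's height is the
fraction in the class divided by the class width, so that bar areas,
not heights, add up to 1.

```python
from bisect import bisect_right

def unequal_width_histogram(values, edges):
    """Return a list of (width, fraction, density) tuples, one per bin.

    Hypothetical helper for illustration only. Bins are
    [edges[i], edges[i+1]), except that the last bin also includes
    its upper edge. Values outside the edges are ignored.
    """
    if any(hi <= lo for lo, hi in zip(edges, edges[1:])):
        raise ValueError("edges must be strictly increasing")
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        if edges[0] <= v <= edges[-1]:
            # bisect_right locates the bin; min() keeps the top edge in the last bin
            i = min(bisect_right(edges, v) - 1, n_bins - 1)
            counts[i] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the edges")
    bins = []
    for i in range(n_bins):
        width = edges[i + 1] - edges[i]
        fraction = counts[i] / total
        bins.append((width, fraction, fraction / width))  # height = density
    return bins

# Made-up data and made-up class limits of unequal width
values = [1, 2, 3, 4, 6, 8, 12, 18, 25, 40]
edges = [0, 5, 10, 20, 50]
bins = unequal_width_histogram(values, edges)
for width, fraction, density in bins:
    print(f"width={width:>2}  fraction={fraction:.2f}  density={density:.4f}")
```

With equal widths the densities are just the fractions rescaled by a
constant, which is the special case that -histogram- draws.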

It seems to me that there are two supplementary questions:

1. Empirical. You will see histograms with unequal widths
particularly in older books and papers, and the reason was
that data for them came already grouped in such classes. There's
an example in Snedecor and Cochran's venerable text.
That seems far less common today when more and more data sets are
available in raw, ungrouped form, modulo confidentiality
constraints. I don't see people asking for this often on Statalist,
and one good reason for this being low down in priority is that
it is in practice rarely needed.

2. The "slippery slope question": if unequal widths
are supported, then next in line is the question of support
for a histogram with a class which extends from a large positive
number to infinity and/or a class which extends from a large
negative number to minus infinity. Even quite what
you _should_ draw then seems to me an open question
(pun intended).
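
A small sketch in Python of why the open class is awkward, assuming
bar height is taken as fraction divided by class width. The function
name and the numbers are made up for illustration. With an infinite
width the height comes out as zero whatever the fraction, so the
usual rule gives nothing to draw.

```python
import math

def bar_height(fraction, lower, upper):
    """Height of a histogram bar as density: fraction / class width.

    Hypothetical helper for illustration only.
    """
    return fraction / (upper - lower)

# A closed class from 20 to 50 holding 20% of the data
closed_height = bar_height(0.2, 20, 50)

# The same 20% in an open class from 20 to infinity
open_height = bar_height(0.2, 20, math.inf)

print(closed_height)
print(open_height)
```

Any finite upper limit substituted for infinity is an arbitrary
choice, and the bar's height changes with it.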

Nick
