David Airey replied to Jeff Pitblado
>
> > David makes a good point for removing the option for -histogram-, and
> > we will remove the checkboxes for log scales from the easy graph
> > dialog for -histogram-.
> >
> > However, the "full featured" dialog for -histogram- will remain the
> > same since both -xscale(log)- and -yscale(log)- are valid
> > -graph twoway- options.
> >
>
> This reply beats around the bush and doesn't explain to me why
> "histogram mpg, xscale(log)" would ever make sense. That is really my
> question; I'm ignorant of the answer.
>
> Jeff's answer that histogram's option xscale is valid because it is a
> twoway explains why histogram inherits these options. That doesn't mean
> the design choice is a good one. The area under the histogram should
> sum to 1 when the yaxis is density, as stated in the manual. I don't
> think when xscale(log) is used it will (though I have not measured). If
> it doesn't, then the option is pointless (the option is not pointless
> for other graph twoway commands like scatter).
>
> Perhaps not all daughters of the mother twoway should inherit certain
> twoway options?

This in turn touches on various tricky design issues, one
being how far statistical software designers (a) should
and (b) can decide ex cathedra which kinds of graph
are inadmissible or inappropriate, especially when what
may seem crazy in one field may turn out to have
a specific rationale in another. Excellence comes
easily to Stata Corp, but omniscience is an asymptotic
property.

I don't know a strong case for binning on the original scale,
yet showing the results with -xscale(log)-. However, blowing
up the left-hand part of the scale like this
might have some private use for examining fine structure.
For example, I have worked with glacier area data which
tend to be very heavily skewed and problematic at the lower end.
Among other issues, it can be difficult to distinguish, especially
without a field visit, between a true glacier and an inert body,
and different scientists compiling area data (usually in some
national agency office) tacitly show different degrees
of scepticism in distinguishing glaciers and non-glaciers.
For such a problem, graphs of the kind discussed might have some
private value, as there is often merit in a scale which uses the
units familiar to researchers. I wouldn't publish such
a histogram myself, but it might be of some use.

More generally, the principle that the area under the histogram
should integrate to 1 -- or to the number of values --
is clearly a good one. However, it is not the only criterion.
Plotting log frequency vs log magnitude is
common in sedimentology. R.A. Bagnold did this
in his classic book on wind-blown sand in 1941
and appropriate hyperbolic distributions have since been
investigated by O. Barndorff-Nielsen and others. Those
ideas appear to be drifting into other areas such as
financial modelling.
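
The area question raised in the quoted message can be checked numerically. What follows is only a rough sketch in Python, not Stata's own algorithm: the data are simulated lognormal values (hypothetical, not real glacier areas), and the helper density_histogram and the choice of 20 bins are assumptions made for illustration. It bins on the original scale, then compares the area of the bars on that scale with the area the same bars would cover if drawn against a logarithmic x axis.

```python
import math
import random


def density_histogram(values, n_bins):
    """Bin on the original scale; return (left, right, density) per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value falls in the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    n = len(values)
    return [(lo + i * width, lo + (i + 1) * width, c / (n * width))
            for i, c in enumerate(counts)]


random.seed(1)
# Hypothetical, strictly positive, heavily skewed data (simulated).
values = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]
bins = density_histogram(values, 20)

# Area on the original scale: bar height times bin width.
area_original = sum(d * (right - left) for left, right, d in bins)

# Area as drawn on a log x axis: the heights are unchanged, but each
# bar's drawn width becomes log(right) - log(left).
area_log_axis = sum(d * (math.log(right) - math.log(left))
                    for left, right, d in bins)

print("area on original scale:", area_original)
print("area as drawn on log axis:", area_log_axis)
```

With these simulated data the first quantity is 1 up to rounding error and the second is not, which is consistent with the suspicion in the quoted message. If bars with unit area on the log axis were wanted, one alternative would be to bin the logged values instead.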
Nick