The Stata listserver

Re: st: RE: Y axis values for hist ,density


From   Marcello Pagano <[email protected]>
To   [email protected]
Subject   Re: st: RE: Y axis values for hist ,density
Date   Thu, 27 Oct 2005 09:24:57 -0400

I would recommend reading the well-written page

http://www.stata.com/support/faqs/graphics/histvary.html

and paying special attention to the equal probability version
(eqprhistogram); it has a lot going for it, including its
dislike for zero-height columns.

m.p.

Jann Ben wrote:

Bang! I don't agree. The purpose of a histogram is to make visible the shape of a density. It is therefore natural to report the y-axis in terms of a density. ben



-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Allan Reese (Cefas)
Sent: Thursday, October 27, 2005 3:03 PM
To: [email protected]
Subject: st: Y axis values for hist ,density


The default "hist x" command in Stata gives a Y axis labelled as a density. I had never given it much attention until I saw the scale go up to 2 on a plot. Hold on, density functions sum to 1 over the variable.

Further investigation and discussion with Statacorp identified that the default tries to make the "area" of the bars add up to 1. If the number of bars changes, so does their width and so does the Y labelling. In my example, the data were discrete, so increasing the number of intervals did not change the plot except to add more zero-height columns and hence make each column narrower.

hist x, bin(n) therefore caused different Y labelling with varying n
hist x, xscale(range(0 n)) did not affect the labelling, though the bars got narrower with bigger n
hist x, frac and hist x, discrete both gave correct labelling, and the sum of column heights was 1.

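
To make the scaling concrete, here is a minimal sketch in Python (not Stata, and not StataCorp's implementation) of the rule described above: each bar's height is its fraction of the observations divided by the bar width, so the bar areas add up to 1. The data values and the function name bar_heights are invented for illustration; they are not the locust data from this post.

```python
def bar_heights(values, nbins, lo, hi):
    """Return (width, fractions, densities) for nbins equal-width bins on [lo, hi].

    Hypothetical helper for illustration only.
    """
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        # A value equal to hi goes in the last bin rather than overflowing.
        i = min(int((v - lo) / width), nbins - 1)
        counts[i] += 1
    n = len(values)
    fractions = [c / n for c in counts]          # heights under a "fraction" scale
    densities = [f / width for f in fractions]   # heights under a "density" scale
    return width, fractions, densities


# Made-up discrete data: ten observations taking the values 0, 1, 2, 3.
data = [0, 0, 0, 1, 1, 1, 1, 2, 2, 3]

results = {}
for nbins in (4, 8, 16):
    width, fractions, densities = bar_heights(data, nbins, lo=0, hi=4)
    area = sum(d * width for d in densities)
    results[nbins] = (width, fractions, densities, area)
    print(f"bins={nbins:2d} width={width:.2f} "
          f"tallest fraction={max(fractions):.2f} "
          f"tallest density={max(densities):.2f} area={area:.2f}")
```

With these made-up data the tallest fraction stays the same whatever the number of bins, because the extra bins are all empty. The tallest density doubles each time the bin width halves and passes 1 at 16 bins, while the total bar area stays at 1.
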
Do other users think this is perverse behaviour, especially as the default? My take is that, when drawing a histogram, the column width is taken as an arbitrary unit, not directly related to the x-scale. The implication is that you need to scale the height only when there are mixed-width columns, but would not label the Y axis in "freq/absolute-width" units. Having "densities" that vary and are in such peculiar units (1/locust in my example!) does not seem helpful.

Shoot me down
Allan




*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/




