st: RE: Adding normal density to overlayed histograms
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Date: Thu, 21 Oct 2010 13:09:22 +0100
Michael Mitchell and Ulrich Kohler explained what is going on in Stata terms and gave excellent and essentially identical solutions to the problem posed. Here I broaden the discussion.
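For readers who have not seen those replies, here is a minimal sketch of one way to overlay two histograms with a normal curve for each. It is not necessarily the code they posted. It relies on the fact that -hist- inside addplot() is read as -twoway histogram-, which has no normal option, so the curves are drawn with -twoway function- instead. The auto data, the split of mpg by foreign, and the bin and colour settings are assumptions standing in for the poster's x and x2. Run it from a do-file, because of the /// line continuations.

```stata
sysuse auto, clear

* Means and SDs for each group (stand-ins for the poster's x and x2)
summarize mpg if foreign == 0
local m0 = r(mean)
local s0 = r(sd)
summarize mpg if foreign == 1
local m1 = r(mean)
local s1 = r(sd)

* twoway histogram shows densities by default, so normalden() is on
* the same scale; common start() and width() keep the bins aligned
twoway (histogram mpg if foreign == 0, start(10) width(2.5)        ///
            fcolor(none) lcolor(navy))                              ///
       (histogram mpg if foreign == 1, start(10) width(2.5)        ///
            fcolor(none) lcolor(maroon))                            ///
       (function normalden(x, `m0', `s0'), range(mpg) lcolor(navy)) ///
       (function normalden(x, `m1', `s1'), range(mpg) lcolor(maroon)), ///
       legend(order(1 "Domestic" 2 "Foreign")) xtitle("Mileage (mpg)")
```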
A histogram has some advantages and some disadvantages. This list is a personal take and naturally not intended to be definitive or complete:
+1. It is likely to seem familiar to analyst and audience.
+2. People can focus on modes, left and right tails.
-1. One histogram can easily occlude part of the other, unless you do a lot of work.
-2. More generally, the result can easily look a bit of a mess.
-3. Histograms depend on choices about bin width and bin starts, even if those choices are automated; such choices can be hard to optimise.
-4. Linked to that, you can lose detail that might be important.
-5. If the normal is a reference, the comparison is of a curve with a set of bars, which is not the easiest comparison to get right. (Sometimes, the graph is a propaganda graph presented in the spirit "Look, it's roughly normal", when a more critical look would show important features, such as heavier tails or a mild outlier.)
Now, in terms of alternatives:
I mention first -histogram, by() normal-, which puts each group in its own panel with its own normal curve and so eases some of the problems, occlusion in particular.
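As a small illustration, assuming the auto data shipped with Stata and taking foreign as the grouping variable (both are stand-ins, not the poster's data):

```stata
sysuse auto, clear

* One panel per group, each with a normal density fitted to that group
histogram mpg, by(foreign) normal
```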
A very different approach is to use quantile-quantile plots. Stata's own -qnorm- is very limited (one variable, one group), but it is easy enough
(a) to do it yourself or
(b) to exploit user-written programs.
On (a), see
SJ-7-2 gr0027 . . Stata tip 47: Quantile-quantile plots without programming
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q2/07 SJ 7(2):275--279 (no commands)
tip on producing various quantile-quantile (Q-Q) plots
The .pdf of that short paper is accessible to all via
http://www.stata-journal.com/sjpdf.html?articlenum=gr0027
so I'll not repeat the exposition, other than to underline that the first worked example is precisely that raised in this posting, two groups and whether they are normally distributed.
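For orientation only, a do-it-yourself normal quantile plot for two groups might look like the sketch below. It is written in the spirit of that tip rather than copied from it. The auto data, the variable names rnk, ngrp, pp and z, and the plotting position (i - 0.5)/n are assumptions; other plotting positions are equally defensible.

```stata
sysuse auto, clear

* Rank mpg within each group, breaking ties arbitrarily
egen rnk = rank(mpg), by(foreign) unique
egen ngrp = count(mpg), by(foreign)

* Plotting positions and the corresponding standard normal quantiles
generate pp = (rnk - 0.5) / ngrp
generate z = invnormal(pp)

* Roughly straight point patterns indicate roughly normal groups
twoway (scatter mpg z if foreign == 0)                  ///
       (scatter mpg z if foreign == 1),                 ///
       legend(order(1 "Domestic" 2 "Foreign"))          ///
       xtitle("Standard normal quantile") ytitle("Mileage (mpg)")
```

Both groups share one set of axes, so differences in level, spread and tail behaviour can be read off directly.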
On (b), -qplot- offers one-liners such as
. qplot mpg, over(foreign) trscale(invnormal(@))
-search qplot, sj- for publications and download sources.
Nick
[email protected]
Dorothy Bridges wrote:
I am overlaying two histograms and would like Stata to add a normal
density curve for each.
hist x, normal addplot(hist x2)
works fine, but
hist x, normal addplot(hist x2, normal)
tells me that normal is not an option. Any ideas as to why this is happening?