Statalist The Stata Listserver



st: RE: RE: multiple histograms combined into one?


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: multiple histograms combined into one?
Date   Fri, 12 Jan 2007 16:35:38 -0000

That said, this kind of histogram is in my opinion statistically
questionable. I'm pretty clear we wouldn't allow it in the 
Stata Journal! 

It compounds the arbitrary choices of bin width and origin
that are the worst features of histograms by splitting 
each distribution into _separate_ bins. The idea that 
the reader can perceive both component distributions 
as Gestalts and compare fine structure seems far-fetched to me. 

To compare two distributions, consider quantile-quantile plots
(-qqplot-), superimposed density estimates (-kdensity-, -twoway
kdensity-), superimposed quantile functions (-qplot- from SJ), 
superimposed distribution functions (-distplot- from SJ), 
etc., etc. 
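
As a minimal sketch of the first two suggestions, assuming two
simulated variables x1 and x2 (illustrative data only, generated in
the same way as in Scott's example quoted below):

```stata
* Illustrative data only: two simulated samples (an assumption, not real data)
clear
set obs 50
set seed 12345
gen x1 = invnorm(uniform())
gen x2 = invnorm(uniform())*2

* Superimposed kernel density estimates of the two variables
twoway (kdensity x1) (kdensity x2), ///
    legend(order(1 "x1" 2 "x2")) ytitle("Density")

* Quantile-quantile plot of x1 against x2
qqplot x1 x2
```

If the two distributions are instead one variable split by a group
indicator, an -if- qualifier on each -kdensity- call would be one way
to do the same thing.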

Nick 
[email protected] 

Scott Merryman
 
> Try this:
> 
> Use -twoway__histogram_gen- to generate the height and bin locations for
> each variable, -merge- them together so the bin locations line up, and call
> -graph bar- to produce the graph.
> 
> clear
> set obs 50
> set seed 12345
> gen x1 = invnorm(uniform())
> gen x2 = invnorm(uniform())*2
> gen x3 = invnorm(uniform()) + 3
> 
> preserve
> local min = .
> foreach var of varlist x* {
> 	sum `var'
> 	if floor(`=r(min)') < `min' {
> 	local min = floor(r(min))
> 	}
> }
> tempfile foo
> save `foo'.dta
> forv i = 1/3 {
> 	use `foo'.dta
> 	twoway__histogram_gen x`i', display gen(h`i' p`i') start(`min') width(1)
> 	tempfile foo`i'
> 	sort p`i'
> 	drop if p`i' ==.
> 	rename p`i' p
> 	keep p h`i'
> 	save `foo`i''.dta
> }
> 
> use `foo1'.dta
> forv i = 2/3 {
> 	merge p using `foo`i''.dta
> 	drop _m
> 	sort p
> }
> l 
> graph bar h*, over(p) bar(1, lcolor(black) fcolor(gs15)) ///
>   bar(2, lcolor(black) fcolor(gs11)) bar(3, lcolor(black) fcolor(gs6)) ///
>   legend(off)
> restore

Justin Gengler
 
> > I would like to produce a SINGLE histogram (unlike the two
> > split-screen histograms produced using the 'by(...)' argument) that
> > combines two histograms of some variable 'x' (i.e., x given some
> > value of some other variable 'z') -- for example, a single histogram
> > combining two individual histograms for some variable when sex == 0
> > and when sex == 1.  Thus what I am looking for is akin to a bar plot
> > with the over(...) argument specified, but of course with the output
> > being a histogram rather than a bar plot.
> > 
> > For an illustration of what I am looking for, see this graph:
> > http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=82.



