Re: st: CDF plot with normal probability axis
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: CDF plot with normal probability axis
Date: Thu, 14 Nov 2013 09:21:20 +0000
-distplot- (SJ), -cdfplot- (STB originally, SSC now): as always,
please explain the origin of the user-written commands you refer to.
-qplot- (SJ) can do this, roughly.
. sysuse auto
(1978 Automobile Data)
. qplot turn trunk, trscale(invnormal(@))
. qplot turn trunk, trscale(invnormal(@)) xtitle(standard normal deviate) xla(-2/2)
The axes are the other way round from what you ask; I'd argue that is
better practice, or at least consistent with -qnorm-. (-ysc(log)- is
also possible.)
Note that you should not expect cumulative distribution plots to do
this by default, as they usually plot cumulative probabilities as 1/n,
..., n/n, and -invnormal(n/n)- is -invnormal(1)-, which is
indeterminate (Stata returns missing).
But it is just as easy to do this pretty much from first principles. See e.g.
http://www.stata.com/support/faqs/statistics/percentile-ranks-and-plotting-positions/index.html
http://www.stata-journal.com/sjpdf.html?articlenum=gr0027
http://www.stata-journal.com/sjpdf.html?articlenum=gr0032
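To see the problem with n/n concretely, here is a minimal sketch on a
made-up dataset of 5 observations (the variable names are mine, purely
for illustration). It compares the naive cumulative probability i/n
with the plotting position (i - 0.5)/n, which is one common choice
among several; the references above discuss others.

clear
set obs 5
gen i = _n
* naive cumulative probability: reaches exactly 1 at the largest value
gen p_naive = i / _N
* plotting position: stays strictly inside (0, 1)
gen p_plot = (i - 0.5) / _N
* normal scores; z_naive should be missing in the last observation
gen z_naive = invnormal(p_naive)
gen z_plot = invnormal(p_plot)
list, noobs

The last value of z_naive is missing, so the largest observation would
drop off a normal probability scale, whereas z_plot is defined for
every observation and symmetric about 0.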
I will cheat slightly and use -mylabels- (SSC).
Here is some code. Any number of possible small variations should be evident.
* mylabels must be installed first: ssc install mylabels
sysuse auto, clear
replace price = price/1000

* normal scores from plotting positions (rank - 0.5)/n
foreach v in price mpg {
    egen y`v' = rank(`v')
    su `v', meanonly
    replace y`v' = invnormal((y`v' - 0.5) / r(N))
    label var y`v' "`: var label `v''"
}

* axis labels shown as percents, placed at invnormal(percent/100)
mylabels 1 5 10(10)90 95 99, myscale(invnormal(@/100)) local(labels)

twoway connect yprice price, ms(Dh) sort || ///
    connect ympg mpg, sort ms(Th) xsc(log) yla(`labels', ang(h)) xla(5 10 20 40) ///
    ytitle(Cumulative percent)
Nick
[email protected]
On 14 November 2013 02:09, Livingston, Michael (TP)
<[email protected]> wrote:
> I'm trying to create a simple plot to compare two distributions - I want the logged values across the x-axis and then the cumulative probability on the y-axis, but with a normal probability y-axis (like the one here: http://lowrank.net/gnuplot/plot7-e.html).
>
> It seems like it should be really simple, but I haven't come up with any solutions using distplot or cdfplot. Is there something obvious I'm missing?