Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: CDF and distplot
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: CDF and distplot
Date: Wed, 17 Jul 2013 23:20:51 +0100

Here's a technique that is independent of -distplot-.
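* Load the data and bin price into thousands of dollars
sysuse auto.dta, clear
gen thousand = floor(price/1000)
* Mimic the problem: no foreign cars above 9 thousand
drop if foreign & thousand > 9
* Cumulative probabilities within each group; -equal- gives ties the same value
bysort foreign : cumul thousand, gen(cumulative) equal
* Append one observation so the foreign curve continues at 1 up to 15
set obs `=_N + 1'
replace foreign = 1 in L
replace thousand = 15 in L
replace cumulative = 1 in L
twoway connected cumulative thousand if !foreign, sort || ///
       connected cumulative thousand if foreign, sort ///
       legend(order(1 "Domestic" 2 "Foreign"))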
Nick
[email protected]
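
The code above hard-codes the group to extend (foreign) and the value to extend it to (15). A sketch of the same idea without those constants might look like the following. It is only an illustration, not part of the original reply: the local macro names groups and xmax are made up here, and it assumes the grouping variable is the 0/1 variable foreign from auto.dta.

```stata
* Sketch only: same approach, but the maximum and the short groups are found from the data
sysuse auto.dta, clear
gen thousand = floor(price/1000)
drop if foreign & thousand > 9
bysort foreign : cumul thousand, gen(cumulative) equal

* Overall maximum of the plotted variable
summarize thousand, meanonly
local xmax = r(max)

* For each group that stops short of xmax, append the point (xmax, 1)
levelsof foreign, local(groups)
foreach g of local groups {
    summarize thousand if foreign == `g', meanonly
    if r(max) < `xmax' {
        set obs `=_N + 1'
        replace foreign = `g' in L
        replace thousand = `xmax' in L
        replace cumulative = 1 in L
    }
}

twoway connected cumulative thousand if foreign == 0, sort || ///
       connected cumulative thousand if foreign == 1, sort ///
       legend(order(1 "Domestic" 2 "Foreign"))
```

The appended observations exist only to carry the curve along at 1, so they should be dropped again before any further analysis of the data.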
On 17 July 2013 21:47, Nick Cox <[email protected]> wrote:
> -distplot- (SJ) is mine.
>
> You want it to show observations that don't exist, but with zero frequency.
>
> I have to say that I don't think it is a problem that -distplot- does
> not do this, so I have no "solution" to offer.
>
>
> Nick
> [email protected]
>
>
> On 17 July 2013 21:30, Eric B. <[email protected]> wrote:
>> Dear Statalist users,
>>
>> I need help plotting a cumulative distribution function. After many attempts
>> with other commands, I am using the user-written -distplot- as it allows
>> flexibility in terms of configuring the look of the lines and graph.
>>
>> The variable I want to plot takes integer values between 0 and 20. I need
>> a cdf that plots two lines in the same graph, by gender. While there are
>> observations from 0 to 20 in the female subgroup, there are no
>> observations higher than 10 in the male subgroup. The cdf line for the male
>> subgroup reaches 100% at the value 10 and stops there. The cdf
>> for the female subgroup continues up to 20. I would like both lines to
>> continue through to 20, since that is the range of the original data.
>>
>> In case that was not clear, I have adapted the auto.dta data to show the
>> issue, recoding price as an integer variable:
>>
>> sysuse auto.dta, clear
>> gen thousand=3 if price < 4000
>> replace thousand=4 if price < 5000 & price >=4000
>> replace thousand=5 if price < 6000 & price >=5000
>> replace thousand=6 if price < 7000 & price >=6000
>> replace thousand=7 if price < 8000 & price >=7000
>> replace thousand=8 if price < 9000 & price >=8000
>> replace thousand=9 if price < 10000 & price >=9000
>> replace thousand=10 if price < 11000 & price >=10000
>> replace thousand=11 if price < 12000 & price >=11000
>> replace thousand=12 if price < 13000 & price >=12000
>> replace thousand=13 if price < 14000 & price >=13000
>> drop if foreign==1 & thousand>9
>> distplot thousand, over(foreign)
>>
>> So I would like the cdf line for foreign to continue at 100% up to 13 (which
>> is the maximum value in this dataset).
>>
>> All help on how to solve this is appreciated. Thanks.
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/