Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: CDF and distplot
From: "Eric B." <[email protected]>
To: [email protected]
Subject: Re: st: CDF and distplot
Date: Thu, 18 Jul 2013 13:12:58 +0100
Dear Nick,
Thank you for your comments!
Best wishes,
Eric
On Wed, Jul 17, 2013 at 11:20 PM, Nick Cox <[email protected]> wrote:
> Here's a technique that works independently of -distplot-.
>
> sysuse auto.dta, clear
> gen thousand = floor(price/1000)
> drop if foreign & thousand>9
> bysort foreign : cumul thousand, gen(cumulative) equal
> set obs 75
> replace foreign = 1 in L
> replace thousand = 15 in L
> replace cumul = 1 in L
> twoway connected cumul thousand if !foreign, sort || ///
> connected cumul thousand if foreign, sort legend(order(1 "Domestic" 2 "Foreign"))
> Nick
> [email protected]
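The example above hard-codes the observation count (75) and the top value (15). A minimal sketch of the same technique, assuming you would rather read both from the data, is given here. If fewer than 74 observations remain after the drop, -set obs 75- adds more than one row; the extras are all missing and are not plotted. The local macro names `top' and `newN' are illustrative only and do not come from the thread.

```stata
sysuse auto.dta, clear
gen thousand = floor(price/1000)
drop if foreign & thousand > 9

* Cumulative proportions within each group; -equal- gives tied values
* the same cumulative proportion
bysort foreign : cumul thousand, gen(cumulative) equal

* Largest value observed in either group (assumption: this is where
* the shorter curve should end)
summarize thousand, meanonly
local top = r(max)

* Append exactly one observation and use it to carry the foreign curve
* out to `top' at a cumulative proportion of 1
local newN = _N + 1
set obs `newN'
replace foreign = 1 in L
replace thousand = `top' in L
replace cumulative = 1 in L

twoway connected cumulative thousand if !foreign, sort || ///
       connected cumulative thousand if foreign, sort     ///
       legend(order(1 "Domestic" 2 "Foreign"))
```

Because the appended observation is artificial, it may be safest to wrap this in -preserve- and -restore- so that it does not leak into later analyses.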
>
>
> On 17 July 2013 21:47, Nick Cox <[email protected]> wrote:
>> -distplot- (SJ) is mine.
>>
>> You want it to show observations that don't exist, but with zero frequency.
>>
>> I have to say that I don't think it is a problem that -distplot- does
>> not do this, so I have no "solution" to offer.
>>
>>
>> Nick
>> [email protected]
>>
>>
>> On 17 July 2013 21:30, Eric B. <[email protected]> wrote:
>>> Dear Statalist users,
>>>
>>> I need help plotting a cumulative distribution function. After many attempts
>>> with other commands, I am using the user-written -distplot- as it allows
>>> flexibility in configuring the look of the lines and graph.
>>>
>>> The variable I want to plot takes integer values between 0 and 20. I need
>>> a cdf that plots two lines in the same graph, one for each gender. While there
>>> are observations from 0 to 20 in the female subgroup, there are no
>>> observations above 10 in the male subgroup. The cdf line for the male
>>> subgroup reaches 100% at value 10 and stops there. The cdf line
>>> for the female subgroup continues up to 20. I would like both lines to
>>> continue through to 20, since that is the range of the original data.
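For the situation described above, one possible sketch is to compute the group-wise cdf with the official -cumul- command and append a single artificial point that carries the shorter line out to 20. The variable names score (integer, 0 to 20), female (0/1) and cdf are hypothetical, since the post does not name the actual variables. Drawing the lines as steps with connect(stairstep) is a choice made here for integer-valued data, not something taken from the thread.

```stata
* Assumes a dataset in memory with hypothetical variables
* score (integer, 0 to 20) and female (0 = male, 1 = female)
preserve

* Cumulative proportions within each gender
bysort female : cumul score, gen(cdf) equal

* One artificial male observation at the top of the range, cdf = 1
local newN = _N + 1
set obs `newN'
replace female = 0 in L
replace score = 20 in L
replace cdf = 1 in L

* Step-function cdfs, one line per gender
twoway line cdf score if female == 0, sort connect(stairstep) || ///
       line cdf score if female == 1, sort connect(stairstep)    ///
       legend(order(1 "Male" 2 "Female"))

restore
```

The -preserve- and -restore- pair keeps the artificial observation out of the working dataset once the graph has been drawn.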
>>>
>>> In case that was not clear, I have adapted the auto.dta data to show this
>>> issue, converting the prices to integers (in thousands):
>>>
>>> sysuse auto.dta, clear
>>> gen thousand=3 if price < 4000
>>> replace thousand=4 if price < 5000 & price >=4000
>>> replace thousand=5 if price < 6000 & price >=5000
>>> replace thousand=6 if price < 7000 & price >=6000
>>> replace thousand=7 if price < 8000 & price >=7000
>>> replace thousand=8 if price < 9000 & price >=8000
>>> replace thousand=9 if price < 10000 & price >=9000
>>> replace thousand=10 if price < 11000 & price >=10000
>>> replace thousand=11 if price < 12000 & price >=11000
>>> replace thousand=12 if price < 13000 & price >=12000
>>> replace thousand=13 if price < 14000 & price >=13000
>>> drop if foreign==1 & thousand>9
>>> distplot thousand, over(foreign)
>>>
>>> So I would like the cdf line for foreign to continue at 100% up to 13 (which
>>> is the maximum value in this dataset).
>>>
>>> All help on how to solve this is appreciated. Thanks.
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/