Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: cut into groups
From: Zhang Fanfan <[email protected]>
To: [email protected]
Subject: Re: st: RE: cut into groups
Date: Mon, 26 Aug 2013 17:07:38 +1000

Thank you, Joe.
Yes, as you said, I have made some improvements to my data. The previous version didn't look so good.
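
For reference, the suggestion quoted below (take the cutpoints from the percentiles rather than typing them in) could be sketched roughly as follows. This is only a sketch under stated assumptions: the toy data are invented for illustration, the scalar names cut30 and cut70 and the three-way coding of gp are arbitrary choices, and -_pctile- is used here in place of -egen, cut()- so that tied cutpoints do not break the command.

```stata
* Toy data, invented for illustration only: 65% zeros plus a long right tail.
clear
set obs 100
generate double ttpr = cond(_n <= 65, 0, (_n - 65)^2 / 10)

* Store the 30th and 70th percentiles in scalars (names are arbitrary).
_pctile ttpr, percentiles(30 70)
scalar cut30 = r(r1)
scalar cut70 = r(r2)

* 1 = at or below the 30th percentile, 3 = above the 70th, 2 = in between.
generate byte gp = 2 if !missing(ttpr)
replace gp = 1 if ttpr <= cut30
replace gp = 3 if ttpr > cut70 & !missing(ttpr)

tabulate gp
```

With data this skewed, every zero falls into group 1, so that group holds more than 30% of the observations; this is the duplicate-cutpoint issue mentioned below. If -egen, cut()- is preferred, the first point below implies writing the breaks in ascending order with commas and without the repeated zero, e.g. at(0, 0.01, 43039).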
On 26/08/2013, at 4:36 AM, Joe Canner <[email protected]> wrote:
> Fanfan,
> 
> I see a couple of potential problems:
> 
> 1. The syntax for the -cut()- function requires that the values in the -at()- option be separated by commas.
> 2. Even though the 30th percentile is zero, I don't think it makes sense to have two zeroes in your -at()- list and it may be causing problems.  If you generally have distributions that are this skewed you may want to rethink your plan. 
> 
> If you continue to have problems with -cut- you could also try the -recode()- function which does more or less the same thing.
> 
> All that said, it is probably not good practice to choose your cutpoints based on the current values of the quantiles, in case your data set changes or you want to use this for something else.  Someone else might be able to recommend a way to feed the data from -pctile- into -cut- or -recode-.  Another possibility is to sort your data and use _n==_N*0.3 and _n==_N*0.7 (or ttpr[trunc(_N*0.3)] and ttpr[trunc(_N*0.7)]) to get your cutpoints.  (Sorry, I don't have access to Stata at the moment, otherwise I could give you more useful ideas.)  But, if you have skewed distributions, you will probably need extra code to deal with cases where you have duplicate cutpoints.
> 
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Zhang Fanfan [[email protected]]
> Sent: Sunday, August 25, 2013 7:37 AM
> To: [email protected]
> Subject: st: cut into groups
> 
> Hi all,
> 
> I want to cut my data into groups based on the variable -ttpr-. Using -pctile- and listing the result, I got the following percentiles:
>     +--------+
>     | p_ttpr |
>     |--------|
>  1. |      0 |
>  2. |      0 |
>  3. |      0 |
>  4. |      0 |
>  5. |      0 |
>     |--------|
>  6. |      0 |
>  7. |    .01 |
>  8. |  2.481 |
>  9. | 27.044 |
> 10. |      . |
>     +--------+
> 
> How can I then use the -cut()- function to get two groups, 0-30% and 70%-100%?
> 
> egen gp=cut(ttpr),at(0 0 0.01 43039) does not work.
> 
> 
> Thanks
> Fanfan.
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/