st: RE: cut into groups
From: Joe Canner <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: RE: cut into groups
Date: Sun, 25 Aug 2013 18:36:23 +0000
Fanfan,
I see a couple of potential problems:
1. The syntax for the -cut()- function requires that the values in the -at()- option be separated by commas.
2. Even though the 30th percentile is zero, I don't think it makes sense to have two zeroes in your -at()- list and it may be causing problems. If you generally have distributions that are this skewed you may want to rethink your plan.
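As a minimal sketch of the two points above, assuming toy values that mimic the listed percentiles (the data, the variable name gp, and the upper break 43040 are made up for illustration), the breaks are separated by commas and each break appears only once:

```stata
* Toy data standing in for the real -ttpr- (hypothetical values).
* Stored as double so that 0.01 compares exactly with the break 0.01.
clear
input double ttpr
0
0
0
0
0
0
0.01
2.481
27.044
43039
end

* Breaks are comma-separated and strictly increasing. -cut()- sets the
* result to missing for values at or above the last break, so the last
* break is chosen above the largest value in the toy data.
egen gp = cut(ttpr), at(0, 0.01, 43040) icodes

* With -icodes-, groups are numbered 0, 1, ... instead of being labelled
* by the left-hand end of each interval.
tabulate gp, missing
```

Each interval is closed on the left and open on the right, so an observation equal to a break falls into the group that starts at that break.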
If you continue to have problems with -cut- you could also try the -recode()- function which does more or less the same thing.
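A rough sketch of the -recode()- alternative, again on made-up toy values (gp_r is a hypothetical variable name). One difference from -cut()- worth knowing: -recode()- labels each observation with the upper end of its interval, and the intervals are closed on the right.

```stata
* Toy data (hypothetical values), stored as double to avoid float rounding
clear
input double ttpr
0
0.01
2.481
27.044
end

* recode(x, x1, ..., xn) returns the first listed value that is >= x,
* or the last listed value if x is larger than all the earlier ones
generate double gp_r = recode(ttpr, 0, 0.01, 43039)
list ttpr gp_r
```

The function -irecode()- takes the same kind of argument list but returns group numbers instead of the cutpoint values.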
All that said, it is probably not good practice to hard-code cutpoints taken from the current values of the quantiles, in case your data set changes or you want to reuse the code for something else. Someone else might be able to recommend a way to feed the results from -pctile- into -cut- or -recode-. Another possibility is to sort your data on ttpr and use _n==_N*0.3 and _n==_N*0.7 (or ttpr[trunc(_N*0.3)] and ttpr[trunc(_N*0.7)]) to get your cutpoints. (Sorry, I don't have access to Stata at the moment, otherwise I could give you more useful ideas.) But if you have skewed distributions, you will probably need extra code to deal with cases where you have duplicate cutpoints.
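One possible way to avoid hard-coded cutpoints, sketched here under assumptions rather than offered as a tested recommendation: compute the 30th and 70th percentiles with -_pctile-, keep them in scalars, and pass them to -irecode()-. The toy data and the names p30, p70, and gp3 are made up for illustration.

```stata
* Toy data (hypothetical values), stored as double to avoid float rounding
clear
input double ttpr
0
0
0
0
0
0
0.01
2.481
27.044
43039
end

* -_pctile- leaves the requested percentiles in r(r1), r(r2), ...
_pctile ttpr, percentiles(30 70)
scalar p30 = r(r1)
scalar p70 = r(r2)

* irecode() only needs weakly increasing cutpoints, so p30 == p70 is allowed:
*   0 if ttpr <= p30, 1 if p30 < ttpr <= p70, 2 if ttpr > p70
generate byte gp3 = irecode(ttpr, scalar(p30), scalar(p70))
tabulate gp3, missing
```

If the two percentiles coincide, the middle code 1 is simply empty. Whether the resulting bottom and top groups are meaningful for a distribution this skewed is a separate question.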
Regards,
Joe Canner
Johns Hopkins University School of Medicine
________________________________________
From: [email protected] [[email protected]] on behalf of Zhang Fanfan [[email protected]]
Sent: Sunday, August 25, 2013 7:37 AM
To: [email protected]
Subject: st: cut into groups
Hi all,
I want to cut my data into groups based on the variable -ttpr-. Using -pctile- and then listing the result, I got the following percentiles:
     +--------+
     | p_ttpr |
     |--------|
  1. |      0 |
  2. |      0 |
  3. |      0 |
  4. |      0 |
  5. |      0 |
     |--------|
  6. |      0 |
  7. |    .01 |
  8. |  2.481 |
  9. | 27.044 |
 10. |      . |
     +--------+
Then how can I use the cut command to get two groups, 0-30% and 70%-100%?
egen gp=cut(ttpr),at(0 0 0.01 43039) does not work.
Thanks
Fanfan.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/