Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: add up variable / quantile |
Date | Thu, 14 Apr 2011 07:17:29 +0100 |
Good. I had in my mind "Why not use -xtile- anyway?", but sorry, that
didn't make it to my previous post.

Nick

On Wed, Apr 13, 2011 at 10:26 PM, Scharnigg, Stan (Stud. SBE)
<s.scharnigg@student.maastrichtuniversity.nl> wrote:
> Thank you, that was indeed the problem. I solved it with the xtile command.
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [n.j.cox@durham.ac.uk]
> Sent: Wednesday, 13 April 2011 22:07
> To: 'statalist@hsphsun2.harvard.edu'
> Subject: RE: st: add up variable / quantile
>
> I don't follow this well, but I see that you are including a comparison with r(p75), which is presumably thought to be left over from some previous command.
>
> However, r-class results are ephemeral and don't stick around forever. In particular, they get overwritten by -egen-, which does its own internal -count- at some point.
>
> If, however, r(p75) is undefined, then it's treated as missing and your comparison would be whether values were missing or greater than that, which is evidently not true for your data.
>
> I rarely understand why people who have quantitative data want to degrade it to indicator variables. That sounds most unstatistical to me.
>
> That aside, I think you need to reissue whatever command produced the r(p75) before you try to use it.
>
> Nick
> n.j.cox@durham.ac.uk
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Scharnigg, Stan (Stud. SBE)
> Sent: 13 April 2011 20:42
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: add up variable / quantile
>
> I still have a problem with this.
>
> My goal is to identify the top quantile for each year (6 years in total) in different variables.
>
> What I did so far:
> -----------------------------------------------------------
> gen year=year(newdate) // create years instead of normal dates
> egen gross_performance_years=total(gross_performance), by(accountID year) // create gross_performance per year
> egen tag_year=tag(RekeningID year) // tag a 1 for each year per accountID
> -----------------------------------------------------------
>
> This all works fine. However, now I want to create the top quantile variable for each year. So I did the following:
>
> gen topq_2000=gross_performance_years >=r(p75) & year==y(2000)
>
> However, this doesn't work. I only get "0" as value.
>
> I also tried this:
>
> generate topq_2000 = 0
> replace topq_2000 = 1 if gross_performance_year >=r(p75) & tag_year==1 & year==2002
>
> but without success.
> Does anybody have any tips on how I can do this?
>
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
> Sent: Tuesday, 29 March 2011 12:44
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: add up variable / quantile
>
> On Tue, Mar 29, 2011 at 11:30 AM, Scharnigg, Stan (Stud. SBE)
> <s.scharnigg@student.maastrichtuniversity.nl> wrote:
>> Look at the help for -egen-. You want
>>
>> egen total_gross = total(gross), by(accountID)
>> -----------
>>
>> Thank you, but I have some additional questions:
>>
>> A. I have data for 6 years (72 months). What if I want to add up the gross_performance for, e.g., the first 12 months? So, I guess I need to
>> create different variables for different time periods, but I am not sure how to do that. One extensive possibility might be that I create a different dataset
>> for every period, but I guess there might also be another solution.
>>
>> accountID; gross_performance; date
>> 1          -.1                jan_00
>> 1           .2                febr_00
>> 3           .1                jan_00
>> 3           .1                febr_00
>
> You can specify -if- on an -egen- command. Different summaries will in
> general require new variables (but not new datasets).
>
>> B. If I use egen total_gross = total(gross_performance), by(accountID) I get many duplicate values. In some cases I have
>> 72 duplicate values. What is the best way to delete the duplicate values, so that they won't show up if I do some tests? I don't
>> think that renaming them to "0" is an option then.
>
> You can use
>
> egen tag = tag(accountID)
>
> and then add
>
> ... if tag
>
> to commands to ensure that each summary is used once only. You cannot
> delete (in Stata -drop-) without losing other information.
>
> Alternatively, -collapse- will yield a reduced dataset with one
> observation for each account.
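
The points made in this thread (reissue the command that leaves r(p75) behind immediately before using it, and count each account-year only once via a tag) can be put together in a minimal sketch. The variable names accountID, newdate and gross_performance are taken from the posts; gross_year, tag_year, p75_year and topq are hypothetical names introduced here for illustration, and the code assumes newdate is a daily Stata date. This is one possible way to flag the top quarter, not necessarily what the original poster ended up running with -xtile-.

```stata
* Assumed from the thread: accountID, newdate, gross_performance
* Hypothetical names: gross_year, tag_year, p75_year, topq, topq_2000

gen year = year(newdate)
egen gross_year = total(gross_performance), by(accountID year)
egen tag_year = tag(accountID year)        // 1 for one observation per account-year

* Option 1: one year at a time. -summarize, detail- is reissued
* directly before r(p75) is used, so nothing can overwrite it in between.
gen byte topq_2000 = .
summarize gross_year if tag_year & year == 2000, detail
replace topq_2000 = gross_year >= r(p75) if tag_year & year == 2000

* Option 2: all years at once. -egen, pctile()- stores each year's
* 75th percentile in a variable, so no r-class result is needed.
egen p75_year = pctile(gross_year) if tag_year, p(75) by(year)
gen byte topq = gross_year >= p75_year if tag_year
```

In both options the indicator is defined only on the tagged observations, so later commands can add "if tag_year" (or simply use the nonmissing values of the indicator) and each account-year is counted once.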