Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Cumulative Frequencies within Groups
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: RE: Cumulative Frequencies within Groups
Date
Fri, 3 Aug 2012 23:47:42 +0100
See also -groups- (SJ and SSC).
Nick
On 3 Aug 2012, at 22:20, "Ben Hoen" <[email protected]> wrote:
As often happens...preparing the email prompted me to think
differently.
Here is the solution:
*=========== Begin ==================
sysuse auto, clear
g levels=(int(4*runiform()))+1 // create a categorical variable representing 4 groups
bys levels rep78 : gen freq = _N
bys levels rep78 : gen cumfreq = _N if _n == 1
bys levels: replace cumfreq = sum(cumfreq)
bys levels: tabdisp rep78, cell(freq cumfreq)
*============ End ====================
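A variant of the same idea, offered only as a sketch: it spells out the sort order within levels, so the running total does not rely on the data still being sorted from the previous command. The seed value and the long storage type are assumptions added for illustration; they are not part of the original post.

```stata
sysuse auto, clear
set seed 2012                        // assumed seed, only so the random groups are reproducible
gen levels = int(4*runiform()) + 1   // categorical variable with values 1 to 4

* frequency of each rep78 value within each level
bysort levels rep78: gen long freq = _N

* running count of observations within each level, in rep78 order
bysort levels (rep78): gen long cumfreq = _n

* give every observation in a levels-by-rep78 cell the cell's largest running count
bysort levels rep78 (cumfreq): replace cumfreq = cumfreq[_N]

bysort levels: tabdisp rep78, cell(freq cumfreq)
```

As in the code above, observations with missing rep78 sort last within each level, so they do not change the cumulative counts shown for the nonmissing values.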
Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Ben Hoen
Sent: Friday, August 03, 2012 4:49 PM
To: [email protected]
Subject: st: Cumulative Frequencies within Groups
Hi all,
I have been unable to figure out how to create cumulative frequencies
within groups.
This Nick Cox entry provided a great description for how to create
cumulative frequencies for a whole set of data:
http://www.stata.com/support/faqs/data-management/tabulating-cumulative-frequencies/
But I have not been able to apply the same logic to groups within my
dataset.
For the full dataset, this code works perfectly (as Nick provided):
*======= Begin ===============
sysuse auto, clear
by rep78, sort: gen freq = _N
by rep78: gen cumfreq = _N if _n == 1
replace cumfreq = sum(cumfreq)
tabdisp rep78, cell(freq cumfreq)
* ========== End ==================
I would like to apply the same logic to groups of cases. Here is my
best attempt.
* =========== Begin ===============
g levels=(int(4*runiform()))+1 // create a categorical variable representing 4 groups
su levels, meanonly
forvalues i = 1/`r(max)' { // max is used because often I will not know the number of groups
    bys rep78: gen freqtemp=_N if levels==`i'
    by rep78: gen cumfreqtemp=_N if _n==1 & levels==`i'
    replace cumfreqtemp=sum(cumfreqtemp) if levels==`i'
    replace freq=freqtemp if levels==`i'
    replace cumfreq=cumfreqtemp if levels==`i'
    drop freqtemp cumfreqtemp
}
*
bys levels: tabdisp rep78, cell(freq cumfreq)
* =========== End =====================
As one can see, my code has a problem that I cannot discern.
Any ideas?
Thanks, in advance,
Ben
Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
[email protected]
http://eetd.lbl.gov/ea/emp/staff/hoen.html
Visit our publications at:
http://eetd.lbl.gov/ea/ems/emp-pubs.html
Sign up for our email list to receive publication notifications:
http://eetd.lbl.gov/ea/emp/list/emp_pubs_signup.php
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/