Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: Variable running totals |
Date | Fri, 1 Jun 2012 21:51:42 +0100 |
Thank you Nick and Jorge Eduardo! I will forward both your solutions to my colleague.

Cheers,
Mark

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: 01 June 2012 01:59
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Variable running totals
>
> Looping over observations is easier than might be thought. A
> discussion at
>
> SJ-7-3  pr0033  . . . . . . . . . . . . . .  Stata tip 51: Events in intervals
>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q3/07   SJ 7(3):440--443                                 (no commands)
>         tip for counting or summarizing irregularly spaced
>         events in intervals
>
> is accessible at
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033
>
> In this case, consider
>
> * sandpit
>
> input id date count30
> 1 1000 1
> 1 1002 2
> 1 1002 3
> 1 1200 1
> 1 1250 1
> 2 1050 1
> 2 1059 2
> 2 1085 2
> end
>
> * solution code
>
> gen mycount30 = .
> qui forval i = 1/`=_N' {
>     count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
>     replace mycount30 = r(N) in `i'
> }
>
> I suggest that this code is simpler than Jorge Eduardo's.
> Relative efficiency will depend on the number of identifiers
> and the number of observations (and, I suggest, on how long
> it takes to write code and revise it for related problems).
>
> Nick
>
> On Thu, May 31, 2012 at 9:27 PM, Schaffer, Mark E
> <M.E.Schaffer@hw.ac.uk> wrote:
>
> > Hi all. "Variable running totals" isn't the best description of the
> > problem, but it's not too far off.
> >
> > A colleague has written to me with the following problem. He has a
> > panel dataset with two variables: id and date. (He has some other
> > variables but those are the two that matter.) There may be multiple
> > observations on id for a given date. The date variable is in Stata
> > %td format (#days after 01jan1960). So it looks like this:
> >
> > id date
> > 1 1000
> > 1 1002
> > 1 1002
> > 1 1200
> > 1 1250
> > 2 1050
> > 2 1059
> > 2 1085
> >
> > ...etc.
> >
> > The question is, how to construct a variable that counts the number of
> > observations that an individual (id) appears in the dataset up to 30
> > days previously. If we call the variable count30, it would look like
> > this:
> >
> > id date count30
> > 1 1000 1
> > 1 1002 2
> > 1 1002 3
> > 1 1200 1
> > 1 1250 1
> > 2 1050 1
> > 2 1059 2
> > 2 1085 2
> >
> > ...etc.
> >
> > I suspect there's an easy way of doing this, but the only ways I could
> > think of involved brute force looping through observations.
> >
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

--
Heriot-Watt University is the Sunday Times Scottish University of the Year 2011-2012
Heriot-Watt University is a Scottish charity registered under charity number SC000278.
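
A note on the loop quoted above. As posted, it counts every observation with the same id whose date falls in the 30-day window, so the two observations for id 1 dated 1002 would both get 3, whereas the count30 column in the original question shows 2 and then 3. If a running count in dataset order is what is wanted, one possible variant is sketched below. It is not from the thread: the variable name mycount30b and the extra condition on _n are assumptions made for illustration, and tied dates are taken to count in their current sort order.

```stata
* sandpit: the example data from the thread
clear
input id date count30
1 1000 1
1 1002 2
1 1002 3
1 1200 1
1 1250 1
2 1050 1
2 1059 2
2 1085 2
end

* assumption: ties on date are counted in their existing order,
* so a stable sort is used to keep that order
sort id date, stable

* mycount30b is a hypothetical name, not used in the thread
gen long mycount30b = .
quietly forvalues i = 1/`=_N' {
    * same window as the quoted loop, restricted to observations
    * at or before observation `i' in the current sort order
    count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i']) & _n <= `i'
    replace mycount30b = r(N) in `i'
}

* check against the count30 column given in the question
assert mycount30b == count30
list, sepby(id)
```

Dropping the condition on _n gives back the loop as quoted, which may be the better definition if tied dates should all carry the same count.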