Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Variable running totals
From: "Schaffer, Mark E" <[email protected]>
To: <[email protected]>
Subject: RE: st: Variable running totals
Date: Fri, 1 Jun 2012 21:51:42 +0100
Thank you Nick and Jorge Eduardo! I will forward both your solutions to my colleague.
Cheers,
Mark
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 01 June 2012 01:59
> To: [email protected]
> Subject: Re: st: Variable running totals
>
> Looping over observations is easier than might be thought. A
> discussion at
>
> SJ-7-3  pr0033  . . . . . . . .  Stata tip 51: Events in intervals
>         . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q3/07   SJ 7(3):440--443                     (no commands)
>         tip for counting or summarizing irregularly spaced
>         events in intervals
>
> is accessible at
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033
>
> In this case, consider
>
> * sandpit
>
> input id date count30
> 1 1000 1
> 1 1002 2
> 1 1002 3
> 1 1200 1
> 1 1250 1
> 2 1050 1
> 2 1059 2
> 2 1085 2
> end
>
> * solution code
>
> gen mycount30 = .
> qui forval i = 1/`=_N' {
>     count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
>     replace mycount30 = r(N) in `i'
> }
>
> I suggest that this code is simpler than Jorge Eduardo's.
> Relative efficiency will depend on the number of identifiers
> and the number of observations (and, I suggest, on how long
> it takes to write code and revise it for related problems).
>
>
>
> Nick
>
> On Thu, May 31, 2012 at 9:27 PM, Schaffer, Mark E
> <[email protected]> wrote:
>
> > Hi all. "Variable running totals" isn't the best description of the
> > problem, but it's not too far off.
> >
> > A colleague has written to me with the following problem. He has a
> > panel dataset with two variables: id and date. (He has some other
> > variables but those are the two that matter.) There may be multiple
> > observations on id for a given date. The date variable is in Stata
> > %td format (#days after 01jan1960). So it looks like this:
> >
> > id date
> > 1 1000
> > 1 1002
> > 1 1002
> > 1 1200
> > 1 1250
> > 2 1050
> > 2 1059
> > 2 1085
> >
> > ...etc.
> >
> >
> > The question is, how to construct a variable that counts the number of
> > observations that an individual (id) appears in the dataset up to 30
> > days previously. If we call the variable count30, it would look like
> > this:
> >
> > id date count30
> > 1 1000 1
> > 1 1002 2
> > 1 1002 3
> > 1 1200 1
> > 1 1250 1
> > 2 1050 1
> > 2 1059 2
> > 2 1085 2
> >
> > ...etc.
> >
> > I suspect there's an easy way of doing this, but the only ways I could
> > think of involved brute force looping through observations.
> >
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
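
A note on tied dates in the loop quoted above: as written, it counts every observation for the same id whose date falls in the 30-day window, so both observations with date 1002 would get 3, whereas the count30 column in the example shows 2 and then 3. If the intent is a running count that stops at the current observation, one possible variant (a sketch only, not part of the original posts) adds a condition on _n. It assumes the data are sorted by id and date, and the variable name mycount30b is made up for illustration.

```stata
* sketch: rebuild the sandpit data from the thread
clear
input id date count30
1 1000 1
1 1002 2
1 1002 3
1 1200 1
1 1250 1
2 1050 1
2 1059 2
2 1085 2
end

* assumption: rows within id are in date order; stable keeps ties in input order
sort id date, stable

* mycount30b is a hypothetical name for the tie-aware count
gen mycount30b = .
quietly forvalues i = 1/`=_N' {
    * same window as the quoted loop, but only rows up to and including row i
    count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i']) & _n <= `i'
    replace mycount30b = r(N) in `i'
}

list id date count30 mycount30b, sepby(id)
```

Whether tied observations should share the same count or be numbered in sequence depends on what the colleague needs; the extra condition only matters when an id has more than one observation on the same date.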