Re: st: Variable running totals
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Variable running totals
Date: Fri, 1 Jun 2012 01:58:34 +0100

Looping over observations is easier than might be thought. A discussion at

    SJ-7-3  pr0033   Stata tip 51: Events in intervals
            N. J. Cox
            Q3/07   SJ 7(3):440--443   (no commands)
            tip for counting or summarizing irregularly spaced
            events in intervals

is accessible at http://www.stata-journal.com/sjpdf.html?articlenum=pr0033

In this case, consider

    * sandpit
    input id date count30
    1 1000 1
    1 1002 2
    1 1002 3
    1 1200 1
    1 1250 1
    2 1050 1
    2 1059 2
    2 1085 2
    end

    * solution code
    gen mycount30 = .
    qui forval i = 1/`=_N' {
        count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
        replace mycount30 = r(N) in `i'
    }

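
One detail worth checking on the sandpit data is tied dates. The loop above counts every observation for the same id within the window, so both observations dated 1002 get 3, whereas the requested count30 shows 2 and then 3. If a running count within ties is what is wanted, a minimal sketch follows. It assumes that the current order of observations is the order in which ties should be counted, and the variable name mycount30b is just an illustration.

```stata
* sketch: running count within the 30-day window, ties counted in data order
clear
input id date count30
1 1000 1
1 1002 2
1 1002 3
1 1200 1
1 1250 1
2 1050 1
2 1059 2
2 1085 2
end

gen mycount30b = .
qui forval i = 1/`=_N' {
    * assumption: only observations at or before position `i' are counted
    count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i']) & _n <= `i'
    replace mycount30b = r(N) in `i'
}

list id date count30 mycount30b, sepby(id)
```

The only change is the extra condition on _n, so the result depends on the sort order of tied observations.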
I suggest that this code is simpler than Jorge Eduardo's. Relative
efficiency will depend on the number of identifiers and the number of
observations (and, I suggest, on how long it takes to write code and
revise it for related problems).
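
For anyone wanting to compare run times on their own data, one possible way is Stata's timer commands. This is only a sketch on the sandpit data; the timer number 1 is arbitrary, and timings on eight observations say nothing about a large panel.

```stata
* sketch: time the loop over observations with timer
clear
input id date
1 1000
1 1002
1 1002
1 1200
1 1250
2 1050
2 1059
2 1085
end

timer clear 1
timer on 1
gen mycount30 = .
qui forval i = 1/`=_N' {
    count if id == id[`i'] & inrange(date, date[`i'] - 30, date[`i'])
    replace mycount30 = r(N) in `i'
}
timer off 1

* elapsed time in seconds for timer 1
timer list 1
```

Wrapping any alternative solution in the same timer calls, with a different timer number, gives a like-for-like comparison.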
Nick
On Thu, May 31, 2012 at 9:27 PM, Schaffer, Mark E <[email protected]> wrote:
> Hi all. "Variable running totals" isn't the best description of the
> problem, but it's not too far off.
>
> A colleague has written to me with the following problem. He has a
> panel dataset with two variables: id and date. (He has some other
> variables but those are the two that matter.) There may be multiple
> observations on id for a given date. The date variable is in Stata %td
> format (#days after 01jan1960). So it looks like this:
>
> id date
> 1 1000
> 1 1002
> 1 1002
> 1 1200
> 1 1250
> 2 1050
> 2 1059
> 2 1085
>
> ...etc.
>
>
> The question is how to construct a variable that counts the number of
> observations in which an individual (id) appears in the dataset within
> the previous 30 days, up to and including the current observation. If
> we call the variable count30, it would look like this:
>
> id date count30
> 1 1000 1
> 1 1002 2
> 1 1002 3
> 1 1200 1
> 1 1250 1
> 2 1050 1
> 2 1059 2
> 2 1085 2
>
> ...etc.
>
> I suspect there's an easy way of doing this, but the only ways I could
> think of involved brute force looping through observations.
>