Adapting a Michael Blasnik dodge, we fill in
all gaps and sum over (at most) the last 365
days:
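* gap in days to the next observation within id (missing for the last one)
bysort id (date) : gen diff = date[_n+1] - date
local N = _N
* one row per day; expand treats a missing or nonpositive count as 1
expand diff
gen byte original = _n <= `N'
replace vol = 0 if !original
* the original row sorts last and keeps its date; copies fill the days after it
bysort id date (original) : replace date = date + _N - _n
* running total, less the running total 365 rows (= days) earlier;
* a separate variable is needed because replace would pick up
* already-updated values of X[_n-365]
bysort id (date) : gen double cum = sum(vol)
by id : gen double X = cum - cond(_n > 365, cum[_n-365], 0)
keep if original
drop original diff cum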
Notes:
1. No special code for leap years.
2. No checking for duplicate days within ids.
I've not thought through whether that's
a problem.
3. It would be nice to do this without
changing the data structure, even temporarily.
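On note 3, one possible sketch that leaves the data structure
alone is a plain loop over observations, summing vol for the same
id over the 365 days before each date. This is only an illustration,
not a tested replacement for the code above: the toy data are id 2
from the question with made-up values for vol, and the names sdate
and X2 are invented here. The window is a fixed 365 days (so still
no special code for leap years) and excludes the current day, as in
the question's examples. An observation with nothing in its window
gets 0.

```stata
* toy data: id 2 from the question, with made-up vol values
clear
input id str11 sdate vol
2 "01jan1994"  1
2 "10jan1994"  2
2 "25jan1994"  4
2 "03feb1994"  8
2 "28feb1994" 16
2 "01jan1995" 32
2 "11jan1995" 64
end
gen date = date(sdate, "DMY")
format date %td

* X2: sum of vol for the same id over the 365 days before date
gen double X2 = 0
forvalues i = 1/`=_N' {
    local lo = date[`i'] - 365
    local hi = date[`i'] - 1
    quietly summarize vol if id == id[`i'] & inrange(date, `lo', `hi'), meanonly
    * leave X2 at 0 when no observation falls in the window
    if r(N) > 0 quietly replace X2 = r(sum) in `i'
}
list id date vol X2, sepby(id)
```

The loop makes one pass through the data per observation, so it
will be slow on large datasets; the expand approach does more
up-front work but needs only a couple of passes.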
Nick
> -----Original Message-----
Jitian Sheu
> My question is as follows:
>
> Suppose I have data as following:
>
> obs  id  date         vol  X
>   1   1  01 jan 1995  a    -
>   2   1  28 feb 1995  b    a
>   3   1  31 oct 1996  c    a
>   4   1  25 dec 1996  d    d
>   5   1  29 dec 1996  e    c+d
>   6   1  15 nov 1997  f    c+d
>   7   2  01 jan 1994  g    -
>   8   2  10 jan 1994  h    g
>   9   2  25 jan 1994  i    g+h
>  10   2   3 feb 1994  j    g+h+i
>  11   2  28 feb 1994  k    g+h+i+j
>  12   2  01 jan 1995  l    g+h+i+j+k
>  13   2  11 jan 1995  m    g+h+i+j+k+l
>
>
> What I want is a new variable, say X, where X is the sum of
> "VOL" within the same ID, counting only the one year before
> the date of the observation in question.
>
> For example, look at observation #5.
> This observation has date = "29 Dec 1996" (for ID=1).
> The value of the new variable is the sum of all "VOL" that occurred between
> "29 Dec 1995" and "28 Dec 1996".
> In this case the value is "c+d".
>
> As another example, look at observation #10.
> This observation has date = "3 Feb 1994".
> I want X to be the sum of all VOL that occurred between
> "3 Feb 1993" and "2 Feb 1994", that is, g+h+i.