Overlapping records are clearly a problem
for this code. It just adds up blindly.
Let's copy our end dates to -svcdate_end2-
. gen svcdate_end2 = svcdate_end
Now we sort
. sort enrolid svcdate svcdate_end2
Our problems arise if the previous
-svcdate_end2- is >= the current
-svcdate-. Or, reversing this,
if the next -svcdate- is <=
the present -svcdate_end2-.
That's what overlap is.
However, we can get confused
by intervals wholly within
previous intervals, so
. by enrolid: drop if svcdate >= svcdate[_n-1] & svcdate_end2 <= svcdate_end2[_n-1]
Then we go
. by enrolid: replace svcdate_end2 = svcdate[_n+1] - 1 if svcdate_end2 >= svcdate[_n+1]
and then work with the modified
copy.
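Working with the modified copy might look like
this. It is only a sketch: it reuses the window
from the earlier post, -cont2- and -sumservice2-
are made-up names, and the max(0, ...) wrapper is
an addition here so that a record lying wholly
outside the window counts as zero days rather than
as a negative number.
. gen cont2 = max(0, 1 + min(svcdate_end2, mdy(12,31,2001)) - max(mdy(1,1,2000), svcdate))
. egen sumservice2 = sum(cont2), by(enrolid)
A check by hand: records for 1 Jan 2000 to
10 Jan 2000 and 5 Jan 2000 to 15 Jan 2000 become
1 Jan to 4 Jan and 5 Jan to 15 Jan, so the total
is 4 + 11 = 15 days, not 10 + 11 = 21.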
Note: I am not sure this copes
with all possible quirks.
More important note: Somebody must
have solved this problem before!
Assertion: This should be soluble
without loops.
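One loop-free possibility, offered as an untested
sketch rather than a settled answer: clip each
record to the window, carry along the latest end
date seen so far for each person, and count only
those days in each record that lie beyond it.
The names -lo-, -hi-, -sofar-, -newdays- and
-sumdays- are invented for this sketch, and it
assumes no missing dates. It leans on two Stata
behaviours: -replace- works through observations
in order, so sofar[_n-1] is already updated when
it is used, and max() ignores missing arguments,
so the first record of each person is handled
without a special case.
. gen long lo = max(svcdate, mdy(1,1,2000))
. gen long hi = min(svcdate_end, mdy(12,31,2001))
. sort enrolid lo hi
. by enrolid: gen long sofar = hi if _n == 1
. by enrolid: replace sofar = max(sofar[_n-1], hi) if _n > 1
. by enrolid: gen long newdays = max(0, hi - max(lo, sofar[_n-1] + 1) + 1)
. egen sumdays = sum(newdays), by(enrolid)
A check by hand, for one person with records
covering 1-10, 5-15, 7-8 and 20-25 Jan 2000:
-sofar- runs 10, 15, 15, 25 Jan and -newdays- is
10, 5, 0, 6, so -sumdays- is 21, which matches
15 days for 1-15 Jan plus 6 days for 20-25 Jan.
(In Stata 9 and later the -egen- function -sum()-
is called -total()-.)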
Nick
[email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of
> [email protected]
> Sent: 11 May 2004 20:38
> To: [email protected]
> Subject: Re: st: RE: forvalues within foreach?
>
>
> Thank you Nick. The dataset I am working with, however,
> contains non-disjoint records, can the approach you provided be
> modified to address overlapping service dates? When records do
> overlap, I do not want to double-count.
> --Clint Thompson
>
>
>
> On 11 May 2004 at 20:23, Nick Cox wrote:
>
> > In program 2 you have several problems.
> >
> > The local macro 0 contains what you type
> > after the program name, in your case
> > a (probably unexpanded) varlist. Although
> > the -syntax- statement will expand it,
> > that doesn't affect `0'. So the first step is
> > to go to
> >
> > program var_rep
> > version 8
> > syntax varlist(numeric)
> > local n 10
> > foreach var of local varlist {
> > forvalues i = 14610(1)`n' {
> > replace `var' = 1 if (`i' >= svcdate & `i' <= svcdate_end), by(enrolid)
> > }
> > }
> > end
> >
> > But that still leaves two bugs that I can see:
> >
> > 1. 14610(1)10 won't go anywhere. You mean 14610/14619.
> >
> > 2. -replace- doesn't take a -by()- option.
> >
> > However, given your problem, a direct attack
> > is possible, I believe, without any loops whatsoever or
> > indeed any programs whatsoever.
> >
> > Your structure appears to be
> >
> > enrolid svcdate svcdate_end
> >
> > First check that the dates are
> > the right way round in every case
> >
> > . assert svcdate <= svcdate_end
> >
> > Possibly you even have several
> > records for each person. That's no
> > problem, so long as they are disjoint.
> >
> > For each person, you want #days in
> > service between 1 Jan 2000 and
> > 31 Dec 2001. The length of relevant service is
> >
> > min(svcdate_end, mdy(12,31,2001))
> > -max(mdy(1,1,2000),svcdate)
> >
> > So I think what you want is
> >
> > gen cont = 1 + min(svcdate_end, mdy(12,31,2001)) - max(mdy(1,1,2000),svcdate)
> > egen sumservice = sum(cont), by(enrolid)
> >
> > Note the 1, based on the assumption that anyone who
> > arrived and left the same day is regarded as serving
> > 1 day, etc. Delete according to taste.
> >
> > Nick
> > [email protected]
> >
> > [email protected]
> >
> > > I have two small (and clumsy) programs wherein the objective is
> > > to create a variable for each day over a two year time frame
> > > (01Jan2000 - 31Dec2001) then assign the value 1 if the subject
> > > was on service, as defined by two variables: svcdate &
> > > svcdate_end. My programs are pasted below; the first one
> > > (var_gen) generates the variables as expected (note that I
> > > limited variable generation to just the first 10 days in 2000).
> > > The second program, however, executes when run but it does not
> > > return a 1 where it should. I suspect that the problem may be
> > > with the forvalues loop in the foreach statement. Any advice or
> > > suggestions? My ultimate objective is to sum the total number of
> > > days each subject was on service over the two year period.
> > > Thank you. Clint Thompson
> > >
> > > Program #1:
> > > program var_gen
> > > version 8
> > > local N 10
> > > forvalues i = 1(1)`N' {
> > > gen day`i' = 0
> > > }
> > > end
> > >
> > >
> > > Program #2:
> > > program var_rep
> > > version 8
> > > syntax varlist(numeric)
> > > local n 10
> > > foreach var of local 0 {
> > > forvalues i = 14610(1)`n' {
> > > replace `var' = 1 if (`i' >= svcdate & `i' <= svcdate_end), by(enrolid)
> > > }
> > > }
> > > end
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/