
st: Survival time to prevalence data - efficient code?

From	"Arne Kolstad" <[email protected]>
To	<[email protected]>
Subject	st: Survival time to prevalence data - efficient code?
Date	Tue, 9 Sep 2003 00:05:30 +0200

I have survival time data about sickness spells, in the following form:

personid     startdate     stopdate
1            01may1997    07dec1997
1            28jan2002    09feb2002
2            31jul1994    06mar1998
.
.
N            31dec2002    (censored)

---


What I need is a table a) with the prevalence (number of persons sick) for each day:

day                  spersons
01jan1994            897
02jan1994            789
.
.
31dec2002            987

---

and a table b) of person-days of sickness for each month through the period
of interest:


month              pdays
jan1994            22345
feb1994            24567
.
.
dec2002            26789

---


I believe I can produce my a) data set like this:

* 12419 = 01jan1994, 15705 = 31dec2002 (days since 01jan1960)
forvalues x = 12419/15705 {
    quietly stdes if startdate <= `x' & stopdate > `x'
    di r(N_sub)
}

Now to the real problem: the data set has more than 5 million records.
Looping through thousands of days is slow, partly because stdes does a lot
of work, and I need to repeat it many times as different versions of the
data are produced. Is there a more efficient method?
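
One possible direction, given here only as an untested sketch and not as a known answer: treat each spell as a +1 event on startdate and a -1 event on stopdate, then take a running sum over days. The running sum on day x equals the number of spells with startdate <= x and stopdate > x, which is the condition used in the loop above. The sketch assumes that startdate and stopdate are numeric daily dates, that stopdate is missing for censored spells, and that a person's spells never overlap (so counting spells equals counting persons). The variable names spell, change, day and month are made up for illustration.

```stata
* Sketch only: assumes startdate/stopdate are numeric daily dates,
* stopdate is missing when censored, and spells do not overlap within person.
keep startdate stopdate
generate long spell = _n

* Two records per spell: first is the start (+1), second is the stop (-1)
expand 2
bysort spell: generate byte change = cond(_n == 1, 1, -1)
bysort spell: generate long day = cond(_n == 1, startdate, stopdate)
drop if missing(day)              // censored spells have no stop event

* Net change in the number sick on each day that has any event
collapse (sum) change, by(day)

* Add the days on which nothing happened, with zero net change
tsset day
tsfill
replace change = 0 if missing(change)

* Table a): running sum of net changes = persons sick on each day
sort day
generate long spersons = sum(change)
format day %td
keep if inrange(day, 12419, 15705)   // 01jan1994 to 31dec2002
list day spersons in 1/5

* Table b): person-days per month = sum of the daily prevalences
generate month = mofd(day)
format month %tm
collapse (sum) pdays = spersons, by(month)
```

This makes one pass over the spells instead of one stdes call per day, at the cost of temporarily doubling the number of records with expand. Note that tsfill only fills gaps between the first and last event dates, so days outside that range would not appear in the result.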

Follow-Ups:
- st: RE: Survival time to prevalence data - efficient code?
  - From: "Nick Cox" <[email protected]>
