Your example indicates that there may be gaps in the data.
Subject 31 was not measured in several years, e.g. 1987 and
1988. So if you want classification according to
consecutive years, you need to specify how missing values
[meaning years not present in the data] are to be
treated. Or do you mean just consecutive tests?
In addition, it seems possible in principle that
an individual could be assigned to two or more
classes according to different parts of their history.
So I am not sure that writing specific code is
the best answer until these ambiguities are
resolved. But if you type

. bysort id (dot) : gen t = _n
. tsset id t

you can look for spells in your data according to
your stated criteria. The user-written
program -tsspell- from SSC can then be used. It has
a fairly detailed help file.
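
As a rough sketch of that approach, and not a full answer to the
classification question: the code below assumes that -dot- is a numeric
daily date, that -protein- is a string variable holding "0", "T", "1"
and so on, and that "consecutive" means consecutive tests rather than
consecutive calendar years. The variable name -maxrun- is made up for
illustration.

```stata
* Sketch only. Assumes id, dot (numeric daily date) and protein (string)
* as in the posted example. tsspell is user-written; install it once with
*     ssc install tsspell
bysort id (dot) : gen t = _n
tsset id t

* Mark spells of consecutive tests (not calendar years) with a trace result.
* By default tsspell creates _spell (spell number within id), _seq (position
* within the spell, 0 if not in a spell) and _end (1 at the last test of a
* spell).
tsspell, cond(protein == "T")

* Length of the longest run of trace results for each subject
* (maxrun is a hypothetical name).
egen maxrun = max(_seq), by(id)

list id dot protein _spell _seq _end if id == 31, sepby(_spell)
```

For subject 31 in the example, the longest such run is three tests
(1989 to 1991). Whether that counts as "Sustained" still depends on how
the gaps between tests are to be treated.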
Nick

Raphael Fraser wrote:
> I have a longitudinal data set that contains nearly 500 patients. All
> patients were tested at these times dot (date of test) for the level
> of protein in the blood; the result being 0 (no protein) T (trace
> amounts of protein), 1, 2, 3 or 4. I would like to classify these
> subjects based on the criteria below:
>
> "Minimal" if protein is T on at least 2 out of 3 consecutive years.
> "Sustained" if the result is minimal and lasts 3 years or more.
> "Heavy" if sustained with protein 2 or greater lasting 3
> years or more.
>
> id dot protein
> 31 15mar1985 T
> 31 14mar1986 0
> 31 15mar1989 T
> 31 15mar1990 T
> 31 15mar1991 T
> 31 15mar1993 0
> 31 18feb1994 T
> 31 07jun1995 0
> 31 23aug1996 1
> 31 10may1999 T