Do you really want just the last observation? If so, you are
throwing away the information that individual 1 survived 3009 units
of time before (presumably) failure, which is rather different from
assuming he/she/it survived 1003 units, which is what taking only
the last observation would imply.
A simple collapse might be what you want:
. list, sepby(id)
     +------------------------------------+
     | id   year   surviv~e   status   x1 |
     |------------------------------------|
  1. |  1   2000        200        0    3 |
  2. |  1   2001        300        0    3 |
  3. |  1   2002         50        0    3 |
     |------------------------------------|
  4. |  2   2002         70        0    6 |
  5. |  2   2003        250        1    6 |
     |------------------------------------|
  6. |  3   2005         45        0    2 |
  7. |  3   2006        154        0    2 |
  8. |  3   2007         56        1    2 |
     |------------------------------------|
  9. |  4   2006         34        0    8 |
 10. |  4   2007         45        1    8 |
     +------------------------------------+
(Not quite your data set, but one I prepared earlier.)
. collapse (sum) survivaltime (max) status x1, by(id)
. list
     +-----------------------------+
     | id   surviv~e   status   x1 |
     |-----------------------------|
  1. |  1        550        0    3 |
  2. |  2        320        1    6 |
  3. |  3        255        1    2 |
  4. |  4         79        1    8 |
     +-----------------------------+
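
To reproduce the listing above, the toy data can be typed in with
input. This is only a sketch: the values are the ones shown in the
first listing, and using (max) for status assumes the status code
never goes back down within an id.

```stata
clear
input id year survivaltime status x1
1 2000 200 0 3
1 2001 300 0 3
1 2002  50 0 3
2 2002  70 0 6
2 2003 250 1 6
3 2005  45 0 2
3 2006 154 0 2
3 2007  56 1 2
4 2006  34 0 8
4 2007  45 1 8
end

* one record per id: total survival time, highest status, and x1
collapse (sum) survivaltime (max) status x1, by(id)
list
```

Because x1 is constant within id, (max) simply carries its value
through to the collapsed record.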
If you really want just the last record:
. sort id year
. by id (year): gen last=(_n==_N)
. list, sepby(id)
     +-------------------------------------------+
     | id   year   surviv~e   status   x1   last |
     |-------------------------------------------|
  1. |  1   2000        200        0    3      0 |
  2. |  1   2001        300        0    3      0 |
  3. |  1   2002         50        0    3      1 |
     |-------------------------------------------|
  4. |  2   2002         70        0    6      0 |
  5. |  2   2003        250        1    6      1 |
     |-------------------------------------------|
  6. |  3   2005         45        0    2      0 |
  7. |  3   2006        154        0    2      0 |
  8. |  3   2007         56        1    2      1 |
     |-------------------------------------------|
  9. |  4   2006         34        0    8      0 |
 10. |  4   2007         45        1    8      1 |
     +-------------------------------------------+
. dosomething if last==1
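
If the aim is literally to drop every record except the last one per
id, the indicator variable can be skipped. A minimal sketch, assuming
the variables are named id and year as in the listings above, and
using only the first two ids of the toy data to keep it short:

```stata
clear
input id year survivaltime status x1
1 2000 200 0 3
1 2001 300 0 3
1 2002  50 0 3
2 2002  70 0 6
2 2003 250 1 6
end

* sort by year within id and keep only the last record of each id
bysort id (year): keep if _n == _N
list
```

This changes the data in memory, so it is worth running it on a copy
(or after preserve) if the full panel is still needed.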
Cheers
Joseph
On Wed, Jul 8, 2009 at 7:02 AM, <[email protected]> wrote:
> Dear statalist,
>
> because I have only time-invariant covariates I would like to reduce my
> panel data set (structure see below) to a cross-sectional data set.
>
> The structure of my panel data set:
>
> Id year Status Survivaltime x1 x2
> 1 2000 0 1003
> 1 2001 0 1003
> 1 2002 1 1003
>
> 2 2003 0 298
> 2 2004 2 298
>
> 3 1998 0 3989
> 3 1999 0 3989
> 3 2000 0 3989
> 3 2001 0 3989
> 3 2002 0 3989
> 3 2003 0 3989
> 3 2004 0 3989
> 3 2005 0 3989
>
> Desired structure of data set:
>
> Id year Status Survivaltime x1 x2
> 1 2002 1 1003
>
> 2 2004 2 298
>
> 3 2005 0 3989
>
> I would like to keep only the last observation (which gives me the outcome
> of survival or death). My problem is that the outcome does not always
> occur in the same year (as you can see, it can happen in 2002 or 2005 or
> any time between 1998 and 2008). The other problem I face is that I
> cannot use the command drop if status==0, because 0 can also be the desired
> outcome in the last year.
>
> Does anybody know how I could keep the last observation and drop all other
> observation?
>
> Thank you so much for your help.
>
> Best regards
> Frauke
>
>
>
>
> ___________________________________
>
> Frauke Rüther, Dipl.-Kffr., M.A.
> Research Associate
> University of St.Gallen
> Institute of Technology Management
> Dufourstrasse 40a
> CH - 9000 St. Gallen
>
> Phone +41 (0)71 224 7225
> Fax +41 (0)71 224 7301
> Email [email protected]
> Web www.item.unisg.ch
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>