Comments on three levels:
1. Don't do this. If you do this, you lock
yourself into needing to exclude such records from
most later analyses. Unless memory is a real
issue, have a variable containing this
information, i.e. store it in an extra column,
not extra rows. Even though values are repeated
for every observation in each panel, that is
usually not a problem and indeed often helpful.
So to get what you need, without changing
the number of records:

    egen died = max(year), by(id)
    replace died = died + 1

2. If you don't accept my advice, you need
something like this:
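
    * 0 on the original records, 1 on each added record
    gen byte died = 0
    * number the firms 1, 2, ... so that we can loop over them
    egen group = group(id)
    su group, meanonly
    qui forval g = 1/`r(max)' {
        * last observed year for this firm
        su year if group == `g', meanonly
        * duplicate that record; the copy goes to the end of the data
        expand 2 if group == `g' & year == r(max)
        * l (letter el) means the last observation, i.e. the copy
        replace year = r(max) + 1 in l
        replace died = 1 in l
    }
    drop group
    sort id year
    edit id year died
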
3. Really, don't do this. See #1.
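
For a quick check of #1, here is a minimal sketch on toy data typed in to mirror the example in the question. The values are only those shown in the question; the variable name last is my own addition for illustration, and x1 and x2 are left out.

```stata
* Hypothetical toy data mirroring the example in the question
clear
input id year
1 1993
1 1994
1 1995
2 1995
2 1996
2 1997
3 1993
3 1994
3 1995
end

* Year after the last observed year, repeated on every row of each firm
egen died = max(year), by(id)
replace died = died + 1

* Illustrative extra (an assumption, not in the original advice):
* flag the final observed record of each firm
gen byte last = year == died - 1

list, sepby(id)
```

Every record of firm 1 and firm 3 should then show died equal to 1996, and every record of firm 2 should show 1998, with no records added. Whether a firm last seen in 2003 should count as dying in 2004, rather than as still alive when the sample ends, is a separate decision that this sketch does not address.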
Nick
xiao yi wrote:
> I have an unbalanced panel dataset (cross-section and
> time-series) which looks like this:
>
> ID year x1 x2
> 1 1993 .. ..
> 1 1994 .. ..
> 1 1995 .. ..
> 2 1995 .. ..
> 2 1996 .. ..
> 2 1997 .. ..
> 3 1993 .. ..
> 3 1994 .. ..
> 3 1995 .. ..
>
> The data is about firms' survival analysis. The last
> year of my sample is 2003. In the example above, firm
> 1 dies in 1996 because the last year of this firm
> is 1995 and firm 2 dies in 1998, and so on. I want to
> add a record at the end of each subject (firm)
> indicating the year when the firm dies. Can anyone
> tell me how to accomplish this? Thank you in advance.