Svend's solution is good. Lovers of brevity will note
that it can be collapsed to a single line:
bysort pid : egen wave1 = max(wave == 1)
where -total()- will do as well as -max()-, so long as wave 1
occurs at most once in each panel.
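As a minimal sketch of why the one-liner works, here it is on a few made-up observations. The pid values 1 and 2 are hypothetical and not Sara's data. The expression wave == 1 evaluates to 1 or 0, so its maximum within a panel flags whether wave 1 is present at all.

```stata
* Toy data (hypothetical): panel 1 lacks wave 1, panel 2 includes it
clear
input pid wave
1 2
1 4
2 1
2 2
2 3
end

* (wave == 1) is 1 when true and 0 otherwise, so the panel maximum
* is 1 if wave 1 occurs anywhere in that panel
bysort pid : egen wave1 = max(wave == 1)

list, sepby(pid)
```

Every observation in panel 2 gets wave1 equal to 1, and every observation in panel 1 gets 0.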
For a more general discussion, see a posting
on selection of panels on 2 April:
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0704/Author/article-30.html
Suppose you have some response -y-. Its
value for wave 1 can be captured thus:
by pid : egen y1 = total(cond(wave == 1, y, .))
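Here -max()- and -min()- would do as well as -total()- whenever
wave 1 is present. If -wave- is never 1 in a panel, -max()- and
-min()- give missings everywhere in that panel, but -total()-
treats missings as zero and so returns 0 unless you add its
-missing- option.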
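The same idea can be tried on made-up data. This is only a sketch: the pid and y values are hypothetical, and it uses -max()- rather than -total()-. Within each panel, cond() keeps -y- only where wave is 1 and -egen- then spreads that value to every observation.

```stata
* Toy data (hypothetical): panel 1 has no wave 1, panel 2 does
clear
input pid wave y
1 2 5
1 4 7
2 1 3
2 2 8
2 3 6
end

* cond() yields y for wave 1 and missing otherwise; max() ignores
* missings and returns the single wave 1 value for the whole panel
bysort pid : egen y1 = max(cond(wave == 1, y, .))

list, sepby(pid)
```

Panel 2 gets y1 equal to 3 in every wave; panel 1, which never has wave 1, gets missing.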
A revision is under way of the FAQ
"How can I generate a variable relating panel data to a reference panel?"
http://www.stata.com/support/faqs/stat/panelref.html
which extends the discussion to
"How can I generate a variable relating panel data to a reference panel or time?"
That may not be ready for some weeks, but the Stata logic
is the same.
Nick
Svend Juul
> Sara asked:
>
> 1. I have an unbalanced panel and wish to create a variable that
> identifies the presence of wave 1 respondents in subsequent waves in
> order to test for attrition.
>
> 2. I wish to distribute the value of wave 1 dependent variables in the
> unbalanced panel over subsequent waves.
>
>      +-------------------------+
>      |      pid   wave   wave1 |
>      |-------------------------|
>   1. | 10016872      2       0 |
>   2. | 10016872      4       0 |
>      |-------------------------|
>   3. | 10017992      1       1 |
>   4. | 10017992      2       1 |
>   5. | 10017992      3       1 |
>   6. | 10017992      4       1 |
>   7. | 10017992      6       1 |
>      |-------------------------|
>      | 00040404      2       0 |
>      |-------------------------|
>  19. | 10040439      1       1 |
>  20. | 10040439      2       1 |
>
> -------------------------------------------------
>
> This does it:
>
> recode wave (1=1) (*=0), generate(wave1)
> by pid, sort: egen wave2 = max(wave1)