Nikolas wrote:
I am trying to create a variable that counts the number of nonmissing values
of another variable, but restarts the count whenever a missing value is
found.
In the following example I have an individual identifier (i) and a time
variable (t). I want to create var4, which counts the nonmissing
observations of x within each i, in order of t, and starts again from 1
after a missing value of x.
+------------------+
| i   t   x   var4 |
|------------------|
| 1   1   1      1 |
| 1   2   0      2 |
| 1   3   1      3 |
| 1   4   .      . |
| 1   5   1      1 |
|------------------|
| 2   1   1      1 |
| 2   2   1      2 |
| 2   3   .      . |
| 2   4   0      1 |
| 2   5   0      2 |
+------------------+
------------------------------------------------------------
There may be some smarter way (time series, panel functions),
but the following seems to work. Where x is missing I temporarily
set var4 to 0, and recode it to missing at the end:
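sort i t
* start every observation at 0, the temporary code for "x is missing"
gen var4 = 0
* the count starts at 1 on the first observation of each i,
* and on the observation that follows a missing x
replace var4 = 1 if i > i[_n-1] & x < .
replace var4 = 1 if i == i[_n-1] & missing(x[_n-1])
replace var4 = 1 if _n == 1
replace var4 = 0 if missing(x)
* replace works through the data in order, so var4[_n-1] is already updated
replace var4 = var4[_n-1] + 1 if i == i[_n-1] & x < .
recode var4 (0 = .)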
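
One possible alternative, given only as a sketch: mark the spells between
missing values with a running sum, then count within each spell. The names
spell and var4b are made up for this illustration, and the data are simply
the example from the question typed in again so that the lines run on their
own.

```stata
clear
input i t x
1 1 1
1 2 0
1 3 1
1 4 .
1 5 1
2 1 1
2 2 1
2 3 .
2 4 0
2 5 0
end

* spell goes up by one at every missing x within i (hypothetical helper name)
bysort i (t): gen spell = sum(missing(x))

* running count of nonmissing x within each spell; stays missing where x is missing
bysort i spell (t): gen var4b = sum(!missing(x)) if !missing(x)

list i t x var4b, sepby(i)
```

On the example data this gives 1 2 3 . 1 for i = 1 and 1 2 . 1 2 for i = 2,
the same as var4 in the question. The helper variable spell can be dropped
afterwards.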
Hope this helps
Svend
________________________________________________________
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000 Aarhus C, Denmark
Phone, work: +45 8942 6090
Phone, home: +45 8693 7796
Fax: +45 8613 1580
_________________________________________________________