Re: st: Count non-missing
In other terminology, Nikolaos wants to identify, and to give sequence
numbers to, spells of non-missing values. When his -x- is missing,
Nikolaos wants the counter to be missing; when it is non-missing, he
wants the counter to go 1, 2, 3, ... .
The principles of identifying spells will be discussed in Stata Journal
7(2), out in about 5 weeks' time. Alternatively, -tsspell- from SSC is
one user-written tool in this territory. But you can get there directly
in at most two lines of Stata.
1. Consider the first non-missing value in each spell. Then
!missing(x) & missing(x[_n-1]) (*)
will evaluate to 1 for the observation with that value. There are
two true-or-false conditions here:
    !missing(x)         this value of x is not missing
and
    missing(x[_n-1])    the previous value of x is missing.
Jointly, these two conditions define the first non-missing value in a
spell.
Otherwise, (*) will evaluate to 0.
This criterion applies also to any non-missing value that is the first
observed, as it then becomes
!missing(x[1]) & missing(x[0])
This is not problematic, as any varname[0] is evaluated as missing.
For numeric x, -!missing(x)- and -x < .- are equivalent, as any non-missing
value is deemed to be less than any missing value. Sometimes we write
-x < .- just for brevity.
2. We should set up a count
gen seq = .
3. Now the key step is
replace seq = cond(!missing(x) & missing(x[_n-1]), 1, seq[_n-1] + 1) ///
    if x < .
That is
(a) if this observation contains the first non-missing value of a spell,
set the count to 1
(b) otherwise, take the previous count and add 1.
4. Nikos wants to do this for panel data, but the generalisation is
easy. Here it is:
----------------------------------------------- NJC solution
gen seq = .
bysort i (t) : replace seq = ///
    cond(!missing(x) & missing(x[_n-1]), 1, seq[_n-1] + 1) if x < .
-----------------------------------------------
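As a check, here is a minimal sketch that types in the example data quoted
at the end of this post and applies the panel solution to it. The name
-want- (holding the poster's desired var4) is only a label chosen for this
illustration, and the /// continuations assume the lines are run from a
do-file rather than typed interactively.

```stata
clear
* example data from the original question; want = the desired var4
input i t x want
1 1 1 1
1 2 0 2
1 3 1 3
1 4 . .
1 5 1 1
2 1 1 1
2 2 1 2
2 3 . .
2 4 0 1
2 5 0 2
end

* start with an all-missing counter
gen seq = .

* within each panel, in time order: 1 at the start of a spell,
* otherwise previous count + 1; observations with missing x are skipped
bysort i (t) : replace seq = ///
    cond(!missing(x) & missing(x[_n-1]), 1, seq[_n-1] + 1) if x < .

* stops with an error if any observation differs from the desired result
assert seq == want
list, sepby(i)
```

Under -by:-, x[_n-1] for the first observation of each panel refers to
x[0] within that panel, which is missing, so each panel starts a new spell
without any extra condition on i.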
Now compare Svend Juul's solution.
----------------------------------------------- Svend Juul solution
sort i t
gen var4=0
replace var4=1 if i>i[_n-1] & x<.
replace var4=1 if i==i[_n-1] & x[_n-1]==.
replace var4=1 if _n==1
replace var4=0 if x==.
replace var4=var4[_n-1]+1 if i==i[_n-1] & x<.
recode var4 (0=.)
------------------------------------------------
The principles are the same.
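To see the agreement concretely, here is a small sketch that runs both
solutions on made-up data. The values are hypothetical, chosen only so that
one panel starts with a missing value and another has two missing values in
a row; they are not from the original question.

```stata
clear
* hypothetical data: panel 1 starts with a missing x,
* panel 2 has two consecutive missing values
input i t x
1 1 .
1 2 1
1 3 1
2 1 1
2 2 .
2 3 .
2 4 0
end

* NJC solution
gen seq = .
bysort i (t) : replace seq = ///
    cond(!missing(x) & missing(x[_n-1]), 1, seq[_n-1] + 1) if x < .

* Svend Juul solution
sort i t
gen var4=0
replace var4=1 if i>i[_n-1] & x<.
replace var4=1 if i==i[_n-1] & x[_n-1]==.
replace var4=1 if _n==1
replace var4=0 if x==.
replace var4=var4[_n-1]+1 if i==i[_n-1] & x<.
recode var4 (0=.)

* both counters should agree in every observation
assert seq == var4
list, sepby(i)
```

Both rely on the same fact: -replace- works through the observations in
order, so seq[_n-1] or var4[_n-1] already holds the updated count when the
current observation is processed.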
Nick
[email protected]
Nikolaos Kanellopoulos
I am trying to create a variable that counts the number of nonmissing
values for another variable, but starts counting from the beginning when
a missing value is found.
In the following example I have an individual identifier (i) and a time
variable (t). I want to create var4, which counts the number of
non-missing observations of x by i and t, but I want it to restart
counting when a missing value appears in x.
+------------------+
| i   t   x   var4 |
|------------------|
| 1   1   1      1 |
| 1   2   0      2 |
| 1   3   1      3 |
| 1   4   .      . |
| 1   5   1      1 |
|------------------|
| 2   1   1      1 |
| 2   2   1      2 |
| 2   3   .      . |
| 2   4   0      1 |
| 2   5   0      2 |
+------------------+