Re: st: Re: spells of missing values completely in between
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Re: spells of missing values completely in between
Date: Sun, 8 Apr 2012 00:08:57 +0100
I don't know about elegant, but here is one approach.
First identify spells of missing values:
. tsspell, cond(missing(var2))
. l
     +------------------------------------+
     | var1   var2   _seq   _spell   _end |
     |------------------------------------|
  1. |    1      .      1        1      0 |
  2. |    2      .      2        1      0 |
  3. |    3      .      3        1      1 |
  4. |    4     56      0        0      0 |
  5. |    5      .      1        2      0 |
     |------------------------------------|
  6. |    6      .      2        2      0 |
  7. |    7      .      3        2      1 |
  8. |    8     95      0        0      0 |
  9. |    9      .      1        3      1 |
 10. |   10     20      0        0      0 |
     |------------------------------------|
 11. |   11      .      1        4      0 |
 12. |   12      .      2        4      1 |
     +------------------------------------+
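For anyone without tsspell installed, the three variables it creates can be approximated by hand. What follows is only a sketch, assuming the data are sorted by var1 with no gaps; the names miss, start, myspell, myseq and myend are made up for illustration, and this is not a claim about how tsspell itself is coded.

```stata
* Rebuild the example data (same values as in the listing above)
clear
set obs 12
gen var1 = _n
gen var2 = cond(var1 == 4, 56, cond(var1 == 8, 95, cond(var1 == 10, 20, .)))
sort var1

* miss marks missing values; start marks the first observation of each run
gen byte miss  = missing(var2)
gen byte start = miss & miss[_n-1] != 1

* Spell number: running count of starts, reset to 0 outside spells
gen long myspell = sum(start)
replace myspell = 0 if !miss

* Position within spell, and a flag for the last observation of each spell
bysort myspell (var1): gen long myseq = cond(miss, _n, 0)
by myspell: gen byte myend = miss & _n == _N
sort var1
list var1 var2 myseq myspell myend, sep(0)
```

The comparison miss[_n-1] != 1 is used rather than !miss[_n-1] because the lagged value is missing in the first observation, and a missing value counts as true in Stata.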
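Reset _spell to 0 if the first spell of missing values is at the start of
the data. The running sum sum(missing(var2)) equals _n only while every
observation so far is missing.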
. replace _spell = 0 if sum(missing(var2)) == _n
(3 real changes made)
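To see why that criterion picks out only the leading missings, it may help to look at the running sum directly. This is a small illustration on the same example data; nmiss and leading are hypothetical variable names introduced here.

```stata
* Rebuild the example data
clear
set obs 12
gen var1 = _n
gen var2 = cond(var1 == 4, 56, cond(var1 == 8, 95, cond(var1 == 10, 20, .)))
sort var1

* Running count of missing values up to and including each observation
gen long nmiss = sum(missing(var2))

* The count keeps pace with _n only until the first nonmissing value
gen byte leading = nmiss == _n
list var1 var2 nmiss leading, sep(0)
```

Once a nonmissing value has been seen, the running count falls behind _n and can never catch up, so later spells are not affected.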
Reverse time and do the same with the last spell (now the first).
. gsort -var1
. replace _spell = 0 if sum(missing(var2)) == _n
(2 real changes made)
Now _spell is positive if and only if you have a spell of missing
values in the middle.
. sort var1
. l
     +------------------------------------+
     | var1   var2   _seq   _spell   _end |
     |------------------------------------|
  1. |    1      .      1        0      0 |
  2. |    2      .      2        0      0 |
  3. |    3      .      3        0      1 |
  4. |    4     56      0        0      0 |
  5. |    5      .      1        2      0 |
     |------------------------------------|
  6. |    6      .      2        2      0 |
  7. |    7      .      3        2      1 |
  8. |    8     95      0        0      0 |
  9. |    9      .      1        3      1 |
 10. |   10     20      0        0      0 |
     |------------------------------------|
 11. |   11      .      1        0      0 |
 12. |   12      .      2        0      1 |
     +------------------------------------+
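If only an indicator is wanted, a variation that avoids both tsspell and the re-sorting is to count nonmissing values: a missing value is in the middle if at least one nonmissing value precedes it and at least one follows it. This is a sketch of that alternative idea, not the approach used above; n_before, n_total and mid are hypothetical names.

```stata
* Rebuild the example data
clear
set obs 12
gen var1 = _n
gen var2 = cond(var1 == 4, 56, cond(var1 == 8, 95, cond(var1 == 10, 20, .)))
sort var1

* Nonmissing values seen so far, and nonmissing values in the whole series
gen long n_before = sum(!missing(var2))
egen long n_total = total(!missing(var2))

* Missing, with something observed earlier and something still to come
gen byte mid = missing(var2) & n_before > 0 & n_before < n_total
list var1 var2 mid, sep(0)
```

For panel data the same idea should carry over by computing both counts within panels, but that is not tested here.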
If it's important to you that the spells are numbered from 1 upwards, you
can renumber them.
. egen spell = group(_spell) if _spell
(8 missing values generated)
. l
     +--------------------------------------------+
     | var1   var2   _seq   _spell   _end   spell |
     |--------------------------------------------|
  1. |    1      .      1        0      0       . |
  2. |    2      .      2        0      0       . |
  3. |    3      .      3        0      1       . |
  4. |    4     56      0        0      0       . |
  5. |    5      .      1        2      0       1 |
     |--------------------------------------------|
  6. |    6      .      2        2      0       1 |
  7. |    7      .      3        2      1       1 |
  8. |    8     95      0        0      0       . |
  9. |    9      .      1        3      1       2 |
 10. |   10     20      0        0      0       . |
     |--------------------------------------------|
 11. |   11      .      1        0      0       . |
 12. |   12      .      2        0      1       . |
     +--------------------------------------------+
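The renumbering can also be done with a running count of spell starts instead of egen. The sketch below is self-contained, so it first builds a middle-spell indicator by counting nonmissing values; mid and spell2 are hypothetical names, and the indicator step is an assumption of this example rather than part of the steps above.

```stata
* Rebuild the example data
clear
set obs 12
gen var1 = _n
gen var2 = cond(var1 == 4, 56, cond(var1 == 8, 95, cond(var1 == 10, 20, .)))
sort var1

* Indicator for missing values with nonmissing values on both sides
gen long n_before = sum(!missing(var2))
egen long n_total = total(!missing(var2))
gen byte mid = missing(var2) & n_before > 0 & n_before < n_total

* A spell starts where mid is 1 and the previous observation is not in a
* middle spell; the running count of starts numbers the spells 1, 2, ...
gen long spell2 = sum(mid & mid[_n-1] != 1) if mid
list var1 var2 mid spell2, sep(0)
```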
See also http://www.stata.com/support/faqs/data/dropmiss.html
If memory serves me right, Gary Longton suggested the criterion
sum(missing(varname)) == _n for spells of missings at the beginning of
the data.
On Sat, Apr 7, 2012 at 12:01 PM, Abhimanyu Arora
<[email protected]> wrote:
> Somehow I feel the step in which the temp is generated can be omitted
> by a clever use of the -cond- option.
>
> On Sat, Apr 7, 2012 at 12:57 PM, Abhimanyu Arora
> <[email protected]> wrote:
>> Dear statalist
>> I was wondering if there is a more elegant solution than the one below
>> involving SSC's tsspell by Nick Cox
>> . which tsspell
>> c:\ado\plus\t\tsspell.ado
>> *! 2.0.0 NJC 13 August 2002
>>
>> The aim is to create an indicator for spells of missing values
>> completely in between a series (excluding those at the beginning or
>> the end). (A part in the process of interpolating a time series,
>> basically)
>>
>> I used the following set of commands.
>>
>>
>> clear
>> set obs 12
>> gen var1=_n
>> tsset var1
>> input var2
>> .
>> .
>> .
>> 56
>> .
>> .
>> .
>> 95
>> .
>> 20
>> .
>> end
>> tsspell var2
>> egen temp=max( _spell)
>> gen ind=(var2==. & _spell==1)|(var2==. & temp==_spell)| (var2!=.)
>>
>> Cheers
>> Abhimanyu
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/