Re: st: capturing the sizes of the sequences of continuous (uninterrupted) values equal to 1
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: capturing the sizes of the sequences of continuous (uninterrupted) values equal to 1
Date: Wed, 30 Nov 2011 09:52:15 +0000
Sorry; previous post was sent too soon.
Here is a toy example using -tsspell- (SSC). I think what you want is
created here as _seq, except that you need to subtract 1.
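* tsspell is from SSC: ssc install tsspell
clear
set obs 10
gen id = _n
* five 0/1 indicators in wide layout, each 1 with probability 0.7
forval j = 1/5 {
    gen time`j' = runiform() < 0.7
}
* one row per id and time; -reshape- stores the suffix in _j
reshape long time, i(id)
rename time state
rename _j time
tsset id time
* mark runs of consecutive 1s within each id
tsspell, cond(state==1)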
. l
+------------------------------------------+
| id time state _seq _spell _end |
|------------------------------------------|
1. | 1 1 1 1 1 0 |
2. | 1 2 1 2 1 0 |
3. | 1 3 1 3 1 1 |
4. | 1 4 0 0 0 0 |
5. | 1 5 0 0 0 0 |
|------------------------------------------|
6. | 2 1 1 1 1 0 |
7. | 2 2 1 2 1 0 |
8. | 2 3 1 3 1 0 |
9. | 2 4 1 4 1 0 |
10. | 2 5 1 5 1 1 |
|------------------------------------------|
11. | 3 1 1 1 1 0 |
12. | 3 2 1 2 1 0 |
13. | 3 3 1 3 1 0 |
14. | 3 4 1 4 1 0 |
15. | 3 5 1 5 1 1 |
|------------------------------------------|
16. | 4 1 1 1 1 1 |
17. | 4 2 0 0 0 0 |
18. | 4 3 0 0 0 0 |
19. | 4 4 1 1 2 0 |
20. | 4 5 1 2 2 1 |
|------------------------------------------|
21. | 5 1 1 1 1 0 |
22. | 5 2 1 2 1 1 |
23. | 5 3 0 0 0 0 |
24. | 5 4 1 1 2 1 |
25. | 5 5 0 0 0 0 |
|------------------------------------------|
26. | 6 1 1 1 1 1 |
27. | 6 2 0 0 0 0 |
28. | 6 3 1 1 2 0 |
29. | 6 4 1 2 2 0 |
30. | 6 5 1 3 2 1 |
|------------------------------------------|
31. | 7 1 1 1 1 0 |
32. | 7 2 1 2 1 0 |
33. | 7 3 1 3 1 1 |
34. | 7 4 0 0 0 0 |
35. | 7 5 1 1 2 1 |
|------------------------------------------|
36. | 8 1 1 1 1 0 |
37. | 8 2 1 2 1 1 |
38. | 8 3 0 0 0 0 |
39. | 8 4 1 1 2 1 |
40. | 8 5 0 0 0 0 |
|------------------------------------------|
41. | 9 1 1 1 1 0 |
42. | 9 2 1 2 1 0 |
43. | 9 3 1 3 1 0 |
44. | 9 4 1 4 1 0 |
45. | 9 5 1 5 1 1 |
|------------------------------------------|
46. | 10 1 0 0 0 0 |
47. | 10 2 0 0 0 0 |
48. | 10 3 1 1 1 0 |
49. | 10 4 1 2 1 1 |
50. | 10 5 0 0 0 0 |
+------------------------------------------+
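The "subtract 1" step can be sketched as follows, on a small fixed dataset that reproduces ids 1 and 4 from the listing above. This is only one reading of the request: the variable name length is an assumption, and so is the choice to floor the result at 0 so that observations outside a run do not end up as -1.

```stata
* Assumes tsspell is installed: ssc install tsspell
clear
input id time state
1 1 1
1 2 1
1 3 1
1 4 0
1 5 0
4 1 1
4 2 0
4 3 0
4 4 1
4 5 1
end
tsset id time
tsspell, cond(state == 1)
* length (hypothetical name): number of earlier 1s in the current run,
* floored at 0 for observations that are not in a run
gen length = max(_seq - 1, 0)
list id time state _seq length, sepby(id)
```

If the current date should be counted as well, _seq can be used directly without the subtraction.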
> On Wed, Nov 30, 2011 at 9:36 AM, Nick Cox <[email protected]> wrote:
>> You can't get this information given your data structure into a single
>> Stata variable. What you seek is a matrix.
>>
>> If w <= 244, you could try concatenating your variables into a string
>> variable holding individuals' history.
>>
>> But I guess this would be easier after -reshape long-. Then a spell is
>> defined as a sequence with all 1s for the same id. See then
>>
>> SJ-7-2 dm0029 . . . . . . . . . . . . . . Speaking Stata: Identifying spells
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>> Q2/07 SJ 7(2):249--265 (no commands)
>> shows how to handle spells with complete control over
>> spell specification
>>
>> tsspell from http://fmwww.bc.edu/RePEc/bocode/t
>> 'TSSPELL': module for identification of spells or runs in time series /
>> tsspell examines the data, which must be tsset time series, to / identify
>> spells or runs, which are contiguous sequences defined / by some
>> condition. tsspell generates new variables indicating / distinct spells,
>>
>> Nick
>>
>> On Wed, Nov 30, 2011 at 9:24 AM, massimiliano stacchini
>> <[email protected]> wrote:
>>
>>> I have a huge dataset. The rows identify the person ID (i) (i=1,...,n), while the columns hold the reference dates TIME(t) (t=1,...,w). Each cell contains the value 1 or 0.
>>>
>>> I need to create a variable (LENGTH) that varies over both ID and TIME.
>>> For each i of ID(i) and each t of TIME(t), LENGTH should capture the number of continuous (uninterrupted) values equal to 1 in the cells starting from reference date t of TIME and moving backwards through the previous reference dates.
>>> In other terms, LENGTH should capture, for each (i) of ID and each (t) of TIME, the number of s in T (t-s) identifying cells with values equal to 1 (i.e., the size of the sequence of uninterrupted 1s moving backwards through the previous reference dates).
>>>
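For readers without -tsspell-, the same running count can be sketched in base Stata once the data are in long form, in the spirit of the by-group approach suggested in the quoted reply. This is a minimal sketch, not the method of the cited column: the variable name run is an assumption, and the code assumes one observation per id and time with no gaps in time.

```stata
clear
input id time state
1 1 1
1 2 1
1 3 1
1 4 0
1 5 0
4 1 1
4 2 0
4 3 0
4 4 1
4 5 1
end
* run (hypothetical name): position within the current run of 1s, 0 otherwise
bysort id (time): gen run = state == 1
* -replace- works through observations in order, so run[_n-1] is already updated
by id: replace run = run[_n-1] + 1 if state == 1 & _n > 1
list, sepby(id)
```

On these data run matches the _seq values shown for ids 1 and 4 in the listing.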