RE: st: re: data row transformation for irregular consecutive days
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: re: data row transformation for irregular consecutive days
Date: Tue, 23 Feb 2010 11:59:52 -0000
As Kit underlines, -tsspell- (from SSC) requires -tsset- data, but they must be correctly -tsset-!
As your data are panel data, you need to declare identifier and time variables. This example shows that, once you have done that, issuing -tsspell- using an example from the help file will identify spells defined by consecutive times.
. l
     +-----------+
     | id   time |
     |-----------|
  1. |  1      1 |
  2. |  1      2 |
  3. |  1      3 |
  4. |  1      5 |
  5. |  1      6 |
     |-----------|
  6. |  2      1 |
  7. |  2      2 |
  8. |  2      8 |
  9. |  2      9 |
     +-----------+
. tsset id time
       panel variable:  id (unbalanced)
        time variable:  time, 1 to 9, but with gaps
                delta:  1 unit
. tsspell, f(L.time == .)
. l
     +----------------------------------+
     | id   time   _spell   _seq   _end |
     |----------------------------------|
  1. |  1      1        1      1      0 |
  2. |  1      2        1      2      0 |
  3. |  1      3        1      3      1 |
  4. |  1      5        2      1      0 |
  5. |  1      6        2      2      1 |
     |----------------------------------|
  6. |  2      1        1      1      0 |
  7. |  2      2        1      2      1 |
  8. |  2      8        2      1      0 |
  9. |  2      9        2      2      1 |
     +----------------------------------+
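A minimal sketch of how the same idea might carry over to data keyed by a string symbol and a daily date, as in Kaspar's listing quoted below. The variable name id and the use of -encode- to obtain a numeric panel identifier are assumptions for illustration, and -tsspell- must be installed from SSC first.

```stata
* Sketch only: "id" and the -encode- step are illustrative assumptions.
* -tsspell- is user-written: ssc install tsspell
clear
input str3 symbol str10 days
"3IN" "04/02/2010"
"3IN" "05/02/2010"
"888" "12/05/2006"
"888" "15/05/2006"
"888" "25/09/2006"
"888" "26/09/2006"
"888" "27/09/2006"
"888" "28/09/2006"
"888" "03/10/2006"
end
gen date = date(days, "DMY")
format date %td
* -tsset- needs a numeric panel identifier, so encode the string symbol
encode symbol, gen(id)
tsset id date
* a spell starts wherever the previous day is absent within the same panel
tsspell, f(L.date == .)
* first and last date of each spell within each symbol
collapse (min) sdate = date (max) ndate = date, by(symbol _spell)
format sdate ndate %td
list, sepby(symbol)
```

Taking the minimum and maximum of date within each spell avoids relying on _seq and _end when collapsing; that is a design choice of this sketch rather than something the original code did.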
Nick
[email protected]
Kaspar Dardas wrote:
Hi Kit & Nick,
Thanks a lot. The solution almost worked. However, for some _spell
values I get too many observations. As you can see, the first _spell
has three observations (1 1 1), but it should have only two (1 1).
I cannot explain why some dates are "grouped" in the same _spell. Most
of them are correct, but some are incorrectly grouped. Did I do
something wrong? (I sorted my data by symbol and date, and I used
the code below.)
symbol  days        date   en  _spell  _seq  _end
3IN     04/02/2010  18297   1       1     1     0
3IN     05/02/2010  18298   2       1     2     0
888     12/05/2006  16933   3       1     3     1
888     15/05/2006  16936   4       2     1     1
888     25/09/2006  17069   5       3     1     0
888     26/09/2006  17070   6       3     2     0
888     27/09/2006  17071   7       3     3     0
888     28/09/2006  17072   8       3     4     1
888     03/10/2006  17077   9       4     1     0
gen date = date(days, "DMY")
sort symbol date
g en = _n
tsset en
tsspell date, fcond(D.date>1)
bys _spell: g sdate = date if _seq==1
bys _spell: g ndate = date if _end
collapse sdate ndate, by(symbol _spell)
Thanks,
Kaspar
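One way to see why the first 888 row joins the 3IN spell in the listing above: with -tsset en- the data form a single pooled series, so D.date at a change of symbol is the difference between two unrelated dates. Here it is negative, so the condition D.date>1 is false and no new spell starts. The sketch below uses only built-in commands; the variable names ddate, start_pooled and start_bysym are made up for illustration.

```stata
* Sketch only: ddate, start_pooled and start_bysym are illustrative names.
clear
input str3 symbol str10 days
"3IN" "04/02/2010"
"3IN" "05/02/2010"
"888" "12/05/2006"
"888" "15/05/2006"
"888" "25/09/2006"
"888" "26/09/2006"
"888" "27/09/2006"
"888" "28/09/2006"
"888" "03/10/2006"
end
gen date = date(days, "DMY")
format date %td
sort symbol date
gen en = _n
tsset en
* difference from the previous row, ignoring symbol boundaries
gen ddate = D.date
* spell-start flag as implied by fcond(D.date>1) on the pooled series
gen byte start_pooled = D.date > 1
* spell-start flag computed separately within each symbol
bysort symbol (date): gen byte start_bysym = (_n == 1) | (date - date[_n-1] > 1)
list symbol days ddate start_pooled start_bysym, sepby(symbol)
```

In row 3 ddate is 16933 - 18298 = -1365, so start_pooled is 0 while start_bysym is 1. The two flags agree on every other row of this extract.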
2010/2/22 Kit Baum <[email protected]>:
> <>
> Kaspar said
>
> Is there a fast way in Stata 11 to do this data transformation?
>
> What I have:
> symbol days
> AAL 04-10-2004
> AAL 10-01-2005
> AAL 11-01-2005
> AAL 12-01-2005
> AAL 01-04-2005
> AAL 04-04-2005
> AAL 06-06-2005
> AAL 07-06-2005
> AAL 08-06-2005
>
> What I need:
> AAL 04-10-2004 04-10-2004
> AAL 10-01-2005 12-01-2005
> AAL 01-04-2005 01-04-2005
> AAL 04-04-2005 04-04-2005
> AAL 06-06-2005 08-06-2005
>
>
> g date = date(var2,"DMY")
> g en = _n
> tsset en
> // requires N J Cox -tsspell- from SSC (findit tsspell)
> tsspell date, fcond(D.date>1)
> bys _spell: g sdate = date if _seq==1
> bys _spell: g ndate = date if _end
> l
> collapse sdate ndate, by(var1 _spell)
> format sdate %td
> format ndate %td
> l