At first sight, it looks as if this is a problem solved by -panelthin- on SSC.
The existence of two dates is not problematic, as they are a fixed time apart.
As I understand it, Kaspar needs to declare his data to be panel data with identifier something like
egen id = group(companyname name)
tsset id eventdate
and then apply -panelthin-.
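
For anyone without -panelthin- installed, here is a minimal sketch of the same thinning done by hand. It is not -panelthin-'s own code. It assumes the dates are converted to numeric daily dates as in Martin's code, that "overlap" means an event starting before the end of the last event kept (so an event starting on the very day the previous one ends is kept), and that neither date is missing. The names lastend and tokeep are made up for illustration.

```stata
clear
input byte trade str9 companyname str13 name str10(eventdate postdays)
1 "Company A" "Margetts, Rob" 14-Feb-03 06-Mar-03
2 "Company A" "Margetts, Rob" 17-Feb-03 09-Mar-03
3 "Company A" "Margetts, Rob" 14-Nov-03 04-Dec-03
4 "Company A" "Margetts, Rob" 14-May-04 03-Jun-04
5 "Company B" "Michael, Ham" 06-Aug-04 26-Aug-04
6 "Company B" "Michael, Ham" 07-Aug-04 27-Aug-04
7 "Company B" "Michael, Ham" 15-Aug-04 02-Sep-04
8 "Company B" "Michael, Ham" 28-Aug-04 08-Sep-04
end

* convert the string dates to numeric daily dates, as in Martin's code
gen long eventdate2 = date(eventdate, "DM20Y")
gen long postdays2  = date(postdays, "DM20Y")
format eventdate2 postdays2 %td

* one panel per company-person pair
egen id = group(companyname name)

* lastend (hypothetical name) = end date of the most recent event kept so far.
* -replace- works through the observations in order, so lastend[_n-1] has
* already been updated by the time observation _n is evaluated.
bysort id (eventdate2 trade): gen long lastend = postdays2 if _n == 1
by id: replace lastend = cond(eventdate2 >= lastend[_n-1], postdays2, lastend[_n-1]) if _n > 1

* keep the first event in each panel, plus any event that starts on or
* after the end of the last kept event before it
by id: gen byte tokeep = _n == 1 | eventdate2 >= lastend[_n-1]
keep if tokeep
drop lastend tokeep

list trade companyname name eventdate2 postdays2, noobs sepby(id)
```

On the eight example trades this keeps trades 1, 3, 4, 5 and 8. Trades 6 and 7 are dropped because they start before trade 5 ends, while trade 8 is compared with trade 5 (the last event kept) rather than with trade 7, so it survives.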
Nick
Kaspar Dardas
Thanks for your prompt answer. However, this won't solve my problem.
Since I did not describe the problem properly, that is on me. Your
solution is perfect for what I described.
First, I use the -by- groups companyname AND name. I need both, since
the same name can appear in a different company as well. I thought
that implementing -by- groups would be easy, but with your solution it
is not possible, or I don't know how to handle it...
Furthermore, a situation might arise where two overlaps exist (see
trades 6 and 7). Trades 6 and 7 have to be dropped. However, trade 8 is
fine, since it does not interfere with trade 5. I don't think your
solution will capture this, or will it?
> 1 "Company A" "Margetts, Rob" 14-Feb-03 06-Mar-03
> 2 "Company A" "Margetts, Rob" 17-Feb-03 09-Mar-03
> 3 "Company A" "Margetts, Rob" 14-Nov-03 04-Dec-03
> 4 "Company A" "Margetts, Rob" 14-May-04 03-Jun-04
> 5 "Company B" "Michael, Ham" 06-Aug-04 26-Aug-04
> 6 "Company B" "Michael, Ham" 07-Aug-04 27-Aug-04
> 7 "Company B" "Michael, Ham" 15-Aug-04 02-Sep-04
> 8 "Company B" "Michael, Ham" 28-Aug-04 08-Sep-04
2009/11/24 Martin Weiss:
> No sign so far of which -by- groups you want, although I can see that they are relevant here. For the simple example, this is possible code:
>
>
>
> *************
> clear*
>
> inp byte trade companyname:mylab name:mylab2 /*
> */ str10(eventdate postdays), auto
> 1 "Company A" "Margetts, Rob" 14-Feb-03 06-Mar-03
> 2 "Company A" "Margetts, Rob" 17-Feb-03 09-Mar-03
> 3 "Company A" "Margetts, Rob" 14-Nov-03 04-Dec-03
> 4 "Company A" "Margetts, Rob" 14-May-04 03-Jun-04
> 5 "Company A" "Margetts, Rob" 06-Aug-04 26-Aug-04
> end
>
> compress
>
> gen eventdate2=date(eventdate, "DM20Y")
> format eventdate2 %tdMonth_DD,_CCYY
> gen postdays2=date(postdays, "DM20Y")
> format postdays2 %tdMonth_DD,_CCYY
>
> drop eventdate postdays
>
> gen byte overlap= /*
> */ eventdate2 < postdays2[_n-1] /*
> */ in 2/`=_N'
>
> //not just "overlap" as missing in first obs
> drop if overlap==1
>
> list, noobs
> *************
Kaspar Dardas
> I have the following problem. I need to create a non-overlapping event
> data set. I have two date variables 1st: "eventdate" which is the
> start of the event and 2nd: "postdays", which is the end of the event.
> The length of the event is exactly 20 days (which is, of course, the
> difference between "eventdate" and "postdays"). If two events overlap
> (which happens in trade 1 and 2, in the example below) I only take
> the first event and drop the 2nd (trade 2).
> How can I do this in Stata? I have about 140000 trades, several
> thousand companies, etc.
>
> trade  companyname  name           eventdate  postdays
> 1      Company A    Margetts, Rob  14-Feb-03  06-Mar-03
> 2      Company A    Margetts, Rob  17-Feb-03  09-Mar-03
> 3      Company A    Margetts, Rob  14-Nov-03  04-Dec-03
> 4      Company A    Margetts, Rob  14-May-04  03-Jun-04
> 5      Company A    Margetts, Rob  06-Aug-04  26-Aug-04
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/