Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Data management on multiple rows per subject
From: Thomas Speidel <[email protected]>
To: <[email protected]>
Subject: st: Data management on multiple rows per subject
Date: Tue, 27 Aug 2013 14:33:00 -0600
I have a data management problem. Below is a sample of the data, which
has multiple rows per subject:
+------------------------------------------------------+
| id                start                  end   event |
|------------------------------------------------------|
|  9   10jan2011 00:00:00   10jan2011 21:29:59       A |
|------------------------------------------------------|
| 10   19dec2010 00:00:00   19dec2010 19:59:59       A |
|------------------------------------------------------|
| 11   23jan2011 08:15:00   24jan2011 18:00:00       A |
| 11   24jan2011 10:14:59   24jan2011 13:45:00       B |
| 11   26jan2011 06:00:00   26jan2011 07:00:00       B |
| 11   26jan2011 07:30:00   26jan2011 18:00:00       B |
|------------------------------------------------------|
| 12   17dec2010 02:44:59   18dec2010 01:30:00       A |
+------------------------------------------------------+
Within id, I need to drop the B rows whose date range (start to end) is
not contained in the date range of the A row.
In the example above, the result would be:
+------------------------------------------------------+
| id                start                  end   event |
|------------------------------------------------------|
|  9   10jan2011 00:00:00   10jan2011 21:29:59       A |
|------------------------------------------------------|
| 10   19dec2010 00:00:00   19dec2010 19:59:59       A |
|------------------------------------------------------|
| 11   23jan2011 08:15:00   24jan2011 18:00:00       A |
| 11   24jan2011 10:14:59   24jan2011 13:45:00       B |
|------------------------------------------------------|
| 12   17dec2010 02:44:59   18dec2010 01:30:00       A |
+------------------------------------------------------+
I know how to solve this using reshape, but the data are too complex to
handle comfortably that way (some subjects have too many rows).
I thought of subscripting, but did not get far. Within a subject, each B
date range is either fully contained in A or not contained in it at all.
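
A minimal sketch of one reshape-free way to apply the rule above, offered as
an illustration and not as a definitive solution. It assumes that start and
end are numeric %tc datetime variables, that event is a string variable, and
that each id has exactly one A row (as in the sample). The names s_start,
s_end, a_start and a_end are hypothetical helpers introduced only for this
example.

```stata
* Rebuild the sample data (s_start and s_end are hypothetical helper names).
clear
input int id str18 s_start str18 s_end str1 event
 9 "10jan2011 00:00:00" "10jan2011 21:29:59" "A"
10 "19dec2010 00:00:00" "19dec2010 19:59:59" "A"
11 "23jan2011 08:15:00" "24jan2011 18:00:00" "A"
11 "24jan2011 10:14:59" "24jan2011 13:45:00" "B"
11 "26jan2011 06:00:00" "26jan2011 07:00:00" "B"
11 "26jan2011 07:30:00" "26jan2011 18:00:00" "B"
12 "17dec2010 02:44:59" "18dec2010 01:30:00" "A"
end
generate double start = clock(s_start, "DMYhms")
generate double end   = clock(s_end, "DMYhms")
format start end %tc
drop s_start s_end

* Copy the A interval to every row of the same id.
* Assumption: one A row per id; non-A rows contribute missing and are ignored.
bysort id: egen double a_start = min(cond(event == "A", start, .))
bysort id: egen double a_end   = max(cond(event == "A", end, .))

* Drop B rows that are not fully inside the A interval of their id.
drop if event == "B" & (start < a_start | end > a_end)
drop a_start a_end
list, sepby(id)
```

Because the A interval is spread across each id with egen, the comparison is
done row by row and the number of rows per subject does not matter. If an id
could have several A rows, the min/max pair would describe only their overall
span, so the sketch would need to be adapted.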
Thank you.
--
Thomas Speidel