Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Data management on multiple rows per subject
From
Joe Canner <[email protected]>
To
"[email protected]" <[email protected]>
Subject
st: RE: Data management on multiple rows per subject
Date
Tue, 27 Aug 2013 20:46:11 +0000
. * use double if start and end are %tc datetimes; the default float loses precision
. bys id (event): gen double Astart=start[1]
. bys id (event): gen double Aend=end[1]
. drop if event=="B" & (!inrange(start,Astart,Aend) | !inrange(end,Astart,Aend))
This can probably be simplified if your data are more predictable (as it sounds like they might be), but you get the idea.
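For anyone who wants to try the steps above on the sample data from the original post, here is a minimal self-contained sketch. It assumes that start and end are %tc datetime variables and that each id has exactly one A row; the names start_s and end_s are only temporary string holders made up for this example, not variables from the original data.

```stata
clear
input byte id str18 start_s str18 end_s str1 event
 9 "10jan2011 00:00:00" "10jan2011 21:29:59" "A"
10 "19dec2010 00:00:00" "19dec2010 19:59:59" "A"
11 "23jan2011 08:15:00" "24jan2011 18:00:00" "A"
11 "24jan2011 10:14:59" "24jan2011 13:45:00" "B"
11 "26jan2011 06:00:00" "26jan2011 07:00:00" "B"
11 "26jan2011 07:30:00" "26jan2011 18:00:00" "B"
12 "17dec2010 02:44:59" "18dec2010 01:30:00" "A"
end

* Convert the strings to %tc datetimes (assumption: the real data are %tc).
* Datetimes must be stored as doubles to keep millisecond precision.
gen double start = clock(start_s, "DMYhms")
gen double end   = clock(end_s,   "DMYhms")
format start end %tc
drop start_s end_s

* Sort so the A row comes first within id, then copy its range to every row.
bysort id (event start): gen double Astart = start[1]
by id: gen double Aend = end[1]

* Drop B rows that are not fully inside the A range.
drop if event == "B" & !(inrange(start, Astart, Aend) & inrange(end, Astart, Aend))

drop Astart Aend
list, sepby(id) noobs
```

On the sample data this should leave the single A row for ids 9, 10 and 12, and the A row plus the first B row for id 11, which is the result requested in the original post. If an id can have more than one A row, the [1] subscript is no longer enough and the comparison would need to be made against each A interval.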
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Thomas Speidel
Sent: Tuesday, August 27, 2013 4:33 PM
To: [email protected]
Subject: st: Data management on multiple rows per subject
I have a data management problem. This is a sample of the data, which has
multiple rows per subject:
+------------------------------------------------------+
| id   start                end                  event |
|------------------------------------------------------|
|  9   10jan2011 00:00:00   10jan2011 21:29:59   A     |
|------------------------------------------------------|
| 10   19dec2010 00:00:00   19dec2010 19:59:59   A     |
|------------------------------------------------------|
| 11   23jan2011 08:15:00   24jan2011 18:00:00   A     |
| 11   24jan2011 10:14:59   24jan2011 13:45:00   B     |
| 11   26jan2011 06:00:00   26jan2011 07:00:00   B     |
| 11   26jan2011 07:30:00   26jan2011 18:00:00   B     |
|------------------------------------------------------|
| 12   17dec2010 02:44:59   18dec2010 01:30:00   A     |
+------------------------------------------------------+
Within id, I need to drop the B rows whose date range is not contained
in the date range of the A row.
So, in the example above, this would be the result:
+------------------------------------------------------+
| id   start                end                  event |
|------------------------------------------------------|
|  9   10jan2011 00:00:00   10jan2011 21:29:59   A     |
|------------------------------------------------------|
| 10   19dec2010 00:00:00   19dec2010 19:59:59   A     |
|------------------------------------------------------|
| 11   23jan2011 08:15:00   24jan2011 18:00:00   A     |
| 11   24jan2011 10:14:59   24jan2011 13:45:00   B     |
|------------------------------------------------------|
| 12   17dec2010 02:44:59   18dec2010 01:30:00   A     |
+------------------------------------------------------+
I know how to solve this using reshape, but the data are too complex to
handle comfortably with reshape (too many rows per subject in some
instances).
I thought of subscripting, but did not get far. Within subject, date
ranges are either fully contained in A or they are not.
Thank you.
--
Thomas Speidel
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/