Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Stephen Martin <stephen.martin@york.ac.uk> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: identifying re-operations from a list of operation codes and dates |
Date | Mon, 9 Jul 2012 15:27:27 +0100 |
Thanks Nick.

Steve

On 06/07/2012, Nick Cox <njcoxstata@gmail.com> wrote:
> There is absolutely no problem with combining -sort- and -by:-. Indeed
> that is utterly routine. But that would only be needed to do something
> else. -sort-ing by itself just requires a varlist.
>
> Here is a sandpit to play in
>
> sysuse auto, clear
> edit for rep78
> bysort foreign (rep78) : gen id = _n
> bysort foreign rep78 : gen id2 = _n
> edit for rep78 id id2
> sort rep78 foreign
> edit rep78 foreign id id2
>
> At a guess you should try
>
> bysort id op_code (op_date) : gen reop = (_n > 1) & (op_date - op_date[1]) > 1
>
> There is a tutorial at
>
> SJ-2-1  pr0004  . . . . . . . . . .  Speaking Stata: How to move step by: step
>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q1/02   SJ 2(1):86--102                               (no commands)
>         explains the use of the by varlist : construct to tackle
>         a variety of problems with group structure, ranging from
>         simple calculations for each of several groups to more
>         advanced manipulations that use the built-in _n and _N
>
> which is accessible at the Stata Journal website.
>
>
> On Fri, Jul 6, 2012 at 10:38 PM, Stephen Martin
> <stephen.martin@york.ac.uk> wrote:
>> Thanks for this Nick.
>>
>> First a couple of clarifications:
>>
>> a) yes, oper_7 corresponds to operdate_7
>> and
>> b) each oper_ variable contains a single four digit operation code.
>>
>> Having reshaped as you suggested I thought I saw where you were
>> directing me but I tried to sort on op_date by id (to get each
>> patient's operation codes in chronological order) but sort does not
>> appear to allow by (am I misunderstanding something here?).
>>
>> To illustrate the data set, here is the reshaped data for the first
>> patient. She has four operation codes and the op_date is an elapsed
>> date. The position variable can run from 1 to 24 but I have dropped
>> values 5 - 24 as these are empty for both op_code and op_date.
>>
>> id   position   op_code   op_date
>> 1    1          A123      18000
>> 1    2          C567      18000
>> 1    3          X678      18010
>> 1    4          B679      17996
>>
>> Any further guidance would be most welcome.
>>
>> Steve
>>
>>
>> On 06/07/2012, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Paradoxically, or otherwise, there is a lot of detail to absorb here,
>>> yet you may be suppressing part of the story to keep it as simple as
>>> possible. We lose either way.
>>>
>>> Assuming that e.g. oper_7 corresponds to operdate_7 then all is not
>>> lost. But I would first
>>>
>>> reshape long oper_ operdate_ , i(id)
>>>
>>> and then clean up by renaming, dropping missing, sorting on date.
>>>
>>> But although you named these variables, I fear there are others. (In
>>> which variables are the re-operation codes?)
>>>
>>> I fear that's only a start and you may need to report back.
>>>
>>> Nick
>>>
>>> On Fri, Jul 6, 2012 at 3:22 PM, Stephen Martin
>>> <stephen.martin@york.ac.uk> wrote:
>>>
>>>> I have a dataset for patients admitted to hospital. Each record
>>>> includes:
>>>>
>>>> a) a patient identifier;
>>>> b) 24 four digit operation procedure variables (oper_1 - oper_24;
>>>> these are four digit strings such as A148, C169, etc); and
>>>> c) 24 date of operation variables (operdate_1 - operdate_24).
>>>>
>>>> Many of the operation procedure code and date variables are empty.
>>>> The operation procedure code variables are not necessarily in date
>>>> order.
>>>>
>>>> I have a list (list A) of, say, 20 four digit operation codes that can
>>>> identify whether a patient received an operation for the condition in
>>>> which I am interested.
>>>>
>>>> I also have a list (list B) of, say, 35 re-operation codes.
>>>>
>>>> I would like to identify those patients who had both the operation and
>>>> the re-operation.
>>>>
>>>> However, I cannot solely use lists A and B because some of the same
>>>> codes appear in both lists, and a re-operation must occur at least one
>>>> day later than the initial operation.
>>>>
>>>> Thus I would like to identify patients who:
>>>> (a) have an operation code from list A
>>>> and
>>>> (b) have a re-operation code from list B
>>>> and
>>>> (c) where the date of the re-operation is later than the initial
>>>> operation.
>>>>
>>>> Suggestions on how to do this would be very welcome!
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
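
The steps discussed in this thread (reshape to long, clean up, then compare dates within patient) could be put together roughly as follows. This is only a sketch under stated assumptions, not a solution posted by anyone in the thread: it assumes one record per id, string codes in oper_1 - oper_24, numeric elapsed dates in operdate_1 - operdate_24, and it uses short placeholder code lists in the hypothetical locals listA and listB in place of the real lists A and B. The variable names inA, inB, first_A, reop and any_reop are likewise made up for illustration.

```stata
* Placeholder code lists (assumptions; substitute the real lists A and B)
local listA "A123 C567"
local listB "C567 X678"

* Wide to long: one row per patient and operation slot
* (assumes id uniquely identifies records in the wide data)
reshape long oper_ operdate_, i(id) j(position)
rename (oper_ operdate_) (op_code op_date)
drop if missing(op_code) | missing(op_date)

* Flag rows whose code appears in list A or in list B
gen byte inA = 0
foreach c of local listA {
    replace inA = 1 if op_code == "`c'"
}
gen byte inB = 0
foreach c of local listB {
    replace inB = 1 if op_code == "`c'"
}

* Earliest list A date for each patient (missing if the patient has none)
egen first_A = min(cond(inA, op_date, .)), by(id)

* A list B row counts as a re-operation only if it falls at least one day
* after the first list A operation; the !missing() guard is needed because
* Stata treats missing as larger than any number
gen byte reop = inB & !missing(first_A) & (op_date - first_A >= 1)

* Patient-level indicator: 1 if the patient has any qualifying re-operation
egen byte any_reop = max(reop), by(id)
```

With these placeholder lists and the example patient shown earlier in the thread, first_A would be 18000, the C567 row on 18000 would not count (same day), and the X678 row on 18010 would be flagged. The sketch compares each list B row with the earliest list A date only; whether that is the right definition of the "initial operation" is a judgment for whoever knows the data.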