Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: identifying re-operations from a list of operation codes and dates
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: identifying re-operations from a list of operation codes and dates
Date: Fri, 6 Jul 2012 22:52:30 +0100
There is absolutely no problem with combining -sort- and -by:-. Indeed
that is utterly routine. But that would only be needed to do something
else. -sort-ing by itself just requires a varlist.
Here is a sandpit to play in
sysuse auto, clear
edit foreign rep78
bysort foreign (rep78) : gen id = _n
bysort foreign rep78 : gen id2 = _n
edit foreign rep78 id id2
sort rep78 foreign
edit rep78 foreign id id2
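The same pattern carries over to the reshaped operation data. What follows is a minimal sketch, assuming the variable names id, position, op_code and op_date from the listing quoted below and typing in only the four example rows for patient 1. The variable seq is a name made up here for illustration, and position is used only to break ties between operations on the same date.

```stata
clear
input id position str4 op_code op_date
1 1 A123 18000
1 2 C567 18000
1 3 X678 18010
1 4 B679 17996
end

* plain -sort- just takes a varlist: id first, then date within id
sort id op_date

* -bysort- sorts on everything listed, but defines groups by id only;
* variables in parentheses control the order within each group
bysort id (op_date position) : gen seq = _n

list id position op_code op_date seq, sepby(id)
```

After this, seq numbers each patient's operations in date order, so B679 (the earliest date) gets seq 1 and X678 (the latest) gets seq 4.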
At a guess you should try
bysort id op_code (op_date) : gen reop = (_n > 1) & (op_date - op_date[1]) > 1
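To see what that command does, here is a small check on made-up rows. These are not Stephen's data: patient 2 and the repeated codes are invented so that some id and op_code groups hold more than one row. Within each group, sorted by date, op_date[1] is the earliest date for that code. Note that > 1 asks for a gap of more than one day; if "at least one day later" is what is wanted, >= 1 may be the comparison to use, and that variant is shown as reop2 (a name chosen here for illustration).

```stata
clear
input id str4 op_code op_date
1 A123 18000
1 A123 18001
1 A123 18010
1 C567 18000
2 A123 17990
2 A123 17990
end

* the command as suggested: flag later rows more than one day after the first
bysort id op_code (op_date) : gen reop = (_n > 1) & (op_date - op_date[1]) > 1

* variant: flag later rows at least one day after the first
bysort id op_code (op_date) : gen reop2 = (_n > 1) & (op_date - op_date[1]) >= 1

list, sepby(id op_code)
```

On these rows the operation at 18001 is flagged by reop2 but not by reop, the one at 18010 is flagged by both, and the two same-day rows for patient 2 are flagged by neither. Because the groups are defined by id and op_code, this only catches repeats of the same code; matching against separate lists of operation and re-operation codes would need a further step.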
There is a tutorial:

SJ-2-1  pr0004  Speaking Stata: How to move step by: step
        N. J. Cox
        Q1/02   SJ 2(1):86--102   (no commands)
        explains the use of the by varlist : construct to tackle
        a variety of problems with group structure, ranging from
        simple calculations for each of several groups to more
        advanced manipulations that use the built-in _n and _N

It is accessible at the Stata Journal website.
On Fri, Jul 6, 2012 at 10:38 PM, Stephen Martin
<[email protected]> wrote:
> Thanks for this Nick.
>
> First a couple of clarifications:
>
> a) yes, oper_7 corresponds to operdate_7
> and
> b) each oper_ variable contains a single four digit operation code.
>
> Having reshaped as you suggested I thought I saw where you were
> directing me but I tried to sort on op_date by id (to get each
> patient's operation codes in chronological order) but sort does not
> appear to allow by (am I misunderstanding something here?).
>
> To illustrate the data set, here is the reshaped data for the first
> patient. She has four operation codes and the op_date is an elapsed
> date. The position variable can run from 1 to 24 but I have dropped
> values 5 - 24 as these are empty for both op_code and op_date.
>
> id position op_code op_date
> 1 1 A123 18000
> 1 2 C567 18000
> 1 3 X678 18010
> 1 4 B679 17996
>
> Any further guidance would be most welcome.
>
> Steve
>
>
> On 06/07/2012, Nick Cox <[email protected]> wrote:
>> Paradoxically, or otherwise, there is a lot of detail to absorb here,
>> yet you may be suppressing part of the story to keep it as simple as
>> possible. We lose either way.
>>
>> Assuming that e.g. oper_7 corresponds to operdate_7 then all is not
>> lost. But I would first
>>
>> reshape long oper_ operdate_ , i(id)
>>
>> and then clean up by renaming, dropping missing, sorting on date.
>>
>> But although you named these variables, I fear there are others. (In
>> which variables are the re-operation codes?)
>>
>> I fear that's only a start and you may need to report back.
>>
>> Nick
>>
>> On Fri, Jul 6, 2012 at 3:22 PM, Stephen Martin
>> <[email protected]> wrote:
>>
>>> I have a dataset for patients admitted to hospital. Each record
>>> includes:
>>>
>>> a) a patient identifier;
>>> b) 24 four digit operation procedure variables (oper_1 - oper_24;
>>> these are four digit strings such as A148, C169, etc); and
>>> c) 24 date of operation variables (operdate_1 - operdate_24).
>>>
>>> Many of the operation procedure code and date variables are empty.
>>> The operation procedure code variables are not necessarily in date
>>> order.
>>>
>>> I have a list (list A) of, say, 20 four digit operation codes that can
>>> identify whether a patient received an operation for the condition in
>>> which I am interested.
>>>
>>> I also have a list (list B) of, say, 35 re-operation codes.
>>>
>>> I would like to identify those patients who had both the operation and
>>> the re-operation.
>>>
>>> However, I cannot solely use lists A and B because some of the same
>>> codes appear in both lists, and a re-operation must occur at least one
>>> day later than the initial operation.
>>>
>>> Thus I would like to identify patients who:
>>> (a) have an operation code from list A
>>> and
>>> (b) have a re-operation code from list B
>>> and
>>> (c) where the date of the re-operation is later than the initial
>>> operation.
>>>
>>> Suggestions on how to do this would be very welcome!