Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: Request code simplification
From: Rebecca Pope <[email protected]>
To: [email protected]
Subject: Re: st: Re: Request code simplification
Date: Wed, 12 Jun 2013 09:26:38 -0500

Mike,
First things first, a point of order: -carryforward- is a user-written
command and you are asked to please identify these when you use them
and note from where you have obtained them.
Second, I tested your (slightly modified) code on my computer using a
test set built off of what you posted with a total of 270,000
observations (you just said "thousands", so I made it of moderate
size). Total run time was 0.79 seconds. So, if you are having computer
problems, my suggestion would be to check what else is running on your
computer.
Finally, this may be a stupid question, but if you are trying to find
patients with transplants, why aren't you searching for type_of_visit
== (code for transplant)? Does the other stuff you posted serve some
greater purpose you didn't mention?
There are multiple ways of handling the code for searching for
transplants. The easiest, I think, would simply be:
*** begin example ***
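* -1 on transplant visits, 0 otherwise
gen transplant = -1*strmatch(type_of_visit,"*transplant")
* -1 sorts first, so record 1 per patient flags any transplant (1/0)
bys pat_id (transplant): gen countthis = (-1*transplant) if _n==1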
tab countthis
*** end example ***
The code above ran in 0.22 seconds. Timing is from a machine with 4 GB
of RAM and an Intel Core i5-2400 3.1 GHz processor. Performance on
your laptop will likely differ, but I see no reason why either
approach would cause a crash.
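
As a minimal sketch of one of the other ways, and of how a run time
like the ones above can be measured, the following uses -egen- and
-timer-. The toy data and the names is_tx, any_tx, and one_per_pat are
made up for illustration; pat_id and type_of_visit are assumed to be
named as in the example above.

```stata
* Toy data (hypothetical), laid out like the posted example
clear
input str3 pat_id str10 visit_date str25 type_of_visit
"xxx" "09/01/2003" "first_clinic_visit"
"xxx" "09/15/2003" ""
"xxx" "02/04/2004" "liver_transplant"
"yyy" "01/01/2004" "first_clinic_visit"
"yyy" "01/03/2008" "intestine_transplant"
"zzz" "05/01/2010" "first_clinic_visit"
"zzz" "05/03/2011" ""
end

timer clear 1
timer on 1
* 1 on transplant visits, 0 otherwise
gen byte is_tx = strmatch(type_of_visit, "*transplant")
* 1 on every record of a patient with at least one transplant visit
egen byte any_tx = max(is_tx), by(pat_id)
* tag one record per patient so each patient is counted once
egen byte one_per_pat = tag(pat_id)
tab any_tx if one_per_pat
timer off 1
* report elapsed time for timer 1
timer list 1
```

On this toy data the table should show two patients with any_tx equal
to 1 (xxx and yyy) and one with 0 (zzz). Unlike the sort-based version,
this leaves the sort order of the data untouched.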
Regards,
Rebecca
On Tue, Jun 11, 2013 at 4:14 PM, Michael Stewart
<[email protected]> wrote:
> Hello,
>
> I am working on a data set with thousands of patients (and multiple
> records per patient) and I am trying to identify patients with any
> transplant (could undergo liver, kidney, intestine, etc.). I was
> wondering if we could simplify my code, as my laptop is freezing with
> the following set of commands.
>
>
> Here the goal is to find patients who have
> type_of_visit[1]=="first_clinic_visit" & any other type_of_visit is
> "" (after the records are sorted by pat_id and visit_date).
>
> My code :
>
> bysort PAT_ID (visit_date): carryforward VISIT, gen(v)
> bysort PAT_ID (visit_date): gen x = v[1]==v[_n]
> bysort PAT_ID (visit_date): egen z = sum(x)
> bysort PAT_ID (visit_date): egen zz = count(x)
> gen y = z < zz
>
>
> The dataset format is as follows.
>
> pat-id visit_date type_of_visit
> -------------------------------------------------------
> xxx 09/01/2003 first_clinic_visit
> xxx 09/15/2003
> .
> .
> .
> .
> XXX 12/12/2003
> XXX 2/04/2004 liver_transplant
> yyy 01/01/2004 first_clinic_visit
> yyy 02/02/2005
> .
> .
> yyy 01/03/2008 intestine_transplant
> zzz 05/01/2010 first_clinic_visit
> zzz 05/03/2011
> .
> .
> .
> .
> ------------------------------------------------------------
>
> As always, thanks a lot for your time.
> Thank you,
> Yours Sincerely,
> Mike.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/