[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: RE: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1)

From	"Ariel Linden" <[email protected]>
To	<[email protected]>
Subject	st: RE: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1)
Date	Wed, 21 Oct 2009 11:19:18 -0700

Thank you Nick! As always your advice is sound. I found your example and the
faq to be extremely helpful.

I ran through the logic and realized that I would probably stop after the
"sort id event" since at that point I can see when the first fill was
completed. The only other thing I would modify here is to generate a new
variable in which I would move up the fill date to the row of the last
visit.

Thanks again!

Ariel

Date: Tue, 20 Oct 2009 21:55:43 +0100
From: "Nick Cox" <[email protected]>
Subject: st: RE: RE: merging values of one variable to to the nearest value
in another dataset (stata v10.1)

A better way to fill in next visits is to reverse time. See much more
detail at http://www.stata.com/support/faqs/data/missing.html

Nick 
[email protected] 


- -----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: 20 October 2009 21:35
To: [email protected]
Subject: st: RE: merging values of one variable to to the nearest value
in another dataset (stata v10.1)

There are various answers to this question. One is to think in terms of
- -append- rather than -merge-. 

Consider a strategy starting with appended files: 

. l

     +----------------------------+
     | id   dateofv~t   dateoff~l |
     |----------------------------|
  1. |  1    1/1/2009             |
  2. |  1   1/31/2009             |
  3. |  1   2/18/2009             |
  4. |  1   3/14/2009             |
  5. |  1   5/30/2009             |
     |----------------------------|
  6. |  1   7/12/2009             |
  7. |  2   3/10/2009             |
  8. |  2   4/25/2009             |
  9. |  2   5/20/2009             |
 10. |  3   2/09/2009             |
     |----------------------------|
 11. |  3   6/08/2009             |
 12. |  3   7/10/2009             |
 13. |  1               2/22/2009 |
 14. |  2               4/15/2009 |
 15. |  3               7/15/2009 |
     +----------------------------+

We first convert our date variables to numeric dates: 

. gen visit = date(dateofvisit, "MDY")
(3 missing values generated)

. gen fill = date(dateoffill, "MDY")
(12 missing values generated)

Then combine our dates rowwise: 

. gen event = max(visit, fill)

- -- and get data on next and previous visits 

. sort id event

. gen lastvisit = visit
(3 missing values generated)

. by id : replace lastvisit = lastvisit[_n-1] if missing(lastvisit)
(3 real changes made)

. gen nextvisit = visit
(3 missing values generated)

. by id : replace nextvisit = nextvisit[_n+1] if missing(nextvisit)
(2 real changes made)

- -- and so on and so forth. So for each visit you can find the nearest
fill, and vice versa -- and keep track of the separation involved. 

Nick 
[email protected] 

Ariel Linden

I have a dataset in long format which has dates when an individual
visited a
doctor. I have another dataset with dates when that patient started
taking a
medication. Naturally, the dates will not align exactly between the two
data
sets. 

Is there a way of merging the files so that the date when the patient
started taking the medication is aligned (put on the same line) to the
closest date (but after) of a given doctor visit?

Example

File 1 - office visits

ID	date of visit
1	1/1/2009
1	1/31/2009
1	2/18/2009
1	3/14/2009
1	5/30/2009
1	7/12/2009
2	3/10/2009
2	4/25/2009
2	5/20/2009
3	2/09/2009
3	6/08/2009
3	7/10/2009

File 2 - medication fill date

ID 	date of first fill
1	2/22/2009
2	4/15/2009
3	7/15/2009

If I would like the medication date to be added to the office visit file
to
be matched to the last closest office visit, then ID 1 medication date
should align with 1/31/2009, ID 2 should align with 3/10/2009 and ID 3
should align with 7/10/2009. 


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- st: Saving or generate variables for regression coefficients
  - From: Ferdinand Siagian <[email protected]>
- st: Saving or generate variables for regression coefficients
  - From: Ferdinand Siagian <[email protected]>

Prev by Date: Re: st: RE: The not so humble pie
Next by Date: st: Saving or generate variables for regression coefficients
Previous by thread: Re: st: RE: creating expanded records with indicator variables, where different ID's reside in the same rows
Next by thread: st: Saving or generate variables for regression coefficients
Index(es):
- Date
- Thread