Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Merge issues - m:m not returning all matches
From: Aaron Legler <[email protected]>
To: [email protected]
Subject: Re: st: RE: Merge issues - m:m not returning all matches
Date: Fri, 20 Jan 2012 10:46:11 -0500
Joinby works great - thanks Nick.
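
For anyone reading this in the archive, here is a minimal sketch of the difference, not the actual data. It assumes a toy version of the two files: the IDs and the variable names patiennum, geoid, svc_date and hosp come from the original post, while km_to_chc (a guess at the abbreviated km_to_~c) and the distance values are made up for illustration. Within a geoid, -merge m:m- pairs observations by position and repeats the last observation of the shorter group, whereas -joinby- forms every pairwise combination.

```stata
* Toy "using" data: one census tract with 16 locations.
* km_to_chc is a hypothetical name; the distances are placeholders.
clear
set obs 16
gen double geoid = 25009205500
gen hosp = _n
gen km_to_chc = 10 + _n
tempfile dist
save `dist'

* Toy "master" data: one patient with two service dates in that tract.
clear
input long patiennum double geoid str9 svc_date
12345 25009205500 "01 Aug 09"
12345 25009205500 "05 Sep 10"
end

* merge m:m pairs the 1st master record with the 1st using record,
* then reuses the 2nd (last) master record for the remaining 15.
preserve
merge m:m geoid using `dist'
count                       // 16 observations
restore

* joinby forms all pairwise combinations within geoid.
joinby geoid using `dist'
count                       // 2 x 16 = 32 observations
```

By default -joinby- drops observations that have no match in the other dataset; the unmatched() option changes that.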
Aaron
On Fri, Jan 20, 2012 at 10:40 AM, Nick Cox <[email protected]> wrote:
> Also, your problem sounds more like one for -joinby-.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: Nick Cox
> Sent: 20 January 2012 15:36
> To: '[email protected]'
> Subject: RE: Merge issues - m:m not returning all matches
>
> On m:m merges: see the thread last week starting with
>
> http://www.stata.com/statalist/archive/2012-01/msg00370.html
>
> However, please ignore my post in that thread: it missed the point, which is well explained by others.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Aaron Legler
> Sent: 20 January 2012 15:25
> To: [email protected]
> Subject: st: Merge issues - m:m not returning all matches
>
> I am having an issue with merge -
>
> I have one dataset with patient_id and censustract, and another file with
> censustract and distance to 16 locations
>
> When I perform the merge I am not getting all the possible matches:
>
> This is the original patient with 2 records
>
> patiennum geoid svc_date
> 12345 25009205500 01 Aug 09
> 12345 25009205500 05 Sep 10
>
> after the merge: merge m:m geoid using chc.censustract.dist.dta
>
> I should get 32 records (2 patient records x 16 locations) but I'm only
> getting 16:
>
> patien~m        geoid   svc_date  km_to_~c  hosp       _merge
>    12345  25009205500  01 Aug 09    13.701     2  matched (3)
>    12345  25009205500  05 Sep 10    15.144     1  matched (3)
>    12345  25009205500  05 Sep 10    15.144     5  matched (3)
>    12345  25009205500  05 Sep 10    15.144    13  matched (3)
>    12345  25009205500  05 Sep 10    15.144    14  matched (3)
>    12345  25009205500  05 Sep 10    19.156    12  matched (3)
>    12345  25009205500  05 Sep 10    19.156    16  matched (3)
>    12345  25009205500  05 Sep 10    20.407     3  matched (3)
>    12345  25009205500  05 Sep 10    20.407     4  matched (3)
>    12345  25009205500  05 Sep 10    20.407     6  matched (3)
>    12345  25009205500  05 Sep 10    20.407     8  matched (3)
>    12345  25009205500  05 Sep 10    20.407    11  matched (3)
>    12345  25009205500  05 Sep 10    20.407    15  matched (3)
>    12345  25009205500  05 Sep 10    25.031     9  matched (3)
>    12345  25009205500  05 Sep 10    25.038     7  matched (3)
>    12345  25009205500  05 Sep 10    25.583    10  matched (3)
>
> It seems like the system isn't recognizing the differences in svc_date and
> just running 1 match.
>
> I checked to ensure the geoids are the same:
>
> . tab geoid
>       geoid |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>    2.50e+10 |         16      100.00      100.00
> ------------+-----------------------------------
>       Total |         16      100.00
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/