Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Merge issues - m:m not returning all matches
From: Aaron Legler <[email protected]>
To: [email protected]
Subject: Re: st: Merge issues - m:m not returning all matches
Date: Fri, 20 Jan 2012 10:49:03 -0500

Scott,
Thanks. Nick pointed me to -joinby- and it worked.
I had always assumed that -merge m:m- would also form all pairwise
combinations; that was my own misreading of the command.
Aaron
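
A minimal sketch of the difference, for anyone who finds this thread later. The data and the names left, right, a and b are made up for illustration and are not from my files. Run it as a do-file in one go, since it relies on temporary files:

```stata
* Illustrative data only: 2 records on one side, 3 on the other, same id
clear
input id str1 a
1 "x"
1 "y"
end
tempfile left
save `left'

clear
input id b
1 10
1 20
1 30
end
tempfile right
save `right'

* merge m:m matches records by position within id and repeats the last
* record of the shorter side: (x,10) (y,20) (y,30), so 3 records
use `left', clear
merge m:m id using `right'
list id a b, sepby(id)

* joinby forms every pairing within id: 2 x 3 = 6 records
use `left', clear
joinby id using `right'
sort a b
list id a b, sepby(id)
```

This is the same pattern as in my output: the first record was matched once and the last record was repeated for every remaining row of the other file.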
On Fri, Jan 20, 2012 at 10:41 AM, Scott Merryman
<[email protected]> wrote:
> For example:
>
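> * master file: 2 records for one id
> clear*
> set obs 2
> gen id = 2500
> gen patiennum = 10
> gen date = _n
> save id,replace
> * using file: 16 records for the same id
> clear
> set obs 16
> gen id = 2500
> gen dist = runiform()
> save tract,replace
>
> * merge m:m pairs records by position within id: 16 records
> use id
> merge m:m id using tract
> count
> * joinby forms all pairwise combinations within id: 32 records
> use id,clear
> joinby id using tract
> count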
>
> Scott
>
>
>
> On Fri, Jan 20, 2012 at 9:24 AM, Aaron Legler <[email protected]> wrote:
>> I am having an issue with merge -
>>
>> I have one dataset with patient_id and censustract, and another file with
>> censustract and distance to 16 locations
>>
>> When I perform the merge I am not getting all the possible matches:
>>
>> This is the original patient with 2 records
>>
>> patiennum geoid svc_date
>> 12345 25009205500 01 Aug 09
>> 12345 25009205500 05 Sep 10
>>
>> after the merge: merge m:m geoid using chc.censustract.dist.dta
>>
>> I should get 32 records (2 patient records x 16 locations) but I'm only
>> getting 16:
>>
>> patien~m        geoid   svc_date  km_to_~c  hosp       _merge
>>    12345  25009205500  01 Aug 09    13.701     2  matched (3)
>>    12345  25009205500  05 Sep 10    15.144     1  matched (3)
>>    12345  25009205500  05 Sep 10    15.144     5  matched (3)
>>    12345  25009205500  05 Sep 10    15.144    13  matched (3)
>>    12345  25009205500  05 Sep 10    15.144    14  matched (3)
>>    12345  25009205500  05 Sep 10    19.156    12  matched (3)
>>    12345  25009205500  05 Sep 10    19.156    16  matched (3)
>>    12345  25009205500  05 Sep 10    20.407     3  matched (3)
>>    12345  25009205500  05 Sep 10    20.407     4  matched (3)
>>    12345  25009205500  05 Sep 10    20.407     6  matched (3)
>>    12345  25009205500  05 Sep 10    20.407     8  matched (3)
>>    12345  25009205500  05 Sep 10    20.407    11  matched (3)
>>    12345  25009205500  05 Sep 10    20.407    15  matched (3)
>>    12345  25009205500  05 Sep 10    25.031     9  matched (3)
>>    12345  25009205500  05 Sep 10    25.038     7  matched (3)
>>    12345  25009205500  05 Sep 10    25.583    10  matched (3)
>>
>> It seems that Stata isn't recognizing the differences in svc_date and
>> is performing just one match.
>>
>> I checked to ensure the geoids are the same:
>>
>> . tab geoid
>>
>>       geoid |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>    2.50e+10 |         16      100.00      100.00
>> ------------+-----------------------------------
>>       Total |         16      100.00
>>
>> Any suggestions would be much appreciated. Thanks.
>>
>> Aaron Legler
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>