Dear Scott,
Thanks for the help. Your idea is great. I just need to adapt it, as
my data also has identifiers with 2 or more digits (e.g. 11, 11A,
11B), in which case keeping only the first character is not enough. But I
know how to do this.
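A minimal sketch of one way to handle the multi-digit case, assuming every
identifier is a run of digits followed by an optional letter suffix. The
variable name id_num and the sample values are only illustrative, and
regexm()/regexs() require a Stata version that has the regular-expression
functions:

```stata
* Sketch only: assumes each id is digits followed by an optional letter suffix.
clear
input str3 id
11
11A
11B
2
4A
end
* Keep the whole leading run of digits instead of just the first character.
gen str3 id_num = regexs(1) if regexm(id, "^([0-9]+)")
list
```

Here 11, 11A and 11B should all end up with the same id_num, so they fall into
one group when merging on id_num.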
Thanks again,
Radu
2006/6/29, Scott Merryman <[email protected]>:
I believe the example below, which merges on only the first character of the id
variable, works without two successive merges.
Scott
clear
tempfile tmp1
input str2 id
1A
1B
2
3
4A
4B
5A
5C
5B
6A
6B
7
end
sort id
gen id_num = substr(id, 1,1)
* stable keeps the id order within each id_num group; a plain sort may reorder ties
sort id_num, stable
save `tmp1'
clear
input str2 id2
1
2
3
4B
4A
5C
5A
5B
6
7A
7B
end
sort id2
gen id_num = substr(id2, 1,1)
* stable keeps the id2 order within each id_num group; a plain sort may reorder ties
sort id_num, stable
* Match-merge on the shared first character, then drop the helper variables
merge id_num using `tmp1'
drop _merge id_num
sort id2 id
list
> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Radu Ban
> Sent: Thursday, June 29, 2006 4:22 PM
> To: [email protected]
> Subject: st: merge datasets using "closest" match
>
> dear listers,
>
> i have two datasets and i want to match them on a key variable. the
> problem is that the key variable differs slightly between the two
> datasets. i'll explain what this means.
>
> in dataset 1 the key may look like this
> 1
> 2
> 3
> 4A
> 4B
> 5A
> 5B
> 5C
> 6
> ...
>
> in dataset 2 the key may look like this
> 1A
> 1B
> 2
> 3
> 4A
> 4B
> 5A
> 5B
> 5C
> 6A
> 6B
> ...
>
> the reason for these discrepancies is that the unit of observation
> is a plot (of land) and some plots have split (for example 1 has split
> into 1A and 1B, 5 has split into 5A and 5B, etc) between the two
> periods of time. i want to merge the two datasets keeping in mind
> these potential splits, so that 1A and 1B are both matched to 1.
>
> i figured out a long way to do this: generating a "de-lettered" identifier
> in dataset two, then doing two successive merges. something like:
>
> merge key using dataset1
> drop if _m == 2
> drop _m
>
> rename key letteredkey
> rename deletteredkey key
> sort key
> merge key using dataset1, update
> drop if _m == 2
>
> is there a shorter, perhaps more clever way to do this? i found a
> user-written ado -nearmrg-, which does exactly what i want but only
> for numeric keys.
>
> thanks a lot for this,
> radu ban
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/