Sure Nick,
Here is my code:
cd
u "$path\EEEFS II\Fichiers CSB\fs"
destring ident codefs_, replace
mer ident codefs_ using community
ta _m
drop if _m==2
drop _m
cou
so ident codefs_
sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
And I get the results below (whether or not I destring, tostring,
use a stable sort, etc.).
First attempt:
. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE
. u "$path\EEEFS II\Fichiers CSB\fs"
. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace
. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)
. ta _m
_merge | Freq. Percent Cum.
------------+-----------------------------------
1 | 173 12.12 12.12
2 | 1,152 80.73 92.85
3 | 102 7.15 100.00
------------+-----------------------------------
Total | 1,427 100.00
. drop if _m==2
(1152 observations deleted)
. drop _m
. cou
275
. so ident codefs_
. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved
Second attempt:
. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE
. u "$path\EEEFS II\Fichiers CSB\fs"
. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace
. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)
. ta _m
_merge | Freq. Percent Cum.
------------+-----------------------------------
1 | 176 12.31 12.31
2 | 1,155 80.77 93.08
3 | 99 6.92 100.00
------------+-----------------------------------
Total | 1,430 100.00
. drop if _m==2
(1155 observations deleted)
. drop _m
. cou
275
. so ident codefs_
. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved
Third attempt:
. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE
. u "$path\EEEFS II\Fichiers CSB\fs"
. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace
. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)
. ta _m
_merge | Freq. Percent Cum.
------------+-----------------------------------
1 | 181 12.61 12.61
2 | 1,160 80.84 93.45
3 | 94 6.55 100.00
------------+-----------------------------------
Total | 1,435 100.00
. drop if _m==2
(1160 observations deleted)
. drop _m
. cou
275
. so ident codefs_
. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved
etc...
Interactively:
. use "C:\Documents and Settings\My Documents\Archives\data\M
> adagascar HFS\EEEFS II\Fichiers COMMUNAUTAIRE\community.dta", clear
. so ident codefs_
. mer ident codefs_ using "C:\Documents and Settings\My Docum
> ents\Archives\data\Madagascar HFS\EEEFS II\Fichiers CSB\fs.dta"
(note: case_id is long in using data but will be str9 now)
(label yn already defined)
. ta _m
_merge | Freq. Percent Cum.
------------+-----------------------------------
1 | 1,160 80.84 80.84
2 | 181 12.61 93.45
3 | 94 6.55 100.00
------------+-----------------------------------
Total | 1,435 100.00
(This is stable, both ways. I've made minor changes to my data, so I now
have 94 matches instead of 96.)
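One difference between the two logs may be worth checking, offered as a sketch rather than a diagnosis. The interactive session sorts on ident and codefs_ before merging, while the do-file merges straight after loading fs, and neither log shows whether the two key variables uniquely identify observations in each file. The commands below use only the file and variable names already shown above; the assumption that community.dta was saved sorted on the same keys is not confirmed anywhere in the logs.

```stata
* Sketch only: file and variable names are copied from the do-file above.
* Assumption: community.dta is already saved sorted by ident codefs_.

* 1. Do the key variables uniquely identify observations in each file?
use community, clear
duplicates report ident codefs_

use "$path\EEEFS II\Fichiers CSB\fs", clear
duplicates report ident codefs_

* 2. Sort the master data on the keys before merging, as in the
*    interactive session, then compare the _merge table across runs.
sort ident codefs_
merge ident codefs_ using community
tabulate _merge
```

If duplicates report shows surplus copies of a key combination in either file, the pairing of observations within those groups depends on their order, which could be one reason the counts move between runs.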
For now I am working interactively, but any help would
be most welcome.
Best regards.
Amadou.
-------------------------------------------------------------------------------------
Nick wrote:
Without ruling out the possibility that a -merge-
expert can give you useful advice, this still
looks like a guessing game in which guessing is no
fun.
You give us lots of details, but still nothing
concrete about your datasets or your .do file.
A small version in which your problem is evident
is the ideal here.
Naturally, I realise that you are inhibited by
the Statalist rule of not sending attachments,
but there are alternatives:
0. Include a listing of your .do file.
1. Contact tech support at StataCorp.
2. Put the files on a website so that anyone
interested can download.
3. Offer to send the files to volunteer testers
(not me).
Nick
> I am trying to merge 2 datasets.
> But every time, I get different results
> (_m==3 has 83 observations the
> first time, 97 the second, 100 the
> third and 96 the fourth, and so on).
> I tried setting the seed and making my sort stable,
> with no success. I also tried recasting my merge
> identifier to double. No success. I tried to
> tostring it. No success either.
> Any hints why I obtain these various results?
> I verified in both Stata and Excel.
> I do not understand why Stata assigned _m==3 to some
> observations that belong to both datasets on the
> first run but not on the second.
> Best regards.
> Amadou.
>
> PS: When I work interactively, I do not have that problem.
> I have 96 observations that matched. So what am I doing
> wrong in my Stata do-file?
-------------------------------------------------------------------------------