My master dataset is a household survey with about 245 variables,
and I have to add population projections according to agegp, dpto and
e1. All observations with the same agegp, dpto and e1 have the same
projection. For example, I have four observations where agegp, dpto
and e1 are all equal to 1; they differ in other variables, but all of
them have the same projection.
Say I have 10 men. Half are 50 or older and the other half are
younger. One quarter are from dpto 1, two quarters from dpto 2 and one
quarter from dpto 3. One has 2 children, two have 3, five have 10
children, etc. But the population projection is the same for all men
aged 50+ from dpto 1.
dpto, agegp and e1 form a unique set of key variables in the using dataset.
I found a mistake in my using dataset, a.dta. It should be:
agegp dpto e1 proyeccion
0 1 1 9261
1 1 1 36894
5 1 1 47986
1 1 2 8863
5 1 2 35504
0 2 2 46194
1 2 1 47042
5 2 1 10401
1 2 2 1543
5 2 2 519
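
The uniqueness claim can be checked directly in Stata. The following is a minimal sketch, not the poster's own code: it types in the corrected a.dta values listed above and runs isid, which stops with an error if agegp, dpto and e1 do not jointly identify the observations. The file name a_corrected is hypothetical, chosen here so that no existing file is overwritten.

```stata
* Enter the corrected lookup table shown above (one row per agegp/dpto/e1 cell)
clear
input agegp dpto e1 proyeccion
0 1 1  9261
1 1 1 36894
5 1 1 47986
1 1 2  8863
5 1 2 35504
0 2 2 46194
1 2 1 47042
5 2 1 10401
1 2 2  1543
5 2 2   519
end

* isid exits with an error unless the key variables uniquely identify each row
isid agegp dpto e1

* Hypothetical file name, used so the original a.dta is left untouched
save a_corrected, replace
```

If isid returns silently, the three variables are a valid key for the using side of a many-to-one merge.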
b.dta:
agegp dpto e1 vaca0 vaca1
0 1 1 9261 6174
1 1 1 36894 24596
5 1 1 47986 31990
0 1 1 48976 32650
1 1 2 8863 5908
5 1 2 35504 23669
0 2 2 2133 455
1 2 1 2212 48971
5 2 1 108 1170
0 2 2 20 4304
1 2 2 238 566
5 2 2 2 72
0 1 1 9261 61
1 1 1 36894 245
5 1 1 47986 31
0 1 1 48976 3
1 1 2 8863 590
5 1 2 35504 236
0 2 2 213 455
1 2 1 221 48
5 2 1 108 11
0 2 2 2074 43
1 2 2 238 5
5 2 2 269 72
So when I merge, I get:
agegp dpto e1 vaca1 vaca2 proyeccion
0 1 1 48976 32650 9261
0 1 1 9261 61 9261
0 1 1 48976 3 9261
0 1 1 9261 6174 9261
0 2 2 20 4304 46194
0 2 2 213 455 46194
0 2 2 2133 455 46194
0 2 2 2074 43 46194
1 1 1 36894 245 36894
1 1 1 36894 24596 36894
1 1 2 8863 5908 8863
1 1 2 8863 590 8863
1 2 1 221 48 47042
1 2 1 2212 48971 47042
1 2 2 238 5 1543
1 2 2 238 566 1543
5 1 1 47986 31990 47986
5 1 1 47986 31 47986
5 1 2 35504 23669 35504
5 1 2 35504 236 35504
5 2 1 108 1170 10401
5 2 1 108 11 10401
5 2 2 269 72 519
5 2 2 2 72 519
So it is a many-to-one merge.
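
A many-to-one merge of this kind could be sketched as follows. This assumes that b.dta is the master (survey) file and a.dta is the corrected using file, that both sit in the working directory, and that Stata 11 or later is available, since the merge m:1 syntax does not exist in earlier releases.

```stata
* Assumption: b.dta is the master file, a.dta the using file with one row per key
use b, clear

* m:1 = many master observations may match one using observation;
* Stata stops with an error if agegp dpto e1 is not unique in a.dta
merge m:1 agegp dpto e1 using a

* _merge: 1 = master only, 2 = using only, 3 = matched
tabulate _merge
```

Any value of _merge other than 3 would point to a survey cell without a projection, or a projection with no survey observations.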
Bye,
Sebastian.
2007/7/17, Michael Blasnik <[email protected]>:
...
I don't understand how this merge is supposed to work -- it looks like a
"many-to-many" merge because there is no unique set of key variables in either
dataset. I thought the proyeccion was supposed to be some population info, but
then why does it have two entries for the values 0 2 2 for agegp dpto e1?
You need to come up with a way for Stata to determine which observations go
together or you will end up with multiple matches.
Michael Blasnik
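
One way to find out whether a set of key variables really is unique, as the reply asks, is to look for duplicate keys before merging. A minimal sketch, assuming the lookup file is a.dta in the working directory; the variable name dup is made up for illustration:

```stata
use a, clear

* Count how many times each agegp/dpto/e1 combination occurs
duplicates report agegp dpto e1

* dup = number of extra copies of each key (0 means the key is unique)
duplicates tag agegp dpto e1, generate(dup)

* Show the offending rows, grouped by key
sort agegp dpto e1
list agegp dpto e1 proyeccion if dup > 0, sepby(agegp dpto e1)
```

If any rows are listed, the lookup file has to be corrected or reduced to one row per key before a many-to-one merge can work.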