From | Amanda Fu <mandy.fu1@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | st:how to keep the overlapping variables as many as possible when combining data sets |
Date | Thu, 4 Nov 2010 02:42:52 -0400 |
Dear Statalisters,

I am trying to use "merge" to combine the following two data sets.

Data set 1 (master data set):
----------------------------
id v1 v2
1 12 b
2 45 5
3
4 111 r
5 144 c
----------------------------

Data set 2 (using data set), in which v1 and v2 overlap with data set 1:
----------------------------
id v1 v2 v3
1 12 m 8
2 45 8
3 78 0 8
4 111 2 8
5 3
6 8
7
----------------------------

The following data set is what I want to see after the combination. That is, when a variable appears in both data sets, the value in the master data set is used if it is available. If the value in the master data set is missing, the value in the using data set is used instead, if it is available.

Goal data set:
----------------------------
id v1 v2 v3
1 12 b 8
2 45 5 8
3 78 0 8
4 111 r 8
5 144 c
6 8
7
----------------------------

But when I use

    merge id using "2"

the overlapping variables keep only the values from the master data set, as follows:

Merged data set:
----------------------------
id v1 v2 v3
1 12 b 8
2 45 5 8
3 8
4 111 r 8
5 144 c
6 8
7
----------------------------

May I know how I can get the goal data set using merge?

Thanks for your time!

Best wishes,
Amanda Fu
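
For reference, the rule described above (keep the master value when it is present, otherwise take the using value) appears to match what the "update" option of merge is documented to do. The following is only a sketch under assumptions, not a tested solution for these data: the master file name "1" is hypothetical (only "2" is named above), id is assumed to identify observations uniquely in both files, and v1 and v2 are assumed to have the same storage type (string or numeric) in both files.

```stata
* Sketch only. Assumes the master data are saved as "1.dta" (hypothetical name)
* and the using data as "2.dta", with id unique in both files.
use "1", clear

* Stata 11 or later: with update, missing values of same-named variables
* in the master are filled from the using data; nonmissing master values are kept.
merge 1:1 id using "2", update

* Older merge syntax (both files must already be sorted by id):
* merge id using "2", update

* _merge records, per observation, whether it came from master, using, or both,
* and whether a missing master value was updated.
tabulate _merge
```

Without "update", merge leaves same-named master variables untouched for matched observations, which is consistent with the merged data set shown above.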