From | Anders Alexandersson <andersalex@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: AW: combining datasets |
Date | Thu, 19 Aug 2010 11:55:32 -0400 |
Martine,

Also see [U] 22 Combining datasets.

Maarten provided an excellent append solution, with this being the main line:

. append using `a'

Here is the equivalent merge solution:

. merge 1:1 source id using `a', nogen

The two give the same result here because no source-id combination occurs in both datasets, so -merge- keeps every observation as unmatched.

The choice between append and merge matters more for large datasets, because you need the right variable naming scheme. Michael Mitchell gave a good tip in his data management book, described at http://www.stata.com/bookstore/dmus.html : if you will append datasets, you want the variable names to be the same, but if you will merge datasets, you want the variable names to be different.

Anders Alexandersson
andersalex@gmail.com

On Thu, Aug 19, 2010 at 4:34 AM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Wed, 18/8/10, martine etienne wrote:
>> firstly, person 1 in dataset A is NOT same person as person
>> 1 in dataset B, measurements are also taken at different times
>> secondly, I would like the final dataset to look like Final 1
>
> Here is an example of how to do that:
>
> *------------ begin example ------------
> // create the two datasets
> tempfile a b
>
> drop _all
> input id x
> 1 3
> 2 4
> end
> save `a'
>
> drop _all
> input id x
> 1 5
> 2 6
> end
> save `b'
>
> // create a new variable in each dataset
> // that identifies the source of those
> // observations
> use `a'
> gen source = "a"
>
> save `a', replace
>
> use `b'
> gen source = "b"
> save `b', replace
>
> // use -append- to stack the datasets
> append using `a'
>
> // create an extra id variable, which contains
> // a unique integer for each source-id combination
> // and attaches the values of the source and id
> // variables to the value label
> egen long new_id = group(source id), label
>
> // for display purposes I put the three id variables
> // to the left of the dataset
> order id source new_id
>
> // display the result
> list
> *--------------- end example ----------------
> (For more on examples I sent to the Statalist see:
> http://www.maartenbuis.nl/example_faq )
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
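
For reference, here is a minimal sketch of the merge variant mentioned in the reply. It reuses the two toy datasets from the quoted example and replaces -append- with -merge-. It assumes Stata 11 or later (for the -merge 1:1- syntax) and that everything is run in one go from a do-file, so that the tempfile macros stay defined. The final -sort- is added only to make the listing easier to read and is not part of either original post.

```stata
// create the two datasets, each with a source identifier
tempfile a b

drop _all
input id x
1 3
2 4
end
gen source = "a"
save `a'

drop _all
input id x
1 5
2 6
end
gen source = "b"
save `b'

// dataset b is now in memory; merge in dataset a on (source, id).
// No (source, id) pair occurs in both files, so nothing matches
// and all observations from both files are kept.
// -nogen- suppresses the _merge variable.
merge 1:1 source id using `a', nogen

// sort for display only (an addition, not in the original posts)
sort source id
list
```

The result has four observations, the same as after -append-. If the two files shared any source-id combination, -merge 1:1- would combine those rows into one observation, whereas -append- would keep both.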