Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | David Kantor <kantor.d@att.net> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: tricky data merge/joinby problem |
Date | Fri, 04 Mar 2011 11:28:04 -0500 |
Dimitry,

I still think that an m:m merge yields meaningless pairings. In your example, for bgid 2,

    bid  bgid  fracpop
    21   2     .3
    22   2     .2
    23   2     .5

Assuming that you have, in the second file,

    bgid  dateyq  bgpop
    2     2010q1  whatever
    2     2010q2  whatever
    2     2010q3  whatever
    2     2010q4  whatever

The first case (bid 21) would pair with 2010q1; the second (bid 22) with 2010q2; the third (bid 23) would be replicated and paired with 2010q3 and 2010q4.

I'm not sure that this is meaningful.

But now that I understand your expand-to-a-panel scheme, it does look correct. And it makes sense that it would be faster than -joinby-.
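To make the pairing concrete, here is a minimal sketch in Python (not Stata) that emulates the two behaviors for the bgid 2 rows above. The function names mm_merge and joinby are hypothetical helpers written for this illustration, and each handles a single key group only; they are not Stata's implementation.

```python
from itertools import product

# Rows for bgid 2 from the example above; bgpop is omitted because
# its values are not given in the thread.
file1 = [(21, 2, 0.3), (22, 2, 0.2), (23, 2, 0.5)]   # (bid, bgid, fracpop)
file2 = [(2, "2010q1"), (2, "2010q2"),
         (2, "2010q3"), (2, "2010q4")]               # (bgid, dateyq)


def mm_merge(master, using):
    """Emulate -merge m:m- within one key group: pair rows in order and
    repeat the last row of the shorter side until the longer side ends."""
    n = max(len(master), len(using))
    pairs = []
    for i in range(n):
        m = master[min(i, len(master) - 1)]
        u = using[min(i, len(using) - 1)]
        pairs.append((m[0], u[1]))   # (bid, dateyq)
    return pairs


def joinby(master, using):
    """Emulate -joinby- within one key group: every pairwise combination."""
    return [(m[0], u[1]) for m, u in product(master, using)]


mm = mm_merge(file1, file2)
jb = joinby(file1, file2)

for bid, dateyq in mm:
    print(bid, dateyq)
print(len(mm), "rows from the m:m pairing;", len(jb), "rows from all pairwise combinations")
```

Under these assumptions the m:m pairing gives four rows, so bid 21 never meets 2010q2 through 2010q4, while the pairwise join gives all twelve block-by-quarter rows.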
Best wishes,
--David

At 11:13 AM 3/4/2011, you wrote:
David,

I wrote m:m merge since each BG usually appears more than once in the first file (since blocks are the ids) and more than once in the second (since it's a block group panel). I checked a few cases with the real data and it seems to have worked. I just wanted to make sure that there was nothing that I was missing, and I was hoping to find a special case that does not produce garbage.

By expanding into a panel, I meant stacking file1 on top of itself four times (4 quarters of 2010) and creating a dateyq variable. The data would not change over time, but it seemed to make an m:1 merge by date and bgid easier (at least in my head).

The reason I wanted to try merge is that it appears to be much faster than joinby, which has been running for a long time on a pretty fast server.

DVM
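The expand-to-a-panel scheme described above can be sketched as follows, again in Python rather than Stata. The bgpop values are made-up placeholders (the thread only says "whatever"), and the dictionary lookup stands in for an m:1 merge on bgid and dateyq.

```python
quarters = ["2010q1", "2010q2", "2010q3", "2010q4"]

# (bid, bgid, fracpop) for bgid 2, taken from the example in this thread.
blocks = [(21, 2, 0.3), (22, 2, 0.2), (23, 2, 0.5)]

# Hypothetical block-group populations, one row per (bgid, dateyq).
bgpop = {
    (2, "2010q1"): 1000,
    (2, "2010q2"): 1010,
    (2, "2010q3"): 1020,
    (2, "2010q4"): 1030,
}

# Step 1: stack the block file once per quarter and add dateyq.
panel = [(bid, bgid, frac, q) for q in quarters for (bid, bgid, frac) in blocks]

# Step 2: m:1 lookup. Many block rows share one (bgid, dateyq) key, and
# each key matches exactly one block-group row.
merged = [(bid, bgid, q, frac, bgpop[(bgid, q)]) for (bid, bgid, frac, q) in panel]

for row in merged:
    print(row)
```

Because every (bgid, dateyq) key is unique on the block-group side, each of the twelve block-by-quarter rows gets exactly one match, which is the property that makes the m:1 form well defined.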