Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Andreanne Tremblay Simard <atremblay10@schulich.yorku.ca> |
To | statalist <statalist@hsphsun2.harvard.edu> |
Subject | st: Preparing the data for merge - No unique identifier and problems with regexs |
Date | Wed, 18 May 2011 15:34:38 -0400 |
Dear Stata Users,

I have one dataset describing companies, with the firm name (fname) as the identifier. I have another dataset describing the mutual funds owned by the firms in the first dataset, with the variable fund_name as the identifier. I eventually want to merge the two datasets (one-to-many). However, the fund names (fund_name) and the firm names (fname) are not the same, although they generally have a common part.

For example, here are two firm names (fname) and the funds operated by these firms (fund_name):

fname
Alpha Capital
3g Top Capital

fund_name
Alpha Growth
3g Top Capital Fund I
3g Income

How can I match these observations, so that Alpha Capital goes with Alpha Growth, and 3g Top Capital goes with both 3g Top Capital Fund I and 3g Income? That is, I want to get the following:

fname             fund_name
Alpha Capital     Alpha Growth
3g Top Capital    3g Top Capital Fund I
3g Top Capital    3g Income

My original idea was to use regexs() to extract the first part of the firm and fund names, and then merge on the extracted first part. However, I don't seem to be doing this right: regexs() seems to truncate the names from the end. Because the names have a varying number of words (separated by spaces), I can't take, say, the second-to-last word, since a name consisting of a single word has no second-to-last word.

Thank you for your input, and for helping me learn more about Stata.

Best regards,
Andréanne Tremblay
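The idea described above (extract the first part of each name, then merge on it) could be sketched in Stata roughly as follows. This is only a sketch under stated assumptions: the variable key, the tempfile firms, and the rule "first space-delimited word, lower-cased" are illustrative choices, not anything given in the post. The rule works only if no two firms share the same first word. The toy data are the example names from the message, and the merge syntax assumes Stata 11 or later.

```stata
* Toy firm data, copied from the example in the post.
clear
input str30 fname
"Alpha Capital"
"3g Top Capital"
end

* key (hypothetical name) = first space-delimited word, lower-cased.
* word(s, 1) returns the first word, so names with a single word still work.
generate key = lower(word(fname, 1))

* An m:1 merge requires key to identify firms uniquely; isid stops with an error if it does not.
isid key
tempfile firms
save `firms'

* Toy fund data, copied from the example in the post.
clear
input str30 fund_name
"Alpha Growth"
"3g Top Capital Fund I"
"3g Income"
end

* Same key built with regexm()/regexs(): the ^ anchors the match at the
* START of the string, so the first run of non-space characters is captured.
generate key = lower(regexs(1)) if regexm(fund_name, "^([^ ]+)")

* Many funds to one firm.
merge m:1 key using `firms'
list fname fund_name _merge, noobs
```

On real data, a first-word key is likely to be too crude (several firms may begin with the same word, and spellings may differ), so the values of _merge are worth inspecting before the matches are trusted.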