Title | Efficiently defining group characteristics to create subsets |
Author | Christopher F. Baum, Boston College |
Say that your cross-sectional dataset contains microdata—a record for each employee, for instance—and you want to associate each employee's workplace with an industry code. That information is not on the record but is available to you. How do you get this associated information (which might also be, e.g., the code for a specific pension plan or the state) on the record without manual editing or a long sequence of statements with if clauses? The latter method is perhaps familiar to users of other statistical packages, but there is a better way.
Let us presume that we have a Stata dataset, employee, containing the individual-specific measurements as well as wpid, the workplace ID. Assume that wpid can be treated as an integer; if it were a string code, that could easily be handled as well.
Create a text file containing two columns: the workplace ID (wpid) and the industry code (indcod). For instance,
12367 321
12467 313
13211 321
...   ...
23435 371
32156 341
Read the file into Stata with infile wpid indcod using filename (where filename is your text file), then sort wpid, and save the result as the Stata dataset wpchar.
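As a minimal sketch of this step, the following writes a three-line text file and reads it back, so it can be run as is. The file name wpchar.txt is an assumption made for illustration. The explicit storage types long and int are a precaution, not part of the original instructions: infile otherwise stores numbers as float, which cannot hold integers above 16,777,216 exactly.

```stata
* Write a tiny example file; wpchar.txt is a hypothetical name
clear all
file open fh using "wpchar.txt", write text replace
file write fh "12367 321" _n
file write fh "12467 313" _n
file write fh "13211 321" _n
file close fh

* Read the two columns in free format
infile long wpid int indcod using "wpchar.txt", clear

* Each workplace should appear only once in the lookup file;
* isid stops with an error if wpid is not unique
isid wpid

sort wpid
save wpchar, replace
```

The isid check is optional, but a duplicated workplace ID would otherwise surface only later, when the m:1 merge refuses to run.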
Now use the employee file and give the commands
. sort wpid
. merge m:1 wpid using wpchar
. tab _merge
Presuming that you have industry codes for all workplaces, you should find that all employees now have an indcod variable defined. If there are missing values in indcod, list the wpids for which indcod is missing and add those workplaces to your text file. When you are satisfied that the merge has worked properly, type
. drop _merge
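The merge and the check for missing codes can be sketched on toy data as follows. The dataset name wpchar_demo, the variable empid, and workplace 99999 (deliberately left out of the lookup file) are all hypothetical; they exist only so the example runs without touching your own files.

```stata
clear all

* Workplace lookup; wpchar_demo is a hypothetical stand-in for wpchar
input long wpid int indcod
12367 321
12467 313
13211 321
end
save wpchar_demo, replace

* Toy employee file; workplace 99999 has no entry in the lookup
clear
input int empid long wpid
1 12367
2 12367
3 12467
4 13211
5 99999
end

merge m:1 wpid using wpchar_demo
tabulate _merge

* _merge == 1 marks employee records whose workplace found no industry code
tabulate wpid if _merge == 1

* _merge == 2 would mark workplaces with no employees; drop any such rows
drop if _merge == 2
drop _merge
```

In this toy example, the tabulation of unmatched records points at workplace 99999, which is the ID to add to the text file. With the m:1 syntax, merge sorts the data itself when needed, so the explicit sort beforehand is harmless rather than required.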
This is a good example of the power and flexibility of Stata’s merge command. The merge facility does not perform just one-to-one merges; in this example, it performs a many-to-one (m:1) merge, attaching the single record for each workplace to every employee at that workplace. A clear advantage of this technique appears when you have more than one characteristic to add to each employee record: for instance, an industry code, the number of employees of the firm, and the total sales of the firm. Any number of such firm-level variables could be added to the records in the wpchar file and merged onto the employee file with the same command.
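To illustrate the point about several firm-level variables, here is a small sketch in which the workplace file carries three characteristics. The dataset name wpfirm_demo, the variables nemp, sales, and empid, and every value shown are made up for illustration.

```stata
clear all

* Workplace file with several firm-level variables
* (nemp, sales, and all values are hypothetical)
input long wpid int indcod long nemp double sales
12367 321 250 12000000
12467 313 40 900000
13211 321 1100 85000000
end
save wpfirm_demo, replace

* Toy employee file (empid is hypothetical)
clear
input int empid long wpid
1 12367
2 12367
3 12467
4 13211
end

* The same merge command; every variable in the using file comes along
merge m:1 wpid using wpfirm_demo
tabulate _merge
drop _merge

list empid wpid indcod nemp sales
```

If only some of the workplace variables are wanted, merge's keepusing() option restricts which ones are brought over.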
Unlike an approach that depends on a long list of conditional statements, such as replace indcod=321 if inlist(wpid,12367,13211,...), this approach gives you a Stata dataset containing your workplace ID numbers, so you can easily see whether a particular code is in your list. This is especially useful when you revise the list for a new set of workplaces.
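Because the list lives in a dataset, inspecting and revising it takes only a few commands. The sketch below assumes a hypothetical lookup file wpchar_demo and a made-up new workplace (40001, code 341); it is one way to do this, not the only one.

```stata
clear all

* Hypothetical lookup file standing in for wpchar
input long wpid int indcod
12367 321
12467 313
13211 321
end
save wpchar_demo, replace

* Is workplace 13211 in the list, and with which code?
list if wpid == 13211

* Which workplaces carry industry code 321?
list wpid if indcod == 321

* Revise the list: enter the new workplace (made-up values),
* append the existing entries, and confirm the IDs are still unique
clear
input long wpid int indcod
40001 341
end
append using wpchar_demo
isid wpid
sort wpid
save wpchar_demo, replace
```

Rerunning the merge against the revised file then picks up the new workplace with no change to the merge command itself.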
© Copyright 1996–2024 StataCorp LLC. All rights reserved.