Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Ben Hoen" <bhoen@lbl.gov> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: Pulling in files and data stored in a folder tree |
Date | Fri, 27 Jul 2012 11:26:31 -0400 |
Hi Statalisters,

I have a set of ~200,000 records stored in one dataset (the "master file"), each of which has a year and a county to which it applies, and a unique record id. Separately, I have a large set of files that are stored by county (there are 20 counties, so there are 20 county folders) and by year (each county has 10 year folders, 2002 through 2011). In each year folder there are 4 files that I want to pull data from, via a 1:1 merge with the master file using the record id. There are roughly 10 variables I want to add to the master file from these 4 files, or approximately 2 to 3 from each file.

So, the question is how I might write code that will go through each record in the master file, determine the year and the county, go through the folder tree to find the appropriate year in the appropriate county, and then merge with the four files, keeping the data from the 10 variables.

A few things to note:

1) the files I want to pull data from are column separated text files (i.e., I have not gone through the trouble of converting them to Stata files yet, but could); and,

2) all of the files from which I want to pull data are named by county and year (e.g., <countyname>_<year>_<filename>), and these names match exactly with the county names and years stored in the master file.

I suspect many have done this type of thing before, so if anyone has some reading that they could point me to, I would be very appreciative.

Thanks in advance,

Ben

Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen@lbl.gov
http://eetd.lbl.gov/ea/emp/staff/hoen.html
Visit our publications at: http://eetd.lbl.gov/ea/ems/emp-pubs.html
Sign up for our email list to receive publication notifications: http://eetd.lbl.gov/ea/emp/list/emp_pubs_signup.php
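
For concreteness, the task described above might be sketched roughly as follows. This is only a sketch under stated assumptions, not a tested solution: the root folder, the master file name, the four file-name stems, the .txt extension, and the variable names county and record_id are all hypothetical placeholders. It also assumes the text files are delimited in a way that -import delimited- can read (Stata 13 or later; -insheet- plays a similar role in earlier versions), and that a given variable has the same type in every county-year file, since -append- will otherwise stop with an error. Rather than visiting each master record one at a time, the sketch stacks all county-year files of each kind and then merges each stack once on the record id.

```stata
* Minimal sketch; run as a do-file. All names below are assumptions.
* Hypothetical root of the folder tree, laid out as root/county/year:
local root "C:/data/counties"
* Hypothetical <filename> parts of the four files in each year folder:
local files "file1 file2 file3 file4"

* Hypothetical master file; assumes the county variable is named county.
use "master.dta", clear
levelsof county, local(counties)
tempfile master
save `master'

* Build one stacked dataset per file type across all counties and years.
foreach f of local files {
    clear
    tempfile stack_`f'
    save `stack_`f'', emptyok
    foreach c of local counties {
        forvalues y = 2002/2011 {
            local path "`root'/`c'/`y'/`c'_`y'_`f'.txt"
            * Skip county-year combinations with no file on disk.
            capture confirm file "`path'"
            if _rc continue
            import delimited using "`path'", clear
            append using `stack_`f''
            save `stack_`f'', replace
        }
    }
}

* Merge each stack into the master file on the (hypothetical) record id.
use `master', clear
foreach f of local files {
    * Add keepusing(varlist) here to bring in only the wanted variables.
    merge 1:1 record_id using `stack_`f'', keep(master match) nogenerate
}
```

The 1:1 merge will only succeed if the record id is unique within each stacked dataset as well as in the master file; if it is not, -merge- reports an error, which is itself a useful check on the data.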