Hello!
I have two manufacturing databases that I need to combine. The problem is that each database is classified under a different industry coding system. I do have the correspondence codes to match the observations, but I am not sure what the best way to do the matching is.
Database A contains variables such as the total number of employees by industrial sector (v1), the total value of shipments by industrial sector (v2), and the annual growth rate of the industrial sector (v3). The sectors follow the SIC87 industry classification, so the database looks like this:
sic87 yr v1 v2 v3
2011 93 124.4 53.3 .0043177
2011 94 119.5 50.7 -.0043294
2011 95 125.8 51.4 -.0102257
2011 96 130 51.6 -.0452671
2013 93 48.7 2.1 .
2013 94 49.6 2.4 .047534
2013 95 48.5 2 .0065023
2014 95 9.6 1.6 .068254
2015 95 8.2 5.3 .0935813
I need to translate this entire database into the ISIC3 industry classification. The problem is that one SIC87 category can map to several ISIC3 categories, and several SIC87 categories can map to a single ISIC3 category.
For instance, suppose that my correspondences are as follows:
sic87 isic3
2011 2020
2011 2022
2011 2026
2013 2100
2014 2100
2015 2100
This means that sic87 category 2011 is now considered 3 separate categories (2020, 2022, and 2026), while all three categories 2013, 2014, and 2015 are now considered only one category 2100.
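To make the two cases explicit, here is a minimal sketch in Python (not Stata, just for illustration) that tabulates which SIC87 codes split and which ISIC3 codes merge. The names `pairs`, `targets`, `sources`, `splits`, and `merges` are made up for this example.

```python
from collections import defaultdict

# Correspondence pairs copied from the table above: (sic87, isic3)
pairs = [(2011, 2020), (2011, 2022), (2011, 2026),
         (2013, 2100), (2014, 2100), (2015, 2100)]

targets = defaultdict(list)   # sic87 -> ISIC3 codes it maps to
sources = defaultdict(list)   # isic3 -> SIC87 codes that feed it
for sic, isic in pairs:
    targets[sic].append(isic)
    sources[isic].append(sic)

# SIC87 codes that split into several ISIC3 codes
splits = {sic: codes for sic, codes in targets.items() if len(codes) > 1}
# ISIC3 codes that absorb several SIC87 codes
merges = {isic: codes for isic, codes in sources.items() if len(codes) > 1}

print(splits)
print(merges)
```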
I want to do the matching in two separate ways:
(a) The first way deals with variables that can easily be added across sectors, like the total number of employees by sector (v1) or the value of shipments by sector (v2). In this case, if multiple SIC87 categories are now classified as just one ISIC3 category, we can simply add the numbers across categories; if a single SIC87 category is now classified as several ISIC3 categories, we can divide the SIC87 value equally among the new ISIC3 categories.
(b) The second way deals with variables that cannot simply be added because the sum would be meaningless. For example, in the case of v3, when multiple SIC87 categories have different growth rates and these categories translate into only one ISIC3 category, we can take the average across those categories. On the other hand, if a single SIC87 category translates into several ISIC3 categories, each new category simply keeps the original growth rate.
So, if we look at SIC87 category 2011 for year 95, I want my code to do the following calculations:
isic3 yr v1 v2 v3
2020 95 =125.8/3 =51.4/3 =-.0102257
2022 95 =125.8/3 =51.4/3 =-.0102257
2026 95 =125.8/3 =51.4/3 =-.0102257
while SIC87 categories 2013, 2014, and 2015 for the same year would all fuse into one ISIC3 category to look like this:
isic3 yr v1 v2 v3
2100 95 =48.5+9.6+8.2 =2+1.6+5.3 =(.0065023+.068254+.0935813)/3
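To pin down the arithmetic I am after, here is a rough sketch in Python rather than Stata. It is only an illustration of the two rules, not a proposed solution: the function name `convert` and the dictionaries are invented for the example, and skipping missing growth rates in the average is my own assumption.

```python
from collections import defaultdict

# (sic87, isic3) correspondences from the example above
pairs = [(2011, 2020), (2011, 2022), (2011, 2026),
         (2013, 2100), (2014, 2100), (2015, 2100)]

# (sic87, yr) -> (v1, v2, v3); None stands for a missing value ("." in Stata)
data = {
    (2011, 93): (124.4, 53.3, 0.0043177),
    (2011, 94): (119.5, 50.7, -0.0043294),
    (2011, 95): (125.8, 51.4, -0.0102257),
    (2011, 96): (130.0, 51.6, -0.0452671),
    (2013, 93): (48.7, 2.1, None),
    (2013, 94): (49.6, 2.4, 0.047534),
    (2013, 95): (48.5, 2.0, 0.0065023),
    (2014, 95): (9.6, 1.6, 0.068254),
    (2015, 95): (8.2, 5.3, 0.0935813),
}

def convert(data, pairs):
    """Hypothetical helper: translate SIC87 rows into ISIC3 rows."""
    # Number of ISIC3 codes each SIC87 code maps to
    n_targets = defaultdict(int)
    for sic, _ in pairs:
        n_targets[sic] += 1

    sums = defaultdict(lambda: [0.0, 0.0])   # (isic3, yr) -> [v1, v2]
    rates = defaultdict(list)                # (isic3, yr) -> list of v3
    for (sic, yr), (v1, v2, v3) in data.items():
        for s, isic in pairs:
            if s != sic:
                continue
            share = 1 / n_targets[sic]       # rule (a): equal split
            sums[(isic, yr)][0] += v1 * share
            sums[(isic, yr)][1] += v2 * share
            if v3 is not None:               # assumption: ignore missing rates
                rates[(isic, yr)].append(v3)

    result = {}
    for key, (v1, v2) in sums.items():
        r = rates[key]
        # rule (b): unweighted average of the available growth rates
        result[key] = (v1, v2, sum(r) / len(r) if r else None)
    return result

result = convert(data, pairs)
print(result[(2020, 95)])
print(result[(2100, 95)])
```

Splitting first and summing afterwards means a SIC87 code that both splits and merges would still be handled, although my example has no such case.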
Any ideas on how to achieve this?
Thank you.
Adrian
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/