Dear "statalist" members,
I am working on two separate data manipulation works, facing some difficulties.
If you could advise me of the way I can process the dataset, it would be highly appreciated.
The first data manipulation work is:
I have 20 datasets (the same dataset over 20 years) with the same variables and structure as follows.
(the first line is variable names and some observations are below)
STATE COUNTY OBJECTS AMOUNTS
01 178 GG 100,000
01 166 SW 200,000
01 778 DL 50,000
03 336 GG 86,000
03 227 SW 33,000
; This is a dataset that explains how the federal government money is distributed to each State and County by Object (such as Salary, Insurance)
OBJECTS has about 9 categories and I have code for OBJECTS (e.g. GG mean federal grants)
I have 20 years dataset, so I will generate “year” variable for each dataset.
The final dataset I want is:
YEAR STATE GG SW DL (and other OBJECT ITEMs below)
1997 01 $1000,000 $2000,000 $340,000
1997 02 $2000,000 $3000,000 $345,000
1998 01 $3000,000 $2400,000 $345,000
1998 02 $5000,000 $3400,000 $367,000
In short, I want to have the dataset that summarizes how much money is distributed based on OBJECTS categories by each STATE; in other words, I have to make each OBJECTS category a “variable” and want to collapse the dataset by STATE.
I don’t have problem with appending the 20 datasets, but I think I need some “foreach” and “local macro” commands that perform the data manipulation work before appending the whole datasets.
Please advice me of the best way I can do.
The Second Work is:
I have two kinds of datasets: one is assessment value of houses and the second is property tax exemptions
Dataset type 1 (House Price Assessment Data)
MUNICODE ID ASMT SCH_CODE
10001000 1 $1000,000 1001
10001000 2 $2000,000 1002
10002020 1 $1000,000 1003
MUNICODE : code for municipality (town of Manlius)
ID : house ID (start from 1 and repeats in each municipality from 1 again)
ASMT : assessment value of house for each ID
SCH_CODE : school district codes (generally school districts are not matching with municipalities;
one municipality can include multiple school districts)
Dataset type 2 (Property Tax Exemption Data)
MUNICODE ID EX_CODE EX_NBR EX_AMT
10001000 1 1200 1 $200,000
10001000 1 2100 2 $300,000
10001000 1 3100 3 $200,000
10001000 2 3100 1 $200,000
10002020 1 1200 1 $100,000
MUNICODE and ID are the same as in the dataset type 1
EX_CODE : exemption code (e.g. 1200 means exemptions for senior owners)
EX_NBR : sequence number of exemptions applied to each ID (multiple exemptions can be applied to one property)
EX_AMT : exemption amount per each exemption code
(In sum, this exemption datasets don’t include school district code but municipality code and IDs so that I can merge the two files by MUNICODE and ID.
But since house owner can have multiple exemptions I have to rearrange this dataset before merging)
The Final Dataset I want
MUNICODE ID ASMT EX_CODE (1200) EX_CODE2(2100) EX_CODE3(3100) SCH_CODE
10001000 1 $1000,000 $200,000 $300,000 $200,000 1001
10001000 2 $2000,000 - - $200,000 1002
10002020 1 $1000,000 $100,000 - - 1003
In the final dataset, I want to merge the two types of dataset and make the EX_AMT (amount of exemption) for each EX_CODE (exemption code) variables.
I have been studying many command such as “foreach” “reshape” but cannot figure out how to do this (I also have at least basic concepts for macros)
This is really important step for my dissertation. Please advice me of the best way.
<mailto:[email protected]>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/