Evidently there is an equivalence table that tells
you that the mapping is
xx -> lm, pq
pl, da, gr -> pl
fa -> fa
-> no
etc.
independently of the two files. You need to
tell us how that is structured, I believe.
Nick
[email protected]
Julia Gamas
> I have to merge two databases from two different years where
> each observation is
> a census tract. The challenge is avoiding double counting
> after the merge, as a
> result of census tracts changing through time. I would like
> to create a code
> that identifies the groups of tracts that split up or merged
> but am not sure
> how to do it. Other than doing it by hand (I have 4000
> tracts), I'm wondering
> if there is a somewhat automatic way to do this in Stata.
> Currently, if I were to merge the two datasets using the
> corresponding equivalence table I would get something like this.
> year 1 year 2
> Code pop code pop what happened to tract over time:
> xx 22 lm 10 census tract from year 1 split into two
> xx 22 pq 5 "
> pl 12 pl 5 census tract remained but got bigger because
> da 5 pl 5 census tracts from year 1 merged into an existing
> gr 1 pl 5 tract
> fa 8 fa 16 census tract remained the same
> 00 0 no 10 census tract did not exist in year 1 but
> appeared in year
> 2
> 00 0 lp 8 "
> ll 10 00 0 census tract existed in year 1 but
> disappeared in year 2
> pp 10 00 0 "
>
> If at this point I were to collapse (sum) the population for
> year 2, for
> example, then I would end up triple counting population in
> census tract pl,
> similarly, if I were to collapse (sum) population for year 1.
> Population for
> pl in year 2 would be counted as 15 when it is only 5.
> Population for tract xx
> would be counted as 44 when it is only 22. As an
> intermediate step I was
> thinking of doing the following:
> year 1 year 2
> newcode Code pop code pop
> 1 xx 22 lm 10
> 1 xx 22 pq 5
> 2 pl 12 pl 5
> 2 da 5 pl 5
> 2 gr 1 pl 5
> 3 fa 8 fa 16
> 4 00 0 no 10
> 5 00 0 lp 8
> 6 ll 10 00 0
> 7 pp 10 00 0
>
> That way, I could keep track of the population for each tract
> or "group" of
> tracts and end up with
> something like this:
> year 1 year 2
> newcode pop pop
> 1 22 15
> 2 18 5
> 3 8 16
> 4 0 10
> 5 0 8
> 6 10 0
> 7 10 0
> How do I tell stata how to generate newcode (i.e. can I get
> stata to recognize
> groups of census tracts with the same code and assign them a
> newcode, for both
> years)?
> Any help would be appreciated.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/