You can probably do this in several passes:
0. Check on births and deaths:
count if year1code == "00" & year2code == "00"
<should be zero>
1. Classify new ones and abolished ones:
egen g = group(year2code) if year1code == "00"
su g, meanonly
local ng = r(max)
egen g2 = group(year1code) if year2code == "00"
replace g = g2 + `ng' if year2code == "00"
su g, meanonly
local ng = r(max)
2. Classify year2 codes not yet classified:
egen g3 = group(year2code) if g == .
If these occur several times, areas
have merged, so you need to count them
to identify the multiples.
...
3. Classify year1 codes not yet classified:
egen g4 = group(year1code) if g == .
...
(Not complete. Just some possibilities...)
Nick
[email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Julia Gamas
> Sent: 06 May 2005 16:22
> To: [email protected]
> Subject: st: Merge changing census tracts
>
>
> Dear all,
> the equivalence table in my example is structured as follows:
>
> year 1 year 2
> Code code what happened to tract over time:
>
> xx lm census tract from year 1 split into two
> xx pq "
> pl pl census tract remained but got bigger because
> da pl census tracts from year 1 merged into an existing
> gr pl tract
> fa fa census tract remained the same
> 00 no census tract did not exist in year 1 but appeared in year
> 00 lp "
> ll 00 census tract existed in year 1 but disappeared in year 2
> pp 00 "
>
> Is this the information you need?
> Thank you for your help.
>
> Julia
>
> Date: Thu, 5 May 2005 21:22:22 +0100
> From: "Nick Cox" <[email protected]>
> Subject: st: RE: Merge chaniging census tracts
>
> Evidently there is an equivalence table that tells
> you that the mapping is
>
> xx -> lm, pq
> pl, da, gr -> pl
> fa -> fa
> -> no
>
> etc.
>
> independently of the two files. You need to
> tell us how that is structured, I believe.
>
> Nick
> [email protected]
>
> Julia Gamas
>
> > I have to merge two databases from two different years where
> > each observation is
> > a census tract. The challenge is avoiding double counting
> > after the merge, as a
> > result of census tracts changing through time. I would like
> > to create a code
> > that identifies the groups of tracts that split up or merged
> > but am not sure
> > how to do it. Other than doing it by hand (I have 4000
> > tracts), I'm wondering
> > if there is a somewhat automatic way to do this in Stata.
> > Currently, if I were to merge the two datasets using the
> > corresponding equivalence table I would get something like this.
> > year 1 year 2
> > Code pop code pop what happened to tract over time:
> > xx 22 lm 10 census tract from year 1 split into two
> > xx 22 pq 5 "
> > pl 12 pl 5 census tract remained but got bigger because
> > da 5 pl 5 census tracts from year 1 merged into
> an existing
> > gr 1 pl 5 tract
> > fa 8 fa 16 census tract remained the same
> > 00 0 no 10 census tract did not exist in year 1 but
> > appeared in year
> > 2
> > 00 0 lp 8 "
> > ll 10 00 0 census tract existed in year 1 but
> > disappeared in year 2
> > pp 10 00 0 "
> >
> > If at this point I were to collapse (sum) the population for
> > year 2, for
> > example, then I would end up triple counting population in
> > census tract pl,
> > similarly, if I were to collapse (sum) population for year 1.
> > Population for
> > pl in year 2 would be counted as 15 when it is only 5.
> > Population for tract xx
> > would be counted as 44 when it is only 22. As an
> > intermediate step I was
> > thinking of doing the following:
> > year 1 year 2
> > newcode Code pop code pop
> > 1 xx 22 lm 10
> > 1 xx 22 pq 5
> > 2 pl 12 pl 5
> > 2 da 5 pl 5
> > 2 gr 1 pl 5
> > 3 fa 8 fa 16
> > 4 00 0 no 10
> > 5 00 0 lp 8
> > 6 ll 10 00 0
> > 7 pp 10 00 0
> >
> > That way, I could keep track of the population for each tract
> > or "group" of
> > tracts and end up with
> > something like this:
> > year 1 year 2
> > newcode pop pop
> > 1 22 15
> > 2 18 5
> > 3 8 16
> > 4 0 10
> > 5 0 8
> > 6 10 0
> > 7 10 0
> > How do I tell stata how to generate newcode (i.e. can I get
> > stata to recognize
> > groups of census tracts with the same code and assign them a
> > newcode, for both
> > years)?
> > Any help would be appreciated.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/