[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
st: RE: Merge changing census tracts

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Merge changing census tracts
Date	Fri, 6 May 2005 18:00:54 +0100
You can probably do this in several passes: 

0. Check on births and deaths:  

count if year1code == "00" & year2code == "00"

<should be zero> 

1. Classify new ones and abolished ones: 

egen g = group(year2code) if year1code == "00" 
su g, meanonly 
local ng = r(max) 

egen g2 = group(year1code) if year2code == "00"

replace g = g2 + `ng' if year2code == "00" 
su g, meanonly 
local ng = r(max) 

2. Classify year2 codes not yet classified: 

egen g3 = group(year2code) if g == . 

If these occur several times, areas 
have merged, so you need to count them 
to identify the multiples. 

... 

3. Classify year1 codes not yet classified: 

egen g4 = group(year1code) if g == . 

... 

(Not complete. Just some possibilities...) 

Nick 
[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Julia Gamas
> Sent: 06 May 2005 16:22
> To: [email protected]
> Subject: st: Merge changing census tracts
> 
> 
> Dear all,
> the equivalence table in my example is structured as follows:
> 
> year 1 year 2
> Code   code  what happened to tract over time:
> 
> xx     lm    census tract from year 1 split into two 
> xx     pq    "
> pl     pl    census tract remained but got bigger because
> da     pl    census tracts from year 1 merged into an existing
> gr     pl    tract
> fa     fa    census tract remained the same
> 00     no    census tract did not exist in year 1 but appeared in year
> 00     lp    "
> ll     00    census tract existed in year 1 but disappeared in year 2
> pp     00    "
> 
> Is this the information you need?
> Thank you for your help.
> 
> Julia
> 
> Date: Thu, 5 May 2005 21:22:22 +0100
> From: "Nick Cox" <[email protected]>
> Subject: st: RE: Merge chaniging census tracts
> 
> Evidently there is an equivalence table that tells 
> you that the mapping is 
> 
>    xx         -> lm, pq 
>    pl, da, gr -> pl         
>    fa         -> fa 
>               -> no 
> 
>    etc. 
> 
> independently of the two files. You need to 
> tell us how that is structured, I believe. 
> 
> Nick 
> [email protected] 
> 
> Julia Gamas
> 
> > I have to merge two databases from two different years where 
> > each observation is
> > a census tract. The challenge is avoiding double counting 
> > after the merge, as a
> > result of census tracts changing through time. I would like 
> > to create a code
> > that identifies the groups of tracts that split up or merged 
> > but am not sure
> > how to do it.  Other than doing it by hand (I have 4000 
> > tracts), I'm wondering
> > if there is a somewhat automatic way to do this in Stata.
> > Currently, if I were to merge the two datasets using the
> > corresponding equivalence table I would get something like this.
> > year 1     year 2
> > Code pop  code  pop  what happened to tract over time:
> > xx    22  lm     10  census tract from year 1 split into two 
> > xx    22  pq     5   "
> > pl    12  pl     5   census tract remained but got bigger because
> > da     5  pl     5   census tracts from year 1 merged into 
> an existing
> > gr     1  pl     5   tract
> > fa     8  fa     16  census tract remained the same
> > 00     0  no     10  census tract did not exist in year 1 but 
> > appeared in year
> > 2
> > 00     0  lp     8   "
> > ll    10  00     0   census tract existed in year 1 but 
> > disappeared in year 2
> > pp    10  00     0   "
> > 
> > If at this point I were to collapse (sum) the population for 
> > year 2, for
> > example, then I would end up triple counting population in 
> > census tract pl,
> > similarly, if I were to collapse (sum) population for year 1. 
> >  Population for
> > pl in year 2 would be counted as 15 when it is only 5.  
> > Population for tract xx
> > would be counted as 44 when it is only 22.  As an 
> > intermediate step I was
> > thinking of doing the following:
> >          year 1     year 2
> > newcode  Code  pop code  pop  
> > 1        xx    22  lm     10  
> > 1        xx    22  pq     5   
> > 2        pl    12  pl     5   
> > 2        da     5  pl     5   
> > 2        gr     1  pl     5   
> > 3        fa     8  fa     16  
> > 4        00     0  no     10  
> > 5        00     0  lp     8   
> > 6        ll    10  00     0    
> > 7        pp    10  00     0   
> > 
> > That way, I could keep track of the population for each tract 
> > or "group" of
> > tracts and end up with
> > something like this:
> >          year 1   year 2
> > newcode  pop      pop  
> > 1        22        15 
> > 2        18         5      
> > 3        8         16   
> > 4        0         10 
> > 5        0          8
> > 6        10         0  
> > 7        10         0
> > How do I tell stata how to generate newcode (i.e. can I get 
> > stata to recognize
> > groups of census tracts with the same code and assign them a 
> > newcode, for both
> > years)?
> > Any help would be appreciated.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Prev by Date: RE: st: RE: Population Attributable Fraction (PAF) estimation
Next by Date: Re: st: Shell
Previous by thread: st: re: shell
Next by thread: st: Rolling correlations
Index(es):
- Date
- Thread