Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Aggregating spatial units for SPMAP
From
Robert Picard <[email protected]>
To
[email protected]
Subject
Re: st: Aggregating spatial units for SPMAP
Date
Sat, 22 Dec 2012 09:22:17 -0800
Thanks to Kit Baum, a much improved version of -mergepoly- with help
file is now available through SSC. To install it, type -ssc install
mergepoly- in Stata. Here's the program description:
mergepoly is used to merge adjacent polygons from a shape
boundary file. Shape boundary files are usually obtained from
outside sources and converted to Stata datasets using -shp2dta-
(SSC). The shape dataset contains one record per geographic feature and the
coordinate dataset describes the shape of each feature in terms
of points. The points, in order, form polygons. mergepoly creates
new polygons that follow the outer border of the adjacent polygons
by removing all shared line segments.
Enjoy, Robert
On Sun, Dec 16, 2012 at 3:33 PM, Robert Picard <[email protected]> wrote:
> I think I may have a solution. The following program takes a
> shapefile and creates new polygons that have the shape of the
> outer borders of the adjacent polygons. It builds on the fact
> that adjacent polygons have common points along line segments
> that are shared. The outer borders are composed of polylines
> (connected line segments) that start and end with a shared point
> with all in-between points unique to a single polygon. It's a bit
> tricky to connect all the outer border polylines, but minimal
> testing suggests that the program works.
>
> Let me know if the program works or not in your case. If no
> problem is reported, I'll add bells and whistles and write up a
> help file and post it to SSC.
>
> Hope this helps, Robert
>
> * --------------------------------------------------------------
> version 12
> clear all
>
> *! 1.0.0 16dec2012 Robert Picard, [email protected]
> program define mergepoly
>
> args id0 lat0 lon0
>
> rename `lat0' lat
> rename `lon0' lon
>
> gen obs = _n
>
> // create new id in case there are multipart polygons
> gen id = sum(mi(lat,lon))
> sort id obs
> qui by id: drop if _n == 1
>
> // each polygon should start and end at the same lat/lon
> by id: assert lat[1] == lat[_N]
> by id: assert lon[1] == lon[_N]
>
> // adjacent polygons have common points
> sort lat lon obs
> by lat lon: gen N = _N
>
> // identify polylines that form the outside borders;
> // they start and end at common points and have a
> // run of non-adjacent points
> sort id obs
> by id: gen pstart = N > 1 & N[_n+1] == 1
> by id: gen pend = N > 1 & N[_n-1] == 1
> by id: gen inpoly = sum(pstart - pend)
> qui replace inpoly = 1 if pend
> gen polyline = sum(pstart) * inpoly
>
> // link polylines by grouping coordinates and
> // identifying the next polyline
> sort lat lon pend pstart obs
> by lat lon: egen pp = total(pstart + pend)
> assert pp <= 2
> by lat lon: assert pend if pp == 2 & _n == _N
> by lat lon: assert pstart if pp == 2 & _n == _N - 1
> qui by lat lon: gen next = polyline[_n-1] if _n == _N & pp == 2
>
> // handle special case of line segments on the outer border;
> // these are not polylines, they just have 2 points and they
> // create a bridge between two polylines.
> sort lat lon pend obs
> by lat lon: gen sameid = id == id[_N]
> sort lat lon pend sameid obs
> qui by lat lon: gen link = obs[1]+1 if pend & pp == 1
> sort lat lon pstart obs
> qui by lat lon: replace link = obs[1] if pstart & pp == 1
> sort pp link pstart pend
> qui by pp link: replace next = polyline[2] if pp == 1 & pend
> qui by pp link: assert _N == 2 if pp == 1 & !mi(link)
>
> qui keep if polyline
> sort polyline obs
> by polyline: gen one = _n == 1
>
> // some polylines may be polygons already
> by polyline: gen ispolyline = lat[1] != lat[_N] | lon[1] != lon[_N]
> sort ispolyline polyline obs
> qui gen newpolygon = sum(one) if !ispolyline
>
> // loop over each polyline in order and build new polygons that
> // follow the outer borders of adjacent polygons
> sum newpolygon, meanonly
> local curpolygon = cond(mi(r(max)), 1, r(max) + 1)
> sum polyline if mi(newpolygon), meanonly
> local nextpl = r(min)
> local more = !mi(r(N))
> local startpline `nextpl'
>
> qui gen pline_order = .
>
> local i 0
> while `more' {
>
> local i = `i' + 1
> qui replace pline_order = `i' if polyline == `nextpl'
> qui replace newpolygon = `curpolygon' if polyline == `nextpl'
>
> sum next if polyline == `nextpl', meanonly
> if `nextpl' == r(max) local startpline = r(max)
> local nextpl = r(max)
>
> if `nextpl' == `startpline' {
> local curpolygon = `curpolygon' + 1
> sum polyline if mi(newpolygon), meanonly
> local nextpl = r(min)
> local startpline `nextpl'
> }
>
> qui count if mi(newpolygon)
> local more = r(N)
> }
>
> // each new polygon must start with missing lat/lon
> sort newpolygon pline_order obs
> qui expand 2 if one
> sort newpolygon pline_order obs
> qui by newpolygon: replace lat = . if _n == 1
> qui by newpolygon: replace lon = . if _n == 1
>
> rename newpolygon ext`id0'
> keep `id0' lat lon ext`id0'
> rename lat `lat0'
> rename lon `lon0'
>
> end
>
> use "http://fmwww.bc.edu/repec/bocode/i/Italy-RegionsCoordinates.dta", clear
>
> mergepoly _ID _Y _X
>
> line _Y _X, lwidth(vvthin) lcolor(blue) cmissing(n) ///
> aspectratio(1.2) xscale(off) yscale(off) ///
> ylabel(minmax, nogrid) xlabel(minmax)
>
> * --------------------------------------------------------------
>
>
>
>
> On Fri, Dec 14, 2012 at 6:11 AM, Karsten Pfaff
> <[email protected]> wrote:
>> Dear Statalist members,
>> I would like to refer first to the following thread that emerged some
>> month earlier on statalist:
>>
>> http://www.stata.com/statalist/archive/2012-06/msg00384.html
>>
>> My problem is similar. I have a shapefile (coordinate file) containing
>> polygon coordinates about a certain type of regions (calles "kreise") in
>> Germany, which allows me to draw a grid of Germany that divides Germany in
>> just these regions. However, I am not interested in the regions themselves
>> but in some kind of higher order regions called "ror". Of course I also
>> have information about which "kreise" are part of which "ror" (just
>> imagine you have coordinates about the counties of the USA and based on
>> these coordinates you would like to get the coordinates of the federal
>> states, knowing which countries belong to which federal state). The data
>> file looks as follows (illustrative example):
>>
>> ror kreise (=_ID) _Y _X
>> 101 1 9.0468785 54.872552
>> 101 1 9.0646343 54.871183
>> 101 1 9.0649463 54.871159
>> 101 1 9.090514 54.70260
>> 101 2
>> 101 2
>> 101 2 some coordinates for kreise 2 & 3
>> 101 2
>> 101 2
>> 101 2
>> 102 3
>> 102 3
>>
>> So obviously, ror 101 contains kreise 1 and 2.
>> Based on these information, I would like to create a shapefile that draws
>> polygones of each ror, and ignores the "within" borders. Is there any
>> possibility to do so? I think there must be some way since all the
>> necessary information to create a map containing but the rors are at hand.
>> Any help or suggestion is highly appreciated.
>>
>> Best regards Karsten Pfaff
>>
>>
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/