Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Maurizio Pisati <maurizio.pisati@unimib.it> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Randomize points to map polygons using -spmap- ? |
Date | Tue, 16 Jul 2013 11:06:10 +0200 |
Dear Andrew, what do you mean exactly by "randomize cases within NAME_1"? Anyway, please find below a Stata command I wrote a few years ago for personal use (i.e., no help file and possible bugs) which might be of help. It is called -csrpp- and generates a given number of random points within a set of polygons specified by the user. Using your Andorra map, suppose you'd like to generate 50 random cases within each city. Here's a way to go: . use "ADMdb.dta", clear . generate cases = 50 . csrpp cases using "ADMcoord.dta", id(_ID) saving("casecoord.dta") replace The Stata dataset "casecoord.dta" contains the coordinates of the 50x7 random points: . spmap using "ADMcoord.dta", id(_ID) point(data("casecoord.dta") x(_X) y(_Y)) Best wishes, Maurizio ===== CODE STARTS HERE ============================================================= *! -csrpp-: Generate completely spatially random points in a polygon *! Version 1.0 - 19 August 2006 *! Author: Maurizio Pisati *! Department of Sociology and Social Research *! University of Milano Bicocca (Italy) *! maurizio.pisati@unimib.it * ---------------------------------------------------------------------------- * 1. Define program * ---------------------------------------------------------------------------- program csrpp version 9.2 * ---------------------------------------------------------------------------- * 2. Define syntax * ---------------------------------------------------------------------------- syntax varname(numeric) [if] [in] [iweight/] using/, /// ID(varname numeric) /// [Scale(numlist max=1 >0)] /// SAVing(string) /// [REPLACE] /// [noDOTS] * ---------------------------------------------------------------------------- * 3. Check syntax * ---------------------------------------------------------------------------- /* Check weight */ if ("`weight'" != "") { tempvar W qui gen `W' = `exp' qui su `W' if r(min) < 0 { di as err "{p}The weighting variable can take only nonnegative " /// "values{p_end}" exit 198 } } else local exp = 1 /* Check option id() */ capture isid `id' if _rc { di as err "{p}Variable `id' specified in option {bf:{ul:id}()} " /// "does not uniquely identify the polygons{p_end}" exit 459 } /* Check option saving() */ if (substr(reverse("`saving'"),1,4) != "atd.") { local region "`saving'.dta" } /* Marksample */ marksample TOUSE markout `TOUSE' `id' qui count if `TOUSE' if (r(N) == 0) error 2000 * ---------------------------------------------------------------------------- * 4. Define basic objects * ---------------------------------------------------------------------------- /* Preserve data */ preserve /* Select relevant observations */ qui keep if `TOUSE' /* Set default scale factor */ if ("`scale'" == "") local scale = 1 /* Set option dots */ if ("`dots'" != "") local dots "no" else local dots "yes" * ---------------------------------------------------------------------------- * 5. Data setup * ---------------------------------------------------------------------------- /* Generate id variable */ cap rename `id' _ID sort _ID /* Generate variables stating number of points to be generated */ unab varlist : `varlist' cap rename `varlist' NP qui replace NP = round(NP * `scale') /* Generate weight variable */ qui gen W = `exp' * (1 / `scale') /* Save temporary file */ keep _ID NP W tempfile DATA qui save `DATA', replace * ---------------------------------------------------------------------------- * 6. Polygon setup * ---------------------------------------------------------------------------- /* Open file */ use "`using'", clear /* Generate subpolygon id */ quietly { gen POLY = (_X == .) replace POLY = sum(POLY) drop if _X == . sort POLY, stable tempfile POLY save `POLY', replace } /* Compute polygon area */ quietly { by POLY : gen double AREA = (_X*_Y[_n+1]) - (_X[_n+1]*_Y) if _n>1 & _n<_N by POLY : replace AREA = sum(AREA) by POLY : replace AREA = abs(AREA[_N] / 2) } /* Compute subpolygon weigths */ collapse AREA (count) NV = AREA, by(_ID POLY) quietly { by _ID : gen double CUM = sum(AREA) by _ID : replace AREA = AREA / CUM[_N] drop CUM } * ---------------------------------------------------------------------------- * 7. Prepare working dataset * ---------------------------------------------------------------------------- /* Merge files and adjust NP variable */ quietly { merge _ID using `DATA' keep if _merge == 3 replace NP = round(AREA * NP) gen ID = _n } /* Select relevant variables and save dataset */ keep POLY ID NP NV W order POLY ID NP NV W sort POLY qui save `DATA', replace /* Select relevant records from polygon file */ quietly { use `POLY', clear merge POLY using `DATA' keep if _merge == 3 keep ID _X _Y ren ID _ID order _ID sort _ID, stable save `POLY', replace } /* Open data */ use `DATA', clear keep ID NP NV W qui gen END = sum(NV) qui gen START = END[_n-1] + 1 qui replace START = 1 in 1 drop NV order ID NP START END W /* Count polygons */ qui count local NPOLY = r(N) /* Count points to be generated */ su NP, mean local NPOINTS = r(sum) /* Merge polygon file */ qui merge using `POLY' drop _merge ID _ID /* Set number of records */ qui count if (`NPOINTS' > r(N)) qui set obs `NPOINTS' /* Generate variables */ qui gen X = . qui gen Y = . qui gen _W = . * ---------------------------------------------------------------------------- * 8. Generate points and save dataset * ---------------------------------------------------------------------------- /* Generate points */ mata: csrpp(`NPOLY', "`dots'") /* Create and select relevant variables */ keep X Y _W ren X _X ren Y _Y lab var _X "x-coordinate" lab var _Y "y-coordinate" lab var _W "Weight" /* Save dataset */ qui drop if _X == . qui compress save "`saving'", `replace' * ---------------------------------------------------------------------------- * 9. End program * ---------------------------------------------------------------------------- restore end /* ----------------------------------------------------------------------- */ /* Call mata */ /* ----------------------------------------------------------------------- */ version 9.2 mata: mata clear mata set matastrict on /* ----------------------------------------------------------------------- */ /* csrpp() */ /* ----------------------------------------------------------------------- */ void csrpp(real scalar npoly, string scalar dots) { real matrix POLY, COOR real scalar i, line real scalar np, we, fr, lr, fi, li if (dots == "yes") { printf("\n") printf("{txt}Scanning polygons ({res}%g{txt})\n", npoly) printf("{txt}{hline 4}{c +}{hline 3} 1 {hline 3}{c +}{hline 3} 2 ") printf("{txt}{hline 3}{c +}{hline 3} 3 {hline 3}{c +}{hline 3} 4 ") printf("{txt}{hline 3}{c +}{hline 3} 5\n") displayflush() } fi = 1 li = 0 for (i=1; i<=npoly; i++) { if (dots == "yes") { line = mod(i,50) if (line != 0 & i < npoly) { printf("{txt}.") displayflush() } if (line == 0 & i < npoly) { printf("{txt}. %5.0f\n", i) displayflush() } if (i == npoly) { printf("{txt}.\n") printf("\n") displayflush() } } np = _st_data(i, 1) if (np > 0) { we = _st_data(i, 4) li = li + np fr = _st_data(i, 2) lr = _st_data(i, 3) POLY = st_data((fr,lr), (5..6)) COOR = sp_csrpp(np, POLY) st_store((fi,li), (7..8), COOR) st_store((fi,li), 9, J(np, 1, we)) fi = li + 1 } } } /* ----------------------------------------------------------------------- */ /* sp_csrpp() */ /* -> sp_pip() */ /* -> sp_polyarea() */ /* */ /* Returns matrix npoints-by-2 containing the coordinates of npoints */ /* completely spatially random points lying within polygon POLY */ /* ----------------------------------------------------------------------- */ real matrix sp_csrpp(real scalar npoints, real matrix POLY) { /* Setup */ real matrix COOR real matrix XC, YC, XY, PIP, DONE real scalar xmin, xmax, ymin, ymax, areap, areab, aratio real scalar ndone, nmiss, ngen /* Generate working objects */ xmin = min(POLY[., 1]) xmax = max(POLY[., 1]) ymin = min(POLY[., 2]) ymax = max(POLY[., 2]) areap = sp_polyarea(POLY) areab = abs(xmax - xmin) * abs(ymax - ymin) aratio = areab / areap /* Generate points */ ndone = 0 while (ndone < npoints) { nmiss = npoints - ndone ngen = round(nmiss * aratio) if (ngen == 1) ngen = 10 XC = xmin :+ uniform(ngen, 1) :* (xmax - xmin) YC = ymin :+ uniform(ngen, 1) :* (ymax - ymin) XY = XC , YC PIP = sp_pip(XY, POLY) DONE = select(XY, PIP) if (ndone == 0) COOR = DONE else COOR = COOR \ DONE ndone = rows(COOR) } /* Return results */ return(COOR[(1..npoints), .]) } /* ----------------------------------------------------------------------- */ /* sp_pip() */ /* -> sp_spip() */ /* */ /* Returns matrix R-by-1 indicating whether each point in R-by-2 matrix */ /* POINTS lies within (value 1) or without (value 0) polygon POLY */ /* ----------------------------------------------------------------------- */ real matrix sp_pip(real matrix POINTS, real matrix POLY) { /* Setup */ real matrix PIP real scalar np real scalar i, x, y /* Generate working objects */ np = rows(POINTS) PIP = J(np, 1, 0) /* Check and report points status */ for (i=1; i<=np; i++) { x = POINTS[i, 1] y = POINTS[i, 2] PIP[i] = sp_spip(x, y, POLY) } /* Return results */ return(PIP) } /* ----------------------------------------------------------------------- */ /* sp_polyarea() */ /* */ /* Returns scalar containing the area of polygon POLY */ /* ----------------------------------------------------------------------- */ real scalar sp_polyarea(real matrix POLY) { /* Setup */ real scalar area real scalar nv, i, x1, y1, x2, y2 /* Generate working objects */ area = 0 nv = rows(POLY) if (POLY[1,1] == POLY[nv,1] & POLY[1,2] == POLY[nv,2]) nv = nv - 1 /* Compute polygon area */ for (i=1; i<=nv; i++) { x1 = POLY[i, 1] y1 = POLY[i, 2] if (i == nv) { x2 = POLY[1, 1] y2 = POLY[1, 2] } else { x2 = POLY[i+1, 1] y2 = POLY[i+1, 2] } area = area + ((x2 - x1) * (y2 + y1)) } /* Return results */ return(abs(area) * 0.5) } /* ----------------------------------------------------------------------- */ /* sp_spip() */ /* */ /* Returns scalar indicating whether single point (x,y) lies within */ /* (value 1) or without (value 0) polygon POLY */ /* ----------------------------------------------------------------------- */ real scalar sp_spip(real scalar x, real scalar y, real matrix POLY) { /* Setup */ real scalar pip real scalar nv, iwind, xlastp, ylastp, ioldq, inewq real scalar xthisp, ythisp real scalar i, a, b /* Generate working objects */ nv = rows(POLY) if (POLY[1,1] == POLY[nv,1] & POLY[1,2] == POLY[nv,2]) nv = nv - 1 iwind = 0 xlastp = POLY[nv, 1] ylastp = POLY[nv, 2] ioldq = sp_spip_aux(xlastp, ylastp, x, y) /* Check and report point status */ for (i=1; i<=nv; i++) { xthisp = POLY[i, 1] ythisp = POLY[i, 2] inewq = sp_spip_aux(xthisp, ythisp, x, y) if (ioldq != inewq) { if (mod(ioldq+1, 4) == inewq) { iwind = iwind + 1 } else if (mod(inewq+1, 4) == ioldq) { iwind = iwind - 1 } else { a = (ylastp - ythisp) * (x - xlastp) b = xlastp - xthisp a = a + ylastp * b b = b * y if (a > b) { iwind = iwind + 2 } else { iwind = iwind - 2 } } } xlastp = xthisp ylastp = ythisp ioldq = inewq } pip = abs(iwind / 4) /* Returns results */ return(pip) } /* Auxiliary function */ real scalar sp_spip_aux(real scalar xp, real scalar yp, real scalar xo, real scalar yo) { real scalar iq if(xp < xo) { if(yp < yo) iq = 2 else iq = 1 } else { if(yp < yo) iq = 3 else iq = 0 } return(iq) } /* ----------------------------------------------------------------------- */ /* Exit Mata */ /* ----------------------------------------------------------------------- */ end ===== CODE ENDS HERE =============================================================== > Hello Sergiy, > > Yes, I would certainly prefer to use Stata; however there appear to be > quite a few options in R, eg -spatstat- and -rgdal- among others. > > I am stymied on how to a) form a grid and b) assign properly > randomized points as you suggest. My sample frame is city data (at the > ADM3+ (commune) level). > > Some typos in previous example. > > ******** Revised example starts here ******** > > * cd "User/Data/" > > * open-source shapefile for Andorra(AND) > > copy "http://gadm.org/data/shp/AND_adm.zip"; AND_adm.zip, replace > unzipfile AND_adm.zip, replace > > shp2dta using "AND_adm1", database(ADMdb) coordinates(ADMcoord) replace > > use ADMdb, clear > > gen cases = ceil(uniform() * 25) > desc > > * use ADMcoord, clear > * desc > > ******** Example ends here ******** > > thanks for the suggestions and the Ukraine example- > > cheers > Andrew Lover > ___________________________________________________________ > Epidemiologist > Centre for Infectious Disease Epidemiology Research (CIDER) > Saw Swee Hock School of Public Health > National University of Singapore > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/faqs/resources/statalist-faq/ > * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/