David Torres<[email protected]> :
It's still not clear to me what you hope to achieve. I also don't see
any added value in Sergio's pointing you to the SSC program -vincenty-
(though that program may help you later if you calculate a weighted
average over all schools in the country, where distances involved may
be both great and small). The example code you quoted at the outset
gives the general method you need to adapt to your data, except that
you need to calculate centroids of some existing polygons first, for
which code follows. I am assuming that you are working with polygons
small enough that you can assume that distances can be measured as if
they lay in a plane; otherwise centroids will be difficult to
calculate. Also note centroids need not lie inside a polygon, if
polygons may be nonconvex, which may undercut your intuition about
using centroids to measure distances between points and sets. See
also appendices A and B in http://www.nber.org/papers/w13246 for
discussion of distance-weighted measures. Here is a program for
calculating centroids, and a couple of examples:
clear all
prog calcentr, sortpreserve
version 8.2
syntax varlist [,by(varlist) Stub(string) Force]
gettoken X Y: varlist
if "`stub'"=="" loc stub "c"
if "`by'"=="" {
tempvar by
g byte `by'=1
}
tempvar t x y xm ym sx sy sa a b
qui {
g long `t'=_n
su `X', meanonly
loc xo=r(min)
g double `x'=`X'-`xo'
su `Y', meanonly
loc yo=r(min)
g double `y'=`Y'-`yo'
egen long `b'=group(`by')
sort `by' `t'
if "`force'"=="" by `by': assert `x'[1]==`x'[_N]
if "`force'"=="" by `by': assert `y'[1]==`y'[_N]
by `by' : g double `ym'=(`y'+`y'[_n+1])*(`x'*`y'[_n+1]-`x'[_n+1]*`y')
by `by' : g double `xm'=(`x'+`x'[_n+1])*(`x'*`y'[_n+1]-`x'[_n+1]*`y')
by `by' : g double `a'=(`x'*`y'[_n+1]-`x'[_n+1]*`y')/2
egen `sa'=sum(`a'), by(`b')
egen `sy'=sum(`ym'), by(`b')
replace `sy'=`sy'/6/`sa'+(`yo')
egen `sx'=sum(`xm'), by(`b')
replace `sx'=`sx'/6/`sa'+(`xo')
}
if "`stub'"!="" {
ren `sx' `stub'x
ren `sy' `stub'y
}
end
input x y
.9 0
0 1
.9 2
1 2
.1 1
1 0
.9 0
end
calcentr x y, f
line y x || sc cy cx, aspect(2) name(outside)
use http://pped.org/stata/usa.dta, clear
calcentr _X _Y, by(_ID) stub(b) f
line _Y _X if _ID==10, aspect(1) || scatter by bx if _ID==10
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/