Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Calculating the shortest distances between observations (based on longitude and latitude)
From: Robert Picard <[email protected]>
To: [email protected]
Subject: Re: st: Calculating the shortest distances between observations (based on longitude and latitude)
Date: Thu, 2 Feb 2012 11:21:48 -0500

Take a look at -geonear- and -geodist-, both available from SSC. If
you have only two observation types, then the simplest approach is to
form all pairwise combinations of locations and then calculate the
distances.
*----------- begin example -------------
version 12
clear
input otype str10 country year lat lon
1 Albania 2010 42.07972 19.52361
1 Albania 2010 42.15028 19.66389
1 Albania 2010 42.01667 19.48333
2 Albania 2010 39.95 20.28333
2 Albania 2010 42.08417 20.42
end
* save type 1 and type 2 observations separately
tempfile main type2
save "`main'"
keep if otype == 2
rename * *2
gen id2 = _n
save "`type2'"
use "`main'"
keep if otype == 1
gen id1 = _n
* form all pairwise combinations and calculate distance
cross using "`type2'"
geodist lat lon lat2 lon2, gen(d)
sort id1 d
*------------ end example --------------
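
To get the second, third, and further nearest neighbors rather than only the
closest one, the sorted pairs can be ranked within each type 1 location. The
following is a rough sketch, not a tested solution for your full dataset. It
assumes the crossed data from the example above is still in memory, and the
names rank, dist and nbr are ones I made up for illustration. It also
restricts pairs to the same country and year, as your question asks.

*----------- begin example -------------
* continue with the crossed data left in memory by the example above
* keep only pairs from the same country and year
keep if country == country2 & year == year2

* rank the type 2 locations by distance within each type 1 location
sort id1 d
by id1: gen rank = _n

* optional: one observation per type 1 location, where dist1/nbr1 hold
* the nearest type 2 location, dist2/nbr2 the second nearest, and so on
keep id1 country year lat lon rank d id2
rename (d id2) (dist nbr)
reshape wide dist nbr, i(id1) j(rank)
*------------ end example --------------

Note that tied distances get an arbitrary order within id1, so if exact ties
matter you would want to add a tie-breaking variable to the sort.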
2012/2/2 Rüdiger Vollmeier <[email protected]>:
> Hello guys,
>
> I want to calculate the shortest distances between observations based
> on the coordinates (latitude, longitude). I have adapted a simple
> version from N. Cox's nearest neighbor search which was presented here
> some time ago. In contrast to that, I want to calculate not only the
> shortest but also the second shortest (third, and so on) distances.
>
> Here is a simplified structure of the dataset:
>
> observation_type country year latitude longitude
> 1 Albania 2010 42.07972 19.52361
> 1 Albania 2010 42.15028 19.66389
> 1 Albania 2010 42.01667 19.48333
> 2 Albania 2010 39.95 20.28333
> 2 Albania 2010 42.08417 20.42
>
> I want to calculate the smallest distances for a given observation of
> observation_type=1 to an observation of type=2 for a given year in a
> given country. Here is the code (all variables are generated of the
> form gen bank_1_dist_1 =.)
>
> * Shortest distance
> local n = _N
> forval i = 1/`n' {
> forval j = 1/`n' {
> if (`i' != `j') & (observation_type[`i']==1) &
> (observation_type[`j']==2) &
> (country_number[`i']==country_number[`j']) & (year[`i']==year[`j']) {
> local d = (latitude[`i'] - latitude[`j'])^2 + (longitude[`i'] -
> longitude[`j'])^2
> replace bank_2010_1_`j'=`d' in `i'
> if `d' < bank_1_dist_1[`i'] {
> replace bank_1_dist_1 = `d' in `i'
> replace bank_1_id_1 = `j' in `i'
> }
> }
> }
> }
> * Second shortest distance
> local n = _N
> forval i = 1/`n' {
> forval j = 1/`n' {
> if (`i' != `j') &(observation_type[`i']==1)
> &(observation_type[`j']==2)
> &(country_number[`i']==country_number[`j']) &(year[`i']==year[`j']) {
> local d2 = (latitude[`i'] - latitude[`j'])^2 + (longitude[`i'] -
> longitude[`j'])^2
> if (`d2' > bank_1_dist_1[`i']) & (`d2' < bank_1_dist_2[`i']) {
> replace bank_1_dist_2 = `d2' in `i'
> replace bank_1_id_2 = `j' in `i'
> }
> }
>
> }
> }
>
> Here is the problem: The shortest distance seems to be well
> calculated. However, the second smallest distance is not calculated
> correctly (sometimes it takes on the same value as the shortest
> distance and only sometimes it is the actual shortest distance). Do
> you know why? Do you have any suggestions for improvement?
>
> Thanks in advance.
> Ruediger
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/