Benoit Dulong
> >
> > x=(x1,x2), a point in R^2.
> >
> > My dataset (simulation), has at least 200 points.
> > For each point in the dataset, I want:
> > 1- identify the nearest neighbor (nnid)
> > 2- calculate the distance to that nearest neighbor (nnd)
> > How can I create nnid and nnd ?
> >
> > list in 1/10, noobs
> >
> > id x1 x2 nnid nnd
> > 1 0.6231 0.6594 . .
> > 2 0.0770 0.8497 . .
> > 3 0.8031 0.5251 . .
> > 4 0.4283 0.2249 . .
> > 5 0.2084 0.1750 . .
> > 6 0.8936 0.9179 . .
> > 7 0.6168 0.7379 . .
> > 8 0.5663 0.2539 . .
> > 9 0.6465 0.5444 . .
> > 10 0.7783 0.0047 . .
> >
Stephen Jenkins
> How about something like the following:
>
> 1. Put x1 and x2 in separate data sets A1, A2, each with n rows,
> including in each a row-specific unique identifier
> 2. create a long format data set of n x n rows which contains all
> possible (x1,x2) pairs,
> 3. using your favourite distance metric formula, d(.), calculate
> d(x1_i,x2_i) for i = 1,...,nxn, and from this you will also
> get your nearest neighbour = obs with min d(), by i.
> [E.g use some -egen- function, with a -by- id option].
> 4. Tag the relevant nearest neighbour observations, and save them
> in a file together with row number id
> 5. Merge back on to original data set, using i as key.
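
One possible reading of the steps above, as a rough Stata sketch rather than a tested recipe. It assumes the data in memory hold id, x1 and x2, and that nnid and nnd do not exist yet (drop the empty placeholders first if they do). The names nbrs, nx1 and nx2 are made up for illustration. -cross- forms all n x n pairs, so with 200 points there are 40,000 rows to sort.

```stata
* Copy of the points, renamed so that it can serve as the "neighbour" side
preserve
keep id x1 x2
rename id nnid
rename x1 nx1
rename x2 nx2
tempfile nbrs
save `nbrs'
restore

* Every point paired with every point (n x n rows)
cross using `nbrs'

* A point is not its own neighbour
drop if id == nnid

* Euclidean distance for each pair
gen nnd = sqrt((x1 - nx1)^2 + (x2 - nx2)^2)

* Keep the pair with the smallest distance within each id
bysort id (nnd): keep if _n == 1
drop nx1 nx2
```

Because one row per id survives, this version ends with nnid and nnd next to the original variables, so the final merge step is not needed here. Ties in distance are broken arbitrarily.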
You can also do it in place, without any need for fiddling around
with files. This is a brute-force double loop over all pairs of
observations, so it would probably get a D in any computer science
course, but it should be practical enough for the sample sizes
implied. Euclid -- or perhaps Pythagoras -- is wired in. There is
one line to change if you want some other definition of distance
(plus the final sqrt(), which assumes that squared distances were
stored).
program def nearest
*! NJC 1.0.0 9 January 2003
    version 7
    syntax varlist(min=2 max=2 numeric) [if] [in] , id(string) dist(string)
    confirm new var `id'
    confirm new var `dist'
    marksample touse
    tokenize `varlist'
    args x y

    qui {
        gen `id' = .
        gen `dist' = .
        tempname d
        local n = _N
        * compare each observation i in the sample with every other
        * observation j (j is not restricted by if or in)
        forval i = 1/`n' {
            forval j = 1/`n' {
                if `touse'[`i'] & (`i' != `j') {
                    * squared distance, and missing counts as larger
                    * than any number, so the first d always wins
                    scalar `d' = /*
                    */ (`x'[`i'] - `x'[`j'])^2 + (`y'[`i'] - `y'[`j'])^2
                    if `d' < `dist'[`i'] {
                        replace `dist' = `d' in `i'
                        replace `id' = `j' in `i'
                    }
                }
            }
        }
        * convert squared distances to distances
        replace `dist' = sqrt(`dist')
    }
end
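
A minimal usage sketch, assuming the program above has already been defined in the current session. The simulated data and the seed are invented for illustration; uniform() is the random-number function of that era, and newer Stata releases prefer runiform().

```stata
* Simulated data: 200 points in the unit square
clear
set obs 200
set seed 2003
gen id = _n
gen x1 = uniform()
gen x2 = uniform()

* nnid and nnd must not exist yet, because the program creates them
nearest x1 x2, id(nnid) dist(nnd)
list in 1/10, noobs
```

Note that the identifier returned is the observation number of the nearest neighbour, not the value of any id variable, so the two agree only while id equals _n and the sort order is unchanged. For a city-block distance, the squared-differences line would become a sum of abs() terms and the final sqrt() line would be dropped.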
Nick
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/