On Thu, 09 Jan 2003 01:54:22 -0500 Benoit Dulong <[email protected]>
wrote:
>
> x=(x1,x2), a point in R^2.
>
> My dataset (simulation), has at least 200 points.
> For each point in the dataset, I want:
> 1- identify the nearest neighbor (nnid)
> 2- calculate the distance to that nearest neighbor (nnd)
> How can I create nnid and nnd ?
>
> list in 1/10, noobs
>
> id x1 x2 nnid nnd
> 1 0.6231 0.6594 . .
> 2 0.0770 0.8497 . .
> 3 0.8031 0.5251 . .
> 4 0.4283 0.2249 . .
> 5 0.2084 0.1750 . .
> 6 0.8936 0.9179 . .
> 7 0.6168 0.7379 . .
> 8 0.5663 0.2539 . .
> 9 0.6465 0.5444 . .
> 10 0.7783 0.0047 . .
>
How about something like the following:
1. Make two copies of the data set, A1 and A2, each with n rows and
each including a unique row identifier. Rename the variables in A2 so
that they do not clash with those in A1.
2. Create a long-format data set of n x n rows that contains all
possible pairs of points, one from A1 and one from A2, and drop the n
rows that pair a point with itself.
3. Using your favourite distance metric, d(.), calculate the distance
for each remaining pair. The nearest neighbour of point i is the
point with the minimum d() among the rows for i.
[E.g. use some -egen- function, with a -by- id option.]
4. Tag the nearest-neighbour observations, and save them in a
file together with the row identifier.
5. Merge back on to the original data set, using the row identifier
as key.
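
The steps above might be sketched in Stata roughly as follows. This is
only a sketch under some assumptions: the data in memory hold one row per
point with variables id, x1 and x2 (id unique), the metric is Euclidean
distance, and ties are broken in favour of the lower id. The names nx1,
nx2 and the tempfile name copy are made up for illustration. It uses
-cross- to form the pairs and a sorted -by- instead of -egen-, so
everything stays in one data set and the save-and-merge of steps 4-5 is
not needed.

```stata
* Remove the empty placeholders, if present, so the names are free
capture drop nnid
capture drop nnd

* Step 1: save a renamed copy of the points (the "neighbour" side)
preserve
keep id x1 x2
rename id nnid
rename x1 nx1
rename x2 nx2
tempfile copy
save `copy'
restore

* Step 2: form all n x n pairs, then drop each point paired with itself
cross using `copy'
drop if id == nnid

* Step 3: Euclidean distance between the two points of each pair
gen double nnd = sqrt((x1 - nx1)^2 + (x2 - nx2)^2)

* Steps 4-5: within each id keep the row with the smallest distance
* (ties go to the lower nnid), leaving one row per original point
bysort id (nnd nnid): keep if _n == 1
drop nx1 nx2
order id x1 x2 nnid nnd

list in 1/10, noobs
```

With n = 200 the crossed data set has 40,000 rows, which is small; for
much larger n the n x n expansion may become the limiting factor.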
Stephen
----------------------
Professor Stephen P. Jenkins <[email protected]>
Institute for Social and Economic Research (ISER)
University of Essex, Colchester, CO4 3SQ, UK
Tel: +44 (0)1206 873374. Fax: +44 (0)1206 873151.
http://www.iser.essex.ac.uk