Sarah --
Note that the hospitals-and-patients problem I referenced, though it
illustrates the idea of looping over obs and calculating distance, is
not exactly analogous: it involved two datasets, for one. In any case,
what I should have referenced is
http://www.stata.com/statalist/archive/2007-01/msg00098.html
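
To make the looping idea concrete, here is a minimal sketch in Python
rather than Stata, under assumptions of my own: the names haversine_km
and knn_mean are hypothetical, coordinates are in decimal degrees, and
distance uses a spherical Earth with mean radius 6371 km. For each obs
it computes the distance to every other obs, sorts, and averages a score
over the k nearest. It is brute force, so with 100,000 obs it would be
slow; treat it as an illustration of the logic, not a recommendation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding pushing a slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def knn_mean(lats, lons, scores, k):
    """For each obs i, mean score of its k nearest other obs (O(N^2))."""
    n = len(scores)
    means = []
    for i in range(n):
        # Distance from obs i to every not-i obs, paired with the obs index.
        dist = [(haversine_km(lats[i], lons[i], lats[j], lons[j]), j)
                for j in range(n) if j != i]
        dist.sort()            # nearest first
        nearest = dist[:k]     # fewer than k if the data set is small
        if nearest:
            means.append(sum(scores[j] for _, j in nearest) / len(nearest))
        else:
            means.append(float("nan"))  # a lone obs has no neighbors
    return means
```

Calling knn_mean(lat, lon, score, 100) would give the mean score of the
100 nearest neighbors without storing 100 neighbor IDs per obs.
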
Also, it occurs to me: why the 100 nearest? Why not use all obs,
weighting each by the reciprocal of its squared distance, or some such?
For a relevant discussion, see Appendix A of
http://www.nber.org/papers/w13246
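
A rough sketch of that weighting, again in Python rather than Stata and
with hypothetical names (great_circle_km, idw_mean): each obs gets a
weighted mean of the scores of all other obs, with weight 1/d^2. I have
assumed a spherical Earth of radius 6371 km, and that obs at exactly the
same location as i are skipped, since their weight would be infinite;
other treatments of coincident points are just as defensible. This is
not necessarily the weighting scheme used in the paper above.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance in km; inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


def idw_mean(lats, lons, scores, power=2.0):
    """For each obs i, inverse-distance-weighted mean score of all other obs."""
    n = len(scores)
    result = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(n):
            if j == i:
                continue
            d = great_circle_km(lats[i], lons[i], lats[j], lons[j])
            if d == 0.0:
                continue  # assumption: skip coincident obs (infinite weight)
            w = 1.0 / d ** power
            num += w * scores[j]
            den += w
        result.append(num / den if den > 0.0 else float("nan"))
    return result
```

With power=2 this is the reciprocal of squared distance; a larger power
concentrates the weight on the closest obs.
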
On 10/10/07, Austin Nichols wrote:
> Sarah--
> To identify the nearest 100 obs, you will need 100 new variables
> holding the ID for each of those neighbors; then calculating the
> additional variables will also be nontrivial. Far better to calculate
> whatever you need in a single loop over all observations. See
> http://www.stata.com/statalist/archive/2007-01/msg00079.html
> for more detail.
>
> The key is to calculate, for each i, the distance to all _N-1 not-i
> obs, then sort by distance, and then calculate summary stats on the
> first 100 obs with an -in 1/100- qualifier. Also you might want to
> calculate distance using a spherical approximation to the Earth's
> surface (but see -findit vincenty- for an ellipsoidal approximation).
>
> On 10/10/07, Sarah Cohodes wrote:
> > Dear Statalisters:
> >
> > I have the longitude and latitude of each of my observations. I'd
> > like to identify the 100 nearest neighbors of each observation, so I
> > can ultimately calculate some variables based on those nearest
> > neighbors, for example the average test score of the 100 nearest
> > neighbors. I've identified a strategy to do this, but I'm stuck
> > along the way. However, if someone has another suggestion on how to
> > approach the problem, I'd really appreciate it, especially if it is
> > less computationally intensive, as I have over 100,000 observations.
> >