Dear listers:
I have a data set with the following structure:
id   d1   d2   d3  .....  d2500  min_dis
1     0   23   21         530    21
2    23    0
3
4
5
...
(up to 2500)
That is, there are 2500 observations, and each one represents one station (id).
dX = the distance to station X, X = 1, ..., 2500
(since there are 2500 observations, I have 2500 distance variables)
min_dis = the distance to the nearest station.
So, for each observation (station), I know the distance to its nearest
station.
Now I want to know the id of that nearest station.
That is, I want another variable (say, near_id) that holds, for each
observation, the id number of its nearest station.
For example (using the above data):
id   d1   d2   d3  .....  d2500  min_dis  ==>  near_id
1     0   23   21         530    21       ==>  3
2    23    0   32          41    23       ==>  1
3    21   32    0          52    21       ==>  1
4
5
...
For this purpose, I use the following code.
Basically, I am doing this observation by observation:
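* Start with near_id missing, then fill it in one station pair at a time
gen near_id = .
forvalues i = 1(1)2500 {
    forvalues j = 1(1)2500 {
        * For station i, record j when the distance to station j is the minimum
        replace near_id = `j' if id == `i' & d`j' == min_dis
    }
}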
Therefore, there are 2500 x 2500 = 6,250,000 replace commands in total.
If each pass of the outer loop (one station) takes about 2 seconds, I need
about 5000 seconds to finish the whole process, which is about 1.4 hours.
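For anyone who wants to reproduce this on a small scale, here is a minimal sketch with a hypothetical 4-station data set. The values and the name near_id2 are made up for illustration and are not taken from the real data. It runs the same double loop and then, as an assumption to be checked rather than a confirmed fix, compares it with a single loop over j that drops the id condition.

```stata
* Hypothetical 4-station toy data (invented values, symmetric distances)
clear
input id d1 d2 d3 d4 min_dis
1  0 23 21 53 21
2 23  0 32 41 23
3 21 32  0 52 21
4 53 41 52  0 41
end

* Same double loop as in the post, scaled down to 4 stations
gen near_id = .
forvalues i = 1/4 {
    forvalues j = 1/4 {
        replace near_id = `j' if id == `i' & d`j' == min_dis
    }
}

* Single loop: each replace already acts on every row where d`j' is the minimum
gen near_id2 = .
forvalues j = 1/4 {
    replace near_id2 = `j' if d`j' == min_dis
}

* Stops with an error if the two versions disagree on any row
assert near_id == near_id2
list id min_dis near_id near_id2, noobs
```

On this toy data both versions should agree, since the condition d`j' == min_dis is evaluated row by row either way. If two stations are tied for the minimum, both versions keep the larger j.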
Is there any efficient way to do that?
Many thanks.
JT