Something like this should work:
* First, I'll create a dummy dataset
clear
cd D:\temporal
set mem 100m
set more off
set obs 5000
gen pat_id = _n
gen pat_x = uniform()*100
gen pat_y = uniform()*100
save patients, replace
clear
set obs 1000
gen hos_id = _n
gen hos_x = uniform()*100
gen hos_y = uniform()*100
save hospitals, replace
clear
****************************
Note the variable names used here: hos_x and hos_y hold the
coordinates of the hospitals, and pat_x and pat_y hold those of the
patients. If your coordinates have the same names in both files
(i.e., you only have two variables called x and y), rename them in one
file or modify the do-file below. The script also relies on the
identifiers pat_id and hos_id, so adapt those to your data as well.
***************************
* This is the script:
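use patients
* Row where the appended hospitals will start
local N1 = 1+c(N)
tempvar temp
* Store distances as double: c(maxdouble) does not fit in the default float
gen double dist=c(maxdouble)
gen double `temp'=.
gen closer_hosp=0
append using hospitals
local N2 = c(N)
* Loop over hospitals, keeping the running minimum distance per patient
forval i=`N1'/`N2' {
qui replace `temp' = sqrt( (pat_x-hos_x[`i'])^2 + (pat_y-hos_y[`i'])^2 )
qui replace closer_hosp = cond(`temp'<dist,hos_id[`i'],closer_hosp)
qui replace dist = min(`temp',dist)
}
* Drop the appended hospital rows
keep in 1/`=`N1'-1'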
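The original question also asked for separate distances to the nearest
VA and non-VA hospital. A minimal sketch of how the same loop could be
extended follows. It assumes the hospital file has an indicator that I
call hos_va here (1=VA, 0=non-VA); that name, and the dummy values
generated for it, are my own invention for illustration. The variables
dist_va and dist_nonva are likewise hypothetical names.

```stata
* Hypothetical VA flag added to the dummy hospital file (1=VA, 0=non-VA)
use hospitals, clear
capture drop hos_va
gen byte hos_va = uniform() < 0.2
save hospitals, replace

use patients, clear
local N1 = 1 + c(N)
tempvar temp
gen double `temp' = .
gen double dist_va = .
gen double dist_nonva = .
append using hospitals
local N2 = c(N)
forval i = `N1'/`N2' {
    qui replace `temp' = sqrt( (pat_x-hos_x[`i'])^2 + (pat_y-hos_y[`i'])^2 )
    * min() ignores missing arguments, so the first hospital of each
    * type initializes the running minimum
    if hos_va[`i'] == 1 {
        qui replace dist_va = min(`temp', dist_va)
    }
    else {
        qui replace dist_nonva = min(`temp', dist_nonva)
    }
}
* Drop the appended hospital rows and the now-empty hospital variables
keep in 1/`=`N1'-1'
drop hos_*
```

Each patient then ends up with two distance measures. If one type of
hospital is absent from the file, the corresponding distance simply
stays missing.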
********************
If you want to check that the program works, run it on a very small
dataset (say, 1 patient and 5 hospitals) and then check by hand or
with the following graph. Run the graph before the final -keep- line,
since that line drops the hospital rows:
tw (sc pat_y pat_x) (sc hos_y hos_x, mlabel(hos_id))
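Another way to check the result on a small sample is brute force: form
every patient-hospital pair and keep the closest one. The sketch below
assumes the patients and hospitals files created above and uses -cross-
for the pairing; the variable d is a name I made up for this check. It
is only practical for small samples, because the paired dataset has
(patients x hospitals) rows.

```stata
* Brute-force check on the first 5 patients
use patients, clear
keep in 1/5
* All pairwise combinations of these patients with every hospital
cross using hospitals
gen double d = sqrt( (pat_x-hos_x)^2 + (pat_y-hos_y)^2 )
* Keep, for each patient, the pair with the smallest distance
bysort pat_id (d): keep if _n == 1
list pat_id hos_id d
```

The hos_id and d listed here should match closer_hosp and dist from the
script for the same patients.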
AFAIK, the program runs fine on the dummy data above. I'm sure it
could be improved (for instance, by writing an ado-file that does not
depend on the specific variable names I've set), but time is short.
Cheers,
Sergio
On 1/3/07, Richardson, Kelly K. <[email protected]> wrote:
Dear Statalisters,
I would like to calculate the distance between a patient's home and the
nearest hospital to their home. I have the x,y coordinates for the
patient's home and the x,y coordinates for every hospital in the
country. How do I get the distance to the nearest hospital? I looked at
the globdist command but I can't see how to enter coordinate variables
for both locations. For example, say I have 1000 patients across the
country and 5000 hospitals. How do I get one distance for each patient
that represents the distance to the nearest hospital? I'm not clear on
how my data should be constructed or which command to use. Right now the
data are in separate files, one for patients (n=1000) and one for
hospitals (n=5000). Each file has an x-coordinate and a y-coordinate
identifying the location of either the patient's home or the hospital
address. There is no unique identifier linking the patients to any of
the hospitals. In fact, I have no way of knowing where any patient
actually went. I just need to know where the closest hospital is to
where they live. I am also interested in identifying the nearest VA
Hospital versus the nearest Non-VA hospital. I have an indicator in the
hospital data file (1=VA 0=non-VA). In the end I hope to have 2 distance
measures for each individual. I'm not sure Stata is appropriate for this
task. Any suggestions would be greatly appreciated.
Thank you,
Kelly
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/