Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Robert Picard <picard@netbox.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: RE: Calculate Distance between Properties within Portfolios |
Date | Wed, 11 Sep 2013 01:29:12 -0400 |
As Joe said, -joinby- is the tool to go if you are going to do this using a brute force approach. You can also get the big guns and use -geonear- (from SSC). You will have to do each portfolio separately but it's still going to be faster than the brute force approach. * --------------- begin example --------------------------- set seed 1234 clear set obs 20 gen porfolio_id = 1000 + _n egen pfolio = group(porfolio_id) expand runiform() * 360 + 2 sort pfolio by pfolio: gen propid = _n sort pfolio propid gen double lat = runiform() gen double lon = runiform() tempfile main save "`main'" * brute force approach rename (propid lat lon) (propid0 lat0 lon0) joinby pfolio using "`main'" drop if propid == propid0 isid pfolio propid propid0, sort geodist lat lon lat0 lon0, gen(km_brute) sphere sort pfolio propid km_brute propid0 by pfolio propid: keep if _n == 1 tempfile brute save "`brute'" * using -geonear- from SSC use "`main'", clear sum pfolio, meanonly local npid = r(max) tempfile nbors qui forvalues i = 1/`npid' { use if pfolio == `i' using "`main'", clear save "`nbors'", replace geonear propid lat lon using "`nbors'", /// n(propid lat lon) ignore tempfile res`i' save "`res`i''" } clear forvalues i = 1/`npid' { append using "`res`i''" } merge 1:1 pfolio propid using "`brute'", nogen assert nid == propid0 assert abs(km_brute - km_to_nid) < 1e-12 * --------------- end example ----------------------------- On Tue, Sep 10, 2013 at 9:49 PM, Joe Canner <jcanner1@jhmi.edu> wrote: > McKay, > > Take a look at -joinby-. You will probably have to create a duplicate copy of your dataset and rename the property_id, lat, and lon variables in the duplicated data set. Then do: > > . use original.dta > . joinby portfolio_id using duplicate.dta > . geodist lat lon duplat duplon > > (Warning: this will create about 6.5 million records.) > > Regards, > Joe > ________________________________________ > From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of S. McKay Price [smp210@lehigh.edu] > Sent: Tuesday, September 10, 2013 6:28 PM > To: statalist@hsphsun2.harvard.edu > Subject: st: Calculate Distance between Properties within Portfolios > > Hello, > > I'm trying to calculate the distance, in miles or kilometers, between > all possible pairwise combinations of properties within a given > portfolio. Is there an efficient way to structure the data to > accomplish this? > > My data include numerous portfolios (roughly 200), each with a unique > portfolio identifier (portfolio_id). And, there are multiple properties > within each portfolio (180 on average), where each property has a unique > property identifier (property_id). I have latitude and longitude > coordinates in decimal form for each property (e.g. 42.270873 > -83.726329) for use in a command such as -geodist- from SSC, or > something similar. The data are organized as follows: > > portfolio_id property_id latitude longitude > 1 1 lat lon > 1 2 lat lon > 1 3 lat lon > ... > 2 1 lat lon > 2 2 lat lon > 2 3 lat lon > etc... > > Any suggestions? Thank you for your consideration. > > McKay > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/faqs/resources/statalist-faq/ > * http://www.ats.ucla.edu/stat/stata/ > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/faqs/resources/statalist-faq/ > * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/