Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Mata - extracting various vectors of different sizes in one loop
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Mata - extracting various vectors of different sizes in one loop
Date
Thu, 4 Apr 2013 16:20:53 +0100
One possibility is that you just maintain two longish vectors in Mata,
one keeping track of some identifier and the other holding the distance
concerned. Naturally, you still need to set up those vectors. Then use
-reshape- in Stata to get the data structure you want (whether it's
the data structure you really need is not obvious).
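A minimal sketch of that long layout, not a tested solution to the original problem. The function name nearpairs() and its arguments are made up for illustration; only Geoeast, Geonorth and the 10000 threshold come from the quoted code. It stacks one row per (entity, neighbour) pair, so no padding to a fixed width is needed.

```mata
mata:
// Hypothetical helper: returns one row per pair within radius r,
// with columns (index of entity, index of neighbour, distance).
real matrix nearpairs(real colvector east, real colvector north, real scalar r)
{
    real scalar i, n
    real colvector d
    real matrix j, pairs

    n = rows(east)
    pairs = J(0, 3, .)
    for (i=1; i<=n; i++) {
        d = sqrt((east :- east[i]):^2 + (north :- north[i]):^2)
        d[i] = .                              // exclude the entity itself
        j = select((1::n), d :< r)            // row numbers of neighbours
        if (length(j)) pairs = pairs \ (J(rows(j), 1, i), j, d[j, 1])
    }
    return(pairs)
}

// Usage with the variable names from the quoted code:
// pairs = nearpairs(st_data(., "Geoeast"), st_data(., "Geonorth"), 10000)
end
```

The result could then be written to a fresh dataset (st_addobs(), st_addvar() and st_store() are the obvious candidates), ranked by distance within entity, and passed to -reshape wide- if a wide layout really is wanted. Growing pairs inside the loop is simple but not fast; for large n, preallocating would be worth considering.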
A more specific suggestion that's part of the folklore is to observe
that taking a square root is not needed for _all_ Pythagorean distance
calculations. You can select on the _squared_ distance being less than
your distance threshold squared and then (later) take square roots of
only the distances you care about. That may be a twentieth-century trick that
would only speed up calculations trivially, but often people do this
with fairly large datasets, and you need to compare every place with
every other, so it's worth thinking about.
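As a rough illustration of the squared-distance idea, assuming the same column vectors of eastings and northings as in the quoted code. The function name within_r() is invented here. The comparison is done on squared distances, and sqrt() is applied only to the entries that pass.

```mata
mata:
// Hypothetical helper: distances from entity i to all others within r.
real vector within_r(real colvector east, real colvector north,
                     real scalar i, real scalar r)
{
    real colvector d2

    d2 = (east :- east[i]):^2 + (north :- north[i]):^2   // no sqrt() yet
    d2[i] = .                        // missing is never < r^2, so i drops out
    return(sqrt(select(d2, d2 :< r^2)))                  // sqrt() of survivors only
}
end
```

Whether this saves noticeable time depends on how many pairs fall outside the radius; timing it on your own data is the only way to know.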
Nick
[email protected]
On 4 April 2013 11:26, nick bungy <[email protected]> wrote:
> I have Mata code that cycles through grid references (eastings, northings) of x entities and calculates for each entity all the other entities which are within a 10km radius of it.
> So each individual entity has a row vector, with dimensions anywhere between 1 row (1 firm within 10km radius) and ~80 rows (80 firms within 10km radius). This is throwing up conformity errors when I try to store these vectors into a selection of ~80 variables in Stata.
> My thought was to artificially inflate all row vectors to, say, 100 and fill all of the extra cells in each row vector with 0; then I can extract to 100 variables without conformity errors. I can then clean this up quite easily using Stata functions. I'm not quite sure how to go about this though.
> My mata code is the following:
>
> mata:
> geoeasta = st_data(., "Geoeast")
> geonortha = st_data(., "Geonorth")
> n = rows(geoeasta)
>
> density = .
> densitytwo = .
> densitythree = .
> dups = .
>
>
> for(i=1; i<=n; ++i) {
>
> d = sqrt((geoeasta:-geoeasta[i]):^2 + (geonortha:-geonortha[i]):^2)
> d[i] = .
> density = select(d, d[.,1]:<10000)
> minindex(density, 80, densitytwo, dups)
> st_store(i, ("MSOA1", "MSOA2", "MSOA3" etc etc.), densitytwo)
> //This stores the nearest neighbours into our variables, which we defined at the top.
>
> }
>
>
> end
> I suspect I need a line or two below minindex, which inflates densitytwo to a 100 row vector and fills all the extra rows generated with 0. Or perhaps there is a more elegant way?
>