Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Data set too large for spatwmat
From: henrik andersson <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: Data set too large for spatwmat
Date: Wed, 1 Dec 2010 13:10:27 +0000
Dear Austin,
Sorry for not describing the user-written software correctly.
I have now tested your spwm2 and checked the results. It works very well and solves the problem.
Thank you very much!
Henrik
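
One way to run the check described above is sketched here. It assumes the -spwm2- program quoted below has been saved as spwm2.ado, that -spatwmat- is installed, and that the first 500 observations form a subset small enough for both commands; the cutoff of 500, the matrix names W_old and W_new, and the tolerance are illustrative assumptions, not values from the thread. The coordinate variables and band are taken from the original command.

```stata
* Sketch only: compare -spatwmat- and -spwm2- on a small subset.
* The subset size (500), matrix names, and tolerance are assumptions.
tempfile full
save `"`full'"'
keep in 1/500

spatwmat, name(W_old) xcoord(y_rt90x) ycoord(x_rt90y) band(0 15000) ///
    standardize eigenval(E_old)
spwm2, name(W_new) xcoord(y_rt90x) ycoord(x_rt90y) band(0 15000) ///
    standardize eigenval(E_new)

* mreldif() returns the largest relative difference between two matrices
display mreldif(W_old, W_new)
assert mreldif(W_old, W_new) < 1e-8

* Return to the full data set
use `"`full'"', clear
```

The eigenvalue matrices can be inspected in the same way, although their ordering may differ between the two programs, so sorting both before comparing is probably safer.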
> henrik andersson <[email protected]>:
> You are requested to cite the source of user-written software:
> -spatwmat- is from STB-60 and avail via -findit-.
> The limits look to be due to the use of -svmat- and -mkmat- so one way
> forward is to Mata-ize the program.
> Try saving the following to a file spwm2.ado (compare to the original
> ado file or just look for lines that start mata: to see the main
> changes),
> and check that you get the same results in the smaller dataset (I
> tested this exactly once):
>
> *! -spwm2- 1.0 - 30 Nov 2010 [email protected] posted to Statalist
> * based on
> * Version 1.0 - 29 January 2001 STB-60 sg162
> * -spatwmat- Generates different kinds of spatial weights matrices
> * Author: Maurizio Pisati
> * Department of Sociology and Social Research
> * University of Milano Bicocca (Italy)
> * [email protected]
> prog spwm2
> version 9.2
> syntax [using/], Name(string) /*
> */ [DROP(numlist min=1 >=0 sort)] /*
> */ [Xcoord(varname numeric)] /*
> */ [Ycoord(varname numeric)] /*
> */ [Band(numlist min=2 max=2 >=0 sort)] /*
> */ [Friction(real 1)] /*
> */ [BINary] /*
> */ [Standardize] /*
> */ [Eigenval(string)]
> confirm name `name'
> tempname D W V L R
> if "`using'"=="" & ("`xcoord'"=="" | "`ycoord'"=="") {
> di as err "You must specify both x- and y-coordinates using options "
> di as err "{bf:{ul:x}coord({it:varname})} and " _c
> di as err "{bf:{ul:y}coord({it:varname})}"
> exit
> }
> if "`using'"=="" & "`band'"=="" {
> di as err "You must specify distance band using option " _c
> di as err "{bf:{ul:b}and({it:numlist})}"
> exit
> }
> local OUTPUT "The following matrix has been created:"
> if "`using'"!="" {
> preserve
> qui use `"`using'"', clear
> /* Drop rows and columns if requested */
> if "`drop'"!="" {
> local NDROP : word count `drop'
> unab VLIST : _all
> qui generate RDROP=0
> local i=1
> while `i'<=`NDROP' {
> local D : word `i' of `drop'
> local VAR : word `D' of `VLIST'
> local CDLIST "`CDLIST'`VAR' "
> qui replace RDROP=1 in `D'
> local i=`i'+1
> }
> qui drop `CDLIST'
> qui drop if RDROP
> qui drop RDROP
> }
> /* Check if weights are binary */
> unab VLIST : _all
> local NVAR : word count `VLIST'
> local SUM=0
> local i=1
> while `i'<=`NVAR' {
> local VAR : word `i' of `VLIST'
> qui capture assert `VAR'==0 | `VAR'==1
> if _rc!=0 {
> local SUM=`SUM'+1
> }
> local i=`i'+1
> }
> if `SUM'==0 {
> local binary "binary"
> }
> else {
> local binary ""
> }
> /* Check if each location has at least one neighbor */
> qui egen ROWSUM=rsum(_all)
> qui count if ROWSUM==0
> local NN=r(N)
> qui drop ROWSUM
> /* Create intermediate matrix `W' */
> qui mkmat _all, matrix(`W')
> restore
> /* Check if matrix is square*/
> local NROW=rowsof(`W')
> local NCOL=colsof(`W')
> if `NROW'!=`NCOL' {
> di as err "Matrix is not square"
> exit
> }
> local N=`NROW'
> /* Create labels */
> if "`binary'"!="" {
> local WT "Imported binary weights matrix"
> }
> else {
> local WT "Imported non-binary weights matrix"
> }
> /* Create final matrix */
> matrix `name'=`W'
> }
> *********
> * 5. Create distance-based weights matrix
> *********
> if `"`using'"'=="" {
> /* Define distance band */
> local LOWER : word 1 of `band'
> local UPPER : word 2 of `band'
> /* Check appropriateness of coordinate variables */
> capture qui assert `xcoord'!=.
> if _rc!=0 {
> di as err "Variable `xcoord' has missing values"
> exit
> }
> capture qui assert `ycoord'!=.
> if _rc!=0 {
> di as err "Variable `ycoord' has missing values"
> exit
> }
> local N=_N
> /* Create intermediate matrix */
> matrix `W'=J(`N',`N',0)
> matrix `D'=J(`N',`N',0)
> local MAXOBS=(`N'/2)*(`N'-1)
> local d=1
> local i=1
> while `i'<=`N' {
> local j=`i'+1
> while `j'<=`N' {
> local A=(`xcoord'[`i']-`xcoord'[`j'])^2
> local B=(`ycoord'[`i']-`ycoord'[`j'])^2
> local DIST=sqrt(`A'+`B')
> matrix `D'[`i',`j']=`DIST'
> matrix `D'[`j',`i']=`DIST'
> if `DIST'>`LOWER' & `DIST'<=`UPPER' {
> if "`binary'"!="" {
> matrix `W'[`i',`j']=1
> matrix `W'[`j',`i']=1
> }
> else {
> matrix `W'[`i',`j']=1/(`DIST'^`friction')
> matrix `W'[`j',`i']=1/(`DIST'^`friction')
> }
> }
> local d=`d'+1
> local j=`j'+1
> }
> local i=`i'+1
> }
> mata:`W'=st_matrix("`W'")
> mata:`D'=st_matrix("`D'")
> /* Generate distance statistics */
> mata:st_local("MAXMIN",strofreal(colmin(rowmax(`D'))))
> mata:st_local("MINMAX",strofreal(colmax(rowmin(`D'))))
> /* Check if each location has at least one neighbor */
> mata:st_local("NN",strofreal(colsum(rowsum(`W'):==0)))
>
> /* Create labels */
> if "`binary'"!="" {
> local WT "Distance-based binary weights matrix"
> }
> else {
> local WT "Inverse distance weights matrix"
> }
>
> /* Create final matrix */
> matrix `name'=`W'
> }
> *********
> * 6. Row-standardize weights matrix
> *********
> if "`standardize'"!="" {
> mata:st_matrix("`W'",`W':/(rowsum(`W')+(rowsum(`W'):==0)))
> /* Copy the standardized weights into the final matrix */
> matrix `name'=`W'
> }
> *********
> * 7. Create weights matrix eigenvalues
> *********
> if "`eigenval'"!="" & `NN'>0 {
> di as err "Eigenvalues matrix cannot be computed because of the presence"
> di as err "of one or more locations with no neighbors"
> }
> if "`eigenval'"!="" & `NN'==0 {
> mata:`L'=.
> mata:`V'=.
> mata:`R'=diag(rowsum(`W'):^(-1/2))
> mata:eigensystem(`R'*`W'*`R',`V',`L')
> mata:st_matrix("`eigenval'",sort(Re(`L''),-1))
> local OUTPUT "The following matrices have been created:"
> matrix `name'=`W'
> }
> *********
> * 8. Add relevant info to weights matrix
> *********
> if "`using'"!="" & "`binary'"!="" local ROW="SWMImpo Yes "
> if "`using'"!="" & "`binary'"=="" local ROW="SWMImpo No "
> if "`using'"=="" & "`binary'"!="" local ROW="SWMDist Yes "
> if "`using'"=="" & "`binary'"=="" local ROW="SWMDist No "
> if "`standardize'"!="" {
> local ROW="`ROW'Yes"
> }
> else {
> local ROW="`ROW'No"
> }
> matrix rownames `name'=`ROW'
> if "`using'"=="" {
> local INT=int(`LOWER')
> local DEC=`LOWER'-`INT'
> local DEC=string(`DEC')
> local COL "`INT' `DEC'"
> local INT=int(`UPPER')
> local DEC=`UPPER'-`INT'
> local DEC=string(`DEC')
> local COL "`COL' `INT' `DEC'"
> matrix colnames `name'=`COL'
> }
> *********
> * 9. Display report
> *********
> if "`standardize'"!="" {
> local S "(row-standardized)"
> }
> di _newline
> di as txt "`OUTPUT'"
> di ""
> di as txt "1. `WT' " as res "`name'" as txt " `S'"
> di as txt " Dimension: " as res "`N'x`N'"
> if "`using'"=="" {
> di as txt " Distance band: " as res "`LOWER' < d <= `UPPER'"
> di as txt " Friction parameter: " as res "`friction'"
> di as txt " Largest minimum distance: " %-9.2f as res `MAXMIN'
> di as txt " Smallest maximum distance: " %-9.2f as res `MINMAX'
> }
> if `NN'==1 {
> di ""
> di as err " Beware! `NN' location has no neighbors"
> }
> else if `NN'>1 {
> di ""
> di as err " Beware! `NN' locations have no neighbors"
> }
> if `NN'>0 & "`using'"=="" {
> di as err " You are advised to extend the distance band"
> }
> if "`eigenval'"!="" & `NN'==0 {
> di ""
> di as txt "2. Eigenvalues matrix " as res "`eigenval'"
> di as txt " Dimension: " as res "`N'x1"
> }
> di _newline
> *********
> * 10. End program
> *********
> capture matrix drop `W'
> capture matrix drop `W'S
> end
>
>
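
The program above still fills the Stata matrices element by element in nested -while- loops. A further step in the same direction would be to build the distance and weights matrices entirely in Mata. The following is a minimal sketch under stated assumptions, not part of -spatwmat- or -spwm2-: the function name spw_sketch() and its argument order are hypothetical, and results may differ from the loop version in the last decimal places because the loop passes values through local macros. For N = 3594, each N x N Mata matrix takes roughly 100 MB, and the sketch holds several at once.

```stata
mata:
// Hypothetical helper (not part of -spatwmat- or -spwm2-):
// returns the N x N weights matrix for the band lower < d <= upper.
real matrix spw_sketch(real colvector x, real colvector y,
                       real scalar lower, real scalar upper,
                       real scalar friction, real scalar binary)
{
    real matrix X, Y, D, inband
    real scalar n

    n = rows(x)
    X = x * J(1, n, 1)              // X[i,j] = x[i]
    Y = y * J(1, n, 1)              // Y[i,j] = y[i]
    // pairwise Euclidean distances
    D = sqrt((X - X'):^2 + (Y - Y'):^2)
    // 1 where lower < d <= upper, 0 elsewhere (the diagonal is always 0)
    inband = (D :> lower) :& (D :<= upper)
    if (binary) return(inband)
    // inverse-distance weights; adding (1 :- inband) keeps the divisor
    // nonzero wherever the weight is 0 anyway
    return(inband :/ ((D :+ (1 :- inband)) :^ friction))
}
end
```

A possible call, using the variable names and band from the original command and the same row-standardization expression as in the program above:

```stata
mata:
x  = st_data(., "y_rt90x")
y  = st_data(., "x_rt90y")
W  = spw_sketch(x, y, 0, 15000, 1, 0)
Ws = W :/ (rowsum(W) + (rowsum(W) :== 0))   // row-standardize
st_matrix("W50_1", Ws)                      // assumes matsize is large enough
end
```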
> On Tue, Nov 30, 2010 at 11:40 AM, henrik andersson
> <[email protected]> wrote:
> > Hi,
> >
> > I'm trying to create a spatial weight matrix using spatwmat for a data set containing 3594 observations, but the matrix cannot be created. I've tested creating the weight matrix based on different subsets of my data set, and the matrix can be created when I have 2364 observations, but not when I have 2798. I've therefore concluded that my data set is too large for the memory that I can allocate to Stata.
> >
> > I use Stata/MP 10.1 and spatwmat version 1.0.
> >
> > I'm able to allocate 30 GB to the memory and I have also maximized the number of variables and the matrix size.
> >
> > Current memory allocation
> >
> >                     current                                 memory usage
> >     settable          value     description                 (1M = 1024k)
> >     --------------------------------------------------------------------
> >     set maxvar        32767     max. variables allowed           12.751M
> >     set memory       30720M     max. data space              30,720.000M
> >     set matsize       11000     max. RHS vars in models         924.080M
> >                                                              -----------
> >                                                              31,656.831M
> >
> > When trying to create the matrix I get the following result:
> >
> > . spatwmat, name(W50_1) xcoord(y_rt90x) ycoord(x_rt90y) band(0 15000) standardize eigenval(eigen50_1)
> > no room to add more variables
> > r(902);
> >
> > I have three questions:
> >
> > 1. Is there any possibility of using spatwmat more efficiently than I do, i.e. of using the memory more efficiently so that I can run it on the whole sample?
> > 2. One possibility when using spatwmat is to import the weight matrix from a file. If I cannot create my matrix using Stata, does anyone have any suggestions about what program to use to create this matrix, preferably a program that creates the weight matrix in an identical way to spatwmat? (I of course prefer to run all my estimations in Stata.)
> >
> > Thanks in advance
> >
> > Henrik
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/