Antoine Terracol <[email protected]> asks,
> I am trying to do matrix calculations in Mata in a very large data set
> (around 3 million observations and roughly 25 relevant variables). My
> operating system won't allow me to allocate more than 930M of memory (I have
> 1gb) [...]
>
> The first problem is when I do a calculation of the sort "x=cross(x',u)"
> with x (3,188,029 by 22) being a view on a dataset with no extraneous
> variables and u being a small (22 by 22) matrix, I get the message "<istmt>:
> 3900 unable to allocate real <tmp>[22,3188029]" [...]
Mata is running out of memory in the calculation of x'. x is stored as a view
matrix, but x' is not. One of the purposes of cross(a,b), which calculates
a'b, is to avoid the transposition of a.
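
To illustrate the point, here is a minimal sketch using small stand-in matrices (the values are made up; only the shapes matter). Because cross(a,b) returns a'b, cross(x',u) is algebraically the same as x*u, but the argument x' has to be formed first, and that full transposed copy is what fails to allocate.

```mata
mata:
// Small stand-ins: x plays the role of the 3,188,029 x 22 data matrix,
// u the role of the 22 x 22 matrix.
x = (1, 2 \ 3, 4 \ 5, 6)
u = (1, 0 \ 1, 1)

// cross(x', u) = (x')'u = x*u, so both expressions give the same 3 x 2 result.
// The first one, however, must build x' (a cols(x) x rows(x) copy) beforehand.
mreldif(cross(x', u), x*u)
end
```

The last line should display 0, confirming that the two expressions agree.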
In Antoine's case, I recommend he substitute
x = x*u
Even so, he may run into problems. The above statement will allocate
a 3,188,029 x 22 matrix to store the result, which requires 535M.
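
The 535M figure can be checked directly, assuming 8 bytes per element (the size of a Mata real) and 1M = 1024^2 bytes:

```mata
mata:
rows  = 3188029
cols  = 22
bytes = rows*cols*8     // 8 bytes per real element
bytes/1024^2            // size in megabytes, roughly 535
end
```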
> Second, I am again using a view onto a (now smaller) data set with 3
> variables and 3,188,029 observations and need to do some row-by-row
> calculations with another 3,188,029 by 22 matrix that I stored previously
> (the one at the origin of problem 1). When I try to fgetmatrix the stored
> matrix, I get a similar error message, namely "fgetmatrix(): 3900 unable to
> allocate real <tmp>[3188029,22]". [...]
In the first step, Antoine stored a view matrix, but in reading it back,
it is not read as a view -- it cannot be, because something else is in
memory -- and thus Mata reads it as a regular 3,188,029 x 22 matrix.
Evidently, Mata could not lay its hands on the required 535M.
If Antoine cannot obtain more memory, then my only suggestion is to
write code to process the matrices in parts.
In the first case, x = x*u, if we write x = (x1 \ x2), then
x*u = (x1*u \ x2*u). The multiplication could be made in separate runs
and the results combined.
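
One way to sketch the row-block idea in Mata is below. It is an illustration under stated assumptions, not a tested solution for Antoine's data: the function name xu_in_parts() and its arguments are hypothetical, u is assumed to be square with one row per variable, and the products are written back over the original variables through the view instead of being collected in a new matrix. Results are stored in each variable's storage type, so the variables are assumed to be doubles.

```mata
mata:
// Hypothetical helper: replaces the variables named in vars with x*u,
// working on at most chunk observations at a time.
void xu_in_parts(string rowvector vars, real matrix u, real scalar chunk)
{
    real scalar n, i0, i1
    real matrix xi

    n = st_nobs()
    for (i0 = 1; i0 <= n; i0 = i0 + chunk) {
        i1 = min((i0 + chunk - 1, n))
        st_view(xi, (i0, i1), vars)   // view onto observations i0..i1 only
        xi[., .] = xi*u               // temporary is (i1-i0+1) x cols(u)
    }
}
end
```

A call might look like xu_in_parts(("v1", "v2"), u, 100000), where v1 and v2 are placeholder variable names and u is then 2 x 2. The peak extra memory is governed by chunk, not by the number of observations.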
In the second case, we have two large matrices, and we can write
    | x1 |                 | x1*z1'   x1*z2' |
    |    |  (z1'  z2')  =  |                 |
    | x2 |                 | x2*z1'   x2*z2' |
and that could be done in four separate runs.
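
The partitioned product can be checked on small made-up matrices. In this sketch x = (x1 \ x2) and z = (z1 \ z2), so z' = (z1', z2'); the names whole and blocks are only for illustration.

```mata
mata:
x1 = (1, 2 \ 3, 4)
x2 = (5, 6)
z1 = (1, 0 \ 0, 1)
z2 = (2, 1)

x = x1 \ x2                   // 3 x 2
z = z1 \ z2                   // 3 x 2

whole  = x*z'                 // product formed in one piece
blocks = (x1*z1', x1*z2') \ (x2*z1', x2*z2')   // four separate block products

whole == blocks               // 1 if the two agree element by element
end
```

Each of the four block products involves only one piece of x and one piece of z, which is why they can be computed in separate runs.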
-- Bill
[email protected]