Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Mata - generalised cross products of the form X'SX
From
David Kantor <[email protected]>
To
[email protected]
Subject
Re: st: Mata - generalised cross products of the form X'SX
Date
Tue, 08 Feb 2011 09:29:54 -0500
At 07:12 AM 2/8/2011, Gordon Hughes wrote:
I am starting a new thread, but this is really a follow-up to my
post on performance monitoring. Following Bill Gould's helpful
suggestion I have identified that most of the time is consumed in
loops in my likelihood evaluation that calculate matrices of the
form Z=X'SX where S is symmetric but not a diagonal matrix. Of
course, I am doing things outside loops where I can.
This is a standard form in any GLS type calculation. They are not
directly amenable to use with the Mata function --cross()-- but they
could be got into that form by forming by factorising S=QQ', then
forming P=Q'X, and finally using P=cross(P,P).
Two questions follow:
A. Is there any Mata function that does this more directly? Maybe
Stata Corp might consider extending --cross()-- to handle such
cases. The command --matrix glsaccum-- does this in Stata.
B. Roughly, under what conditions might this sequence of steps
reduce execution time and/or memory use on the assumption that the
dimension of S is large relative to cols(X)? In a related context I
have to calculate X'B'BX where B is square but not symmetric and
found that use of --cross()-- does not save much (if any) time,
though it is probably more efficient on memory. Hence, the overhead
of factorisation may not pay off. On the other hand, the initial
factorisation would only need to be done once per function call,
whereas the steps involving X have to be performed thousands of
times per function call.
Gordon Hughes
[email protected]
I don't know enough to answer your question, but the situation looks
similar to something I encountered some time ago, which may be
valuable to consider. This is about plain Stata code, not Mata.
This is in my mahascore.ado, as part of the mahapick module on SSC.
For each observation, I wanted a value computed by first forming a
vector `X', based on values of specified variables in the
observation, and then computing...
`X'' * `invcov' * `X' `X'
The "obvious" thing to do was to do that exact computation in a loop,
looping over the observations: (Code samples omit the matter of
taking the sqrt of the result. Also, these are excerpts that refer to
things defined in sections of omitted code.)
local vref "\`v'[\`refobs']"
quietly gen double `gen' = .
tempname X Y
forvalues obsno = 1/`=_N' {
matrix `X' = J(`nvars', 1, .)
matrix rownames `X' = `varlist'
foreach v of local varlist {
local rownum = rownumb(matrix(`X'), "`v'")
matrix `X'[`rownum', 1] = `v'[`obsno'] - `vref'
}
matrix `Y' = `X'' * `invcov' * `X' // `Y' is 1x1
quietly replace `gen' = (scalar(`Y'[1,1])) in `obsno'
}
----
Later, I learned of a much quicker way to get the same result. This
is based on the corresponding calculation found in _Dif_mbase in
psmatch2 by Leuven & Sianesi. Instead of looping over the
observations, the looping is over the rows and columns of `invcov':
local rows : rownames `invcov'
local cols : colnames `invcov'
local icrows = rowsof(`invcov')
local iccols = colsof(`invcov')
local vjref "\`vj'[\`refobs']"
local vkref "\`vk'[\`refobs']"
quietly gen double `gen' = 0
forvalues j = 1 / `icrows' {
forvalues k = 1 / `icrows' { /* or iccols -- the same */
local vj = word("`rows'", `j')
local vk = word("`rows'", `k') /* or cols -- should be the same */
quietly replace `gen' = `gen' + ///
`invcov'[`j,', `k'] * (`vj' - `vjref') * (`vk' - `vkref')
}
}
----
The latter is mathematically equivalent to the former, but much
faster. It gave results that are almost identical to the former; the
differences were minuscule. I can't say which was closer to the true
result, but I suspect it was the latter -- a question that is similar
to a matter that Bill Gould recently wrote about; he may want to
chime in here as well.
I hope this is useful or interesting.
--David
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/