Bill,
Thanks a lot for the detailed reply. I'll have to read more about the
sweep operator.
Jean Salvati
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> William Gould, Stata
> Sent: Tuesday, March 08, 2005 10:21 AM
> To: [email protected]
> Subject: Re: RE: st: _rmcoll query
>
> Jean Salvati <[email protected]> wants to know more about how
> syminv() works:
>
> > Specifically, I don't understand how looking for zeroes on the
> > diagonal of
> > syminv(X'X) allows you to determine which variables cause
> singularity
> > problems.
>
> The inverse of matrix A is defined as the matrix X solving
> the equation
>
> A*X = I (1)
>
> where I is the identity matrix. Some matrices are singular,
> meaning there is no solution to (1). In that case, it is
> often sufficient to obtain a generalized inverse, defined as
> matrix X meeting the restrictions
>
> A*X*A = A
> (2)
> X*A*X = X
>
> When A is singular, (2) is not sufficient to identify
> uniquely an X -- there are lots of different X's that
> suffice. For most problems, one X will work as well as another.
>
> syminv() produces generalized inverses for symmetric,
> positive definite matrices, and chooses among the X's the X
> that minimizes roundoff error given its approach, its
> approach being to ignore rows and columns that look collinear
> to it. Ignoring rows (columns) amounts to dropping variables. When
> syminv() decides to ignore a row (column), it substitutes 0
> for the row and column in the result. Thus, the easy way to
> determine whether syminv() ignored a row (column) is to check
> the diagonal element for zero.
>
> That makes it easy to calculate the rank. If
>
> +- -+
> | 1 0 0 |
> A * syminv(A) = | 0 1 0 |
> | 0 0 1 |
> +- -+
>
> then A is full rank. If
>
> +- -+
> | 1 0 0 |
> A * syminv(A) = | -1 0 1 |
> | 0 0 1 |
> +- -+
>
> then A has rank 2. More correctly, we should say that
> syminv() acted as if A had rank 2. If we tightened down one
> some of syminv()'s secret tolerances, we could probably
> convince syminv() to treat A as if it had full rank.
>
> Substituting zeros in the result also means that if you
> calculate A*syminv(A), you can read the linear dependencies
> from the rows of the result. For instance, if you got
>
> +- -+
> | 1 0 0 |
> A * syminv(A) = | -1 0 1 |
> | 0 0 1 |
> +- -+
>
> then that means that
>
> x2 = -x1 + x3
>
>
> > Do you use LU or QR decomposition?
>
> No. LU is a popular and accurate way to to obtain inverses
> of square matrices, but it does not exploit symmetry (i.e.,
> LU can calculate the inverse of nonsymmetric matrices). The
> more numerically intensive QR is even more general, in that
> it can obtain inverses of nonsquare matrices (see equation
> (2) above).
>
> syminv() is implemented in terms of a sweep operator. Some
> researchers think the sweep operator is not adequately
> accurate, but the sweep can be made accurate if one puts the
> same effort in its development and implementation as, say, LU
> decomposition.
>
> What is sometimes taught as LU decomposition is not the LU
> decomposition actually used. LU decomposition is often
> taught as finding L and U such that A = LU, where L is lower
> triangular and U and upper triangular.
> Numerically speaking, however, finding L and U such that A =
> LU cannot be done very accurately, and so what is actually
> used is A = PLU, where P is a permutation matrix. The
> reordering performed by P makes all the difference,
> accuracywise, but it complicates how one uses the results
> from the decomposition.
>
> Similarly, the sweep operator, performed in its simplest
> form, is inaccurate.
> There are two issues, scaling and ordering, the latter being
> similar to the reordering performed by P in LU decomposition.
> To handle scaling, syminv() scales the original matrix by
> the median of the diagonal. As for reordering, that cannot
> be handled by pivot methods (permuation matrices) without
> losing the symmetric structure of the problem. The
> counterpart is to determine the order in which the rows are
> to be swept rather than sweeping them in order in which they
> appear, and syminv() makes calculations concerning the
> roundoff error that are very similar to those made in constructing P.
>
> -- Bill
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/