Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: RE: st: _rmcoll query


From   "Salvati, Jean" <[email protected]>
To   <[email protected]>
Subject   RE: RE: st: _rmcoll query
Date   Tue, 8 Mar 2005 11:07:33 -0500

Bill,

Thanks a lot for the detailed reply. I'll have to read more about the
sweep operator.

Jean Salvati
 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> William Gould, Stata
> Sent: Tuesday, March 08, 2005 10:21 AM
> To: [email protected]
> Subject: Re: RE: st: _rmcoll query
> 
> Jean Salvati <[email protected]> wants to know more about how 
> syminv() works:
> 
> > Specifically, I don't understand how looking for zeroes on the 
> > diagonal of
> > syminv(X'X) allows you to determine which variables cause 
> singularity 
> > problems.
> 
> The inverse of matrix A is defined as the matrix X solving 
> the equation
> 
>         A*X = I                                                (1)
> 
> where I is the identity matrix.  Some matrices are singular, 
> meaning there is no solution to (1).  In that case, it is 
> often sufficient to obtain a generalized inverse, defined as 
> matrix X meeting the restrictions
> 
>         A*X*A = A
>                                                                (2)
>         X*A*X = X
> 
> When A is singular, (2) is not sufficient to identify 
> uniquely an X -- there are lots of different X's that 
> suffice.  For most problems, one X will work as well as another.
> 
> syminv() produces generalized inverses for symmetric, 
> positive definite matrices, and chooses among the X's the X 
> that minimizes roundoff error given its approach, its 
> approach being to ignore rows and columns that look collinear 
> to it.  Ignoring rows (columns) amounts to dropping variables.  When
> syminv() decides to ignore a row (column), it substitutes 0 
> for the row and column in the result.  Thus, the easy way to 
> determine whether syminv() ignored a row (column) is to check 
> the diagonal element for zero.
> 
> That makes it easy to calculate the rank.  If 
> 
>                           +-            -+
>                           |   1   0   0  |
>         A * syminv(A)  =  |   0   1   0  |
>                           |   0   0   1  |
>                           +-            -+
> 
> then A is full rank.  If 
> 
>                           +-            -+
>                           |   1   0   0  |
>         A * syminv(A)  =  |  -1   0   1  |
>                           |   0   0   1  |
>                           +-            -+
> 
> then A has rank 2.  More correctly, we should say that 
> syminv() acted as if A had rank 2.  If we tightened down one 
> some of syminv()'s secret tolerances, we could probably 
> convince syminv() to treat A as if it had full rank.
> 
> Substituting zeros in the result also means that if you 
> calculate A*syminv(A), you can read the linear dependencies 
> from the rows of the result.  For instance, if you got
> 
>                           +-            -+
>                           |   1   0   0  |
>         A * syminv(A)  =  |  -1   0   1  |
>                           |   0   0   1  |
>                           +-            -+
> 
> then that means that 
> 
>                    x2  =  -x1 + x3
> 
> 
> > Do you use LU or QR decomposition?
> 
> No.  LU is a popular and accurate way to to obtain inverses 
> of square matrices, but it does not exploit symmetry (i.e., 
> LU can calculate the inverse of nonsymmetric matrices).  The 
> more numerically intensive QR is even more general, in that 
> it can obtain inverses of nonsquare matrices (see equation
> (2) above).
> 
> syminv() is implemented in terms of a sweep operator.  Some 
> researchers think the sweep operator is not adequately 
> accurate, but the sweep can be made accurate if one puts the 
> same effort in its development and implementation as, say, LU 
> decomposition.
> 
> What is sometimes taught as LU decomposition is not the LU 
> decomposition actually used.  LU decomposition is often 
> taught as finding L and U such that A = LU, where L is lower 
> triangular and U and upper triangular.
> Numerically speaking, however, finding L and U such that A = 
> LU cannot be done very accurately, and so what is actually 
> used is A = PLU, where P is a permutation matrix.  The 
> reordering performed by P makes all the difference, 
> accuracywise, but it complicates how one uses the results 
> from the decomposition.
> 
> Similarly, the sweep operator, performed in its simplest 
> form, is inaccurate.
> There are two issues, scaling and ordering, the latter being 
> similar to the reordering performed by P in LU decomposition. 
>  To handle scaling, syminv() scales the original matrix by 
> the median of the diagonal.  As for reordering, that cannot 
> be handled by pivot methods (permuation matrices) without 
> losing the symmetric structure of the problem.  The 
> counterpart is to determine the order in which the rows are 
> to be swept rather than sweeping them in order in which they 
> appear, and syminv() makes calculations concerning the 
> roundoff error that are very similar to those made in constructing P.
> 
> -- Bill
> [email protected]
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2025 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index