Moore-Penrose pseudoinverse

Moore-Penrose pseudoinverse

In mathematics, and in particular linear algebra, the pseudoinverse A^+ of an m imes n matrix A is a generalization of the inverse matrix.cite book | last=Ben-Israel | first = Adi | coauthors=Thomas N.E. Greville | title=Generalized Inverses | isbn=0-387-00293-6 | publisher=Springer-Verlag | year=2003] More precisely, this article talks about the Moore-Penrose pseudoinverse, which was independently described by E. H. Moorecite journal | last=Moore | first=E. H. | authorlink=E. H. Moore | title=On the reciprocal of the general algebraic matrix | journal=Bulletin of the American Mathematical Society | volume=26 | pages=394–395 | year=1920] in 1920 and Roger Penrosecite journal | last=Penrose | first=Roger | authorlink=Roger Penrose | title=A generalized inverse for matrices | journal=Proceedings of the Cambridge Philosophical Society | volume=51 | pages=406–413 | year=1955] in 1955. Earlier, Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. The term generalized inverse is sometimes used as a synonym for pseudoinverse.

A common use of the pseudoinverse is to compute a 'best fit' (least squares) solution to a system of linear equations (see below under Applications). The pseudoinverse is defined and unique for all matrices whose entries are real or complex numbers. The pseudoinverse can be computed using the singular value decomposition.

Definition

The pseudoinverse A^+ of an "m"-by-"n" matrix A (whose entries can be real or complex numbers) is defined as the unique "n"-by-"m" matrix satisfying all of the following four criteria:cite book | last=Golub | first=Gene H. | authorlink=Gene H. Golub | coauthors=Charles F. Van Loan | title=Matrix computations | edition=3rd edition | publisher=Johns Hopkins | location=Baltimore | year=1996 | isbn=0-8018-5414-8 | pages = 257–258]
# A A^+A = A (A A^+ need not be the general identity matrix, but it maps all column vectors of A to themselves);
# A^+A A^+ = A^+ (A^+ is a weak inverse for the multiplicative semigroup);
# (AA^+)^* = AA^+ (AA^+ is Hermitian); and
# (A^+A)^* = A^+A (A^+A is also Hermitian).

Here M^* is the conjugate transpose of a matrix "M". For matrices whose elements are real numbers instead of complex numbers, M^* = M^T.

Properties

Proofs for some of these relations may be found on the proofs page.

* The pseudoinverse exists and is unique: for any matrix A, there is precisely one matrix A^+ that satisfies the four properties of the definition.
* If the matrix A is invertible, the pseudoinverse and the inverse coincide: A^+=A^{-1}.Citation | last1=Stoer | first1=Josef | last2=Bulirsch | first2=Roland | title=Introduction to Numerical Analysis | publisher=Springer-Verlag | location=Berlin, New York | edition=3rd | isbn=978-0-387-95452-3 | year=2002.] rp|243
* Pseudoinversion is reversible. It is its own inverse: (A^+)^+=A.rp|245
* AA^+ is the orthogonal projector on the range of A, and A^+A is the orthogonal projector on the range of A^*.
* The pseudoinverse of a zero matrix is its transpose.
* The pseudoinverse can be computed via a limiting process::A^+ = lim_{delta o 0} (A^* A + delta I)^{-1} A^* = lim_{delta o 0} A^* (A A^* + delta I)^{-1}:(see Tikhonov regularization). These limits exist even if (A A^*)^{-1} and (A^*A)^{-1} do not exist.rp|263
* Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:rp|245 ::egin{align}(A^T)^+ &= (A^+)^T, \overline{A}^+ &= overline{A^+}, \(A^*)^+ &= (A^+)^*.end{align}
* The pseudoinverse of a scalar multiple of A is the reciprocal multiple of A^+:::(alpha A)^+ = alpha^{-1} A^+ for alpha eq 0.
* If the pseudoinverse of A^*A is already known, it may be used to compute A^+:::A^+ = (A^*A)^+A^*.
* Likewise, if (AA^*)^+ is already known:::A^+ = A^*(AA^*)^+ .

If "A" and "B" are such that the product AB is defined and
* "A" has orthonormal columns (A^* A = I, unitarity is a special case), or
* "B" has orthonormal rows (B B^* = I), or
* "A" is of full column rank, and "B" is of full row rank,then:(AB)^+ = B^+ A^+.

The third case does not cover the first two;an orthonormal (including non-square) matrix must be of full rank, but otherwise there is no assumption made on the matrix it multiplies.

Identity transformations

These relations have in common that the right hand side equals the left hand side multiplied with some expression. They can be used to cancel certain subexpressions or expand expressions. ::egin{array}{lcllll}A^+ &=& A^+ & A^{+*} & A^* &. \A^+ &=& A^* & A^{+*} & A^+ &. \A &=& A^{+*}& A^* & A &. \A &=& A & A^* & A^{+*}&. \A^* &=& A^* & A & A^+ &. \A^* &=& A^+ & A & A^* &. \end{array}

Special cases

Orthonormal columns or rows

If "A" has orthonormal columns (A^* A = I)or orthonormal rows (A A^* = I),then A^+ = A^*.

Full rank

If the columns of A are linearly independent, then A^* A is invertible. In this case, an explicit formula is::A^+ = (A^* A)^{-1} A^*.It follows that A^+ is a left inverse of "A": A^+ A = I.

If the rows of A are linearly independent, then A A^* is invertible. In this case, an explicit formula is::A^+ = A^*(A A^*)^{-1}.It follows that A^+ is a right inverse of "A": A A^+ = I.

If both columns and rows are linearly independent (that is, for square regular matrices), the pseudoinverse is just the inverse::A^+ = A^{-1}.

Scalars and vectors

It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar x is zero if x is zero and the reciprocal of x otherwise::x^+ = left{egin{matrix} 0, & mbox{if }x=0; \ x^{-1}, & mbox{otherwise}. end{matrix} ight.

The pseudoinverse of the null vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude::x^+ = left{egin{matrix} 0^T, & mbox{if }x = 0; \ {x^* over x^* x}, & mbox{otherwise}. end{matrix} ight.

For a proof, simply check that these definitions meet the defining criteria for the pseudoinverse.

Circulant matrices

For a Circulant matrix C the singular value decomposition is given by the Fourier transform,that is the singular values are the Fourier coefficients.Let mathcal{F} be the Fourier matrix, then: C = mathcal{F}cdotSigmacdotmathcal{F}^* : C^+ = mathcal{F}cdotSigma^+cdotmathcal{F}^* cite journal | last=Stallings | first=W. T. | authorlink=W. T. Stallings | title=The Pseudoinverse of an r-Circulant Matrix | journal=Proceedings of the American Mathematical Society | volume=34 | pages=385–388 | year=1972 | doi=10.2307/2038377]

Block matrices

Optimized approaches exist for calculating the pseudoinverse of block structured matrices.

Finding the pseudoinverse of a matrix

Using regular inverses

Let "k" be the rank of a m imes n matrix "A". Then "A" can be decomposed as A = BC, where "B" is a m imes k-matrix and "C" is a k imes n matrix. Then :A^+ = C^*(CC^*)^{-1}(B^*B)^{-1}B^*.

If "A" has full row rank, so that "k" = "m", then "B" can be chosen to be the identity matrix and the formula reduces to A^+ = A^*(AA^*)^{-1}. Similarly, if "A" has full column rank (that is, "k" = "n"), we have A^+ = (A^*A)^{-1}A^*.

The QR method

However, computing the product AA^* or A^*A or its inverse explicitly is unnecessary and only causes additional rounding errors and computational cost. Consider first the case when "A" is full column rank. Then all that is needed is the Cholesky decomposition :A^*A = R^*R,where "R" is an upper triangular matrix. Multiplication by the inverse is then done easily by solving a system with multiple right-hand-side,:M = (A^*A)^{-1}N qquad Leftrightarrow qquad (A^*A)M = N qquad Leftrightarrow qquad R^*RM = N by back substitution. To compute the Cholesky decomposition without forming A^*A explicitly, use the QR decomposition of "A",A = QRwhere "Q" is a unitary matrix, Q^*Q = QQ^* = I , and "R" is upper triangular. Then : A^*A ,=, (QR)^*(QR) ,=, R^*Q^*QR ,=, R^*IR ,=, R^*R,so "R" is the Cholesky factor of A^*A. The case of full row rank is obtained by swapping A and A^*.

The general case and the SVD method

A computationally simpler and more accurate way to get the pseudoinverse is by using the singular value decomposition. [http://www.uwlax.edu/faculty/will/svd/systems/index.html Linear Systems & Pseudo-Inverse] ] If A = USigma V^* is the singular value decomposition of "A", then A^+ = VSigma^+ U^*. For a diagonal matrix such as Sigma, we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, and leaving the zeros in place. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the Matlab function pinv, the tolerance is taken to be "t" = ε•max("m","n")•max(Σ), where ε is the machine epsilon.

The cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix-matrix multiplication, if a state-of-the art implementation (such as that of LAPACK) is used.

Updating the pseudoinverse

For the cases where "A" has full row or column rank, and the inverse of the correlation matrix (AA^* for "A" with full row rank or A^*A for full column rank) is already known, the pseudoinverse for matrices related to A can be computed by applying the Sherman–Morrison–Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithmscite paper |author= Tino Gramß |title= Worterkennung mit einem künstlichen neuronalen Netzwerk |version= |publisher= Georg-August-Universität zu Göttingen |date = 1992 | url = | format = printed dissertation| accessdate = ] exist that exploit the relationship.

Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated. [Meyer, Carl D., Jr. Generalized inverses and ranks of block matrices. SIAM J. Appl. Math. 25 (1973), 597—602] [Meyer, Carl D., Jr. Generalized inversion of modified matrices. SIAM J. Appl. Math. 24 (1973), 315—323]

oftware libraries

High quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK. Writing one's own implementation of SVD is a major programming project that requires a significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.

Applications

The pseudoinverse provides a least squares solution to a system of linear equations.cite journal | last=Penrose | first=Roger | title=On best approximate solution of linear matrix equations | journal=Proceedings of the Cambridge Philosophical Society | volume=52 | pages=17–19 | year=1956]

Given a system of linear equations

:A x = b,

in general we cannot always expect to find a vector x which will solve the system; even if there exists such a solution vector, then it may not be unique. We can however always ask for a vector x that brings Ax "as close as possible" to b, i.e. a vector that minimizes the Euclidean norm

:|A x - b|^2.

If there are several such vectors x, we could ask for the one among them with the smallest Euclidean norm. Thus formulated, the problem has a unique solution, given by the pseudoinverse::x=A^+b.

Note that AA^+ is the orthogonal projection onto the image (range) of "A" (the space spanned by the column vectors of "A"), while (1 - A^+ A) is the orthogonal projection onto the kernel (null space) of "A".

References

External links

* [http://planetmath.org/encyclopedia/Pseudoinverse.html Pseudoinverse on PlanetMath]
*
*
*


Wikimedia Foundation. 2010.

См. также в других словарях:

  • Moore–Penrose pseudoinverse — In mathematics, and in particular linear algebra, a pseudoinverse A+ of a matrix A is a generalization of the inverse matrix.[1] The most widely known type of matrix pseudoinverse is the Moore–Penrose pseudoinverse, which was independently… …   Wikipedia

  • Moore-Penrose pseudoinverse/Proofs — Existence and Uniqueness =Let A be an m by n matrix over K , where K is either the field of real numbers or the field of complex numbers. Then there is a unique n by m matrix A^+ over K such that: #A A^+A = A #A^+A A^+ = A^+ #(AA^+)^* = AA^+… …   Wikipedia

  • Moore-Penrose-Inverse — Die Pseudoinverse einer Matrix ist ein Begriff aus dem mathematischen Teilgebiet lineare Algebra. Sie ist eine Verallgemeinerung der inversen Matrix auf singuläre und nichtquadratische Matrizen, weshalb sie häufig auch als verallgemeinerte… …   Deutsch Wikipedia

  • Pseudoinverse — Die Pseudoinverse einer Matrix ist ein Begriff aus dem mathematischen Teilgebiet lineare Algebra. Sie ist eine Verallgemeinerung der inversen Matrix auf singuläre und nichtquadratische Matrizen, weshalb sie häufig auch als verallgemeinerte… …   Deutsch Wikipedia

  • Pseudoinverse — Pseudo inverse En algèbre linéaire, la notion de pseudo inverse (ou inverse généralisé) généralise celle d’inverse d’une application linéaire ou d’une matrice[1] aux cas non inversibles en lui supprimant certaines des propriétés demandées aux… …   Wikipédia en Français

  • Roger Penrose — Infobox Scientist box width = 300px name = Sir Roger Penrose image size = 200px caption = Roger Penrose at Brookhaven Lab, February 6, 2007 birth date = birth date and age|1931|08|08 birth place = Colchester, Essex, England death date = death… …   Wikipedia

  • E. H. Moore — Eliakim Hastings Moore (* 28. Januar 1862 in Marietta, Ohio; † 30. Dezember 1932 in Chicago, Illinois) war ein US amerikanischer Mathematiker. Moore studierte ab 1879 an der Yale University Mathematik und Astronomie und promovierte 1885 mit der… …   Deutsch Wikipedia

  • Eliakim Hastings Moore — (* 28. Januar 1862 in Marietta, Ohio; † 30. Dezember 1932 in Chicago, Illinois) war ein US amerikanischer Mathematiker. Moore studierte ab 1879 an der Yale University Mathematik und Astronomie und promovierte …   Deutsch Wikipedia

  • Invertible matrix — In linear algebra an n by n (square) matrix A is called invertible (some authors use nonsingular or nondegenerate) if there exists an n by n matrix B such that where In denotes the n by n identity matrix and the multiplication used is ordinary… …   Wikipedia

  • Inverse element — In abstract algebra, the idea of an inverse element generalises the concept of a negation, in relation to addition, and a reciprocal, in relation to multiplication. The intuition is of an element that can undo the effect of combination with… …   Wikipedia


Поделиться ссылкой на выделенное

Прямая ссылка:
Нажмите правой клавишей мыши и выберите «Копировать ссылку»