The Moore–Penrose Pseudoinverse
The Need for a Generalized Inverse
When $A: X \to Y$ is not invertible — because it has a nontrivial null space, or its range is not all of $Y$ — we need a generalized inverse that produces the best possible solution. The Moore–Penrose pseudoinverse $A^\dagger$ provides exactly this: among all least-squares solutions, it selects the one with minimum norm.
In imaging, this is the natural starting point: find the smallest reconstruction consistent with the data. The trouble, as we shall see, is that for ill-posed problems the pseudoinverse is unbounded and therefore useless in the presence of noise. This failure motivates the entire regularization theory of Sections 2.3–2.6.
Definition: The Moore–Penrose Pseudoinverse
Let $A: X \to Y$ be a bounded linear operator between Hilbert spaces. The Moore–Penrose pseudoinverse $A^\dagger$ is the (possibly unbounded) operator defined on $\mathcal{D}(A^\dagger) = \mathcal{R}(A) \oplus \mathcal{R}(A)^\perp$ by

$$A^\dagger y := \tilde{A}^{-1} P y, \qquad \tilde{A} := A\big|_{\mathcal{N}(A)^\perp}, \quad P : Y \to \overline{\mathcal{R}(A)} \text{ the orthogonal projection}.$$
Equivalently, $A^\dagger y$ is the minimum-norm least-squares solution: the element of smallest norm among all minimizers of $\|Ax - y\|$.
For $y \in \mathcal{D}(A^\dagger)$, $x^\dagger = A^\dagger y$ is the unique element in $\mathcal{N}(A)^\perp$ satisfying the normal equation $A^* A x^\dagger = A^* y$.
The four Moore–Penrose conditions characterise $A^\dagger$ uniquely (writing $A^+$ for matrices): (i) $A A^+ A = A$, (ii) $A^+ A A^+ = A^+$, (iii) $(A A^+)^* = A A^+$, (iv) $(A^+ A)^* = A^+ A$.
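As a quick finite-dimensional sanity check, the sketch below verifies the four conditions for a random rank-deficient matrix using NumPy's `np.linalg.pinv` (which computes the pseudoinverse via the SVD); the matrix sizes and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random 6x4 matrix of rank 2, so A is invertible in no ordinary sense.
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
A_pinv = np.linalg.pinv(A)  # computed internally via the SVD

# Verify the four Moore-Penrose conditions (up to floating-point error).
assert np.allclose(A @ A_pinv @ A, A)              # (i)   A A+ A = A
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)    # (ii)  A+ A A+ = A+
assert np.allclose((A @ A_pinv).T, A @ A_pinv)     # (iii) A A+ self-adjoint
assert np.allclose((A_pinv @ A).T, A_pinv @ A)     # (iv)  A+ A self-adjoint
print("All four Moore-Penrose conditions hold.")
```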
Historical Note: Moore, Penrose, and the Generalized Inverse
Eliakim Hastings Moore introduced a generalized inverse for finite matrices in 1920, motivated by problems in projective geometry and the calculus of variations. His work went largely unnoticed for decades.
Roger Penrose independently rediscovered and axiomatised the same concept in 1955, giving the four conditions that now bear both names. Penrose's algebraic characterisation made the pseudoinverse tractable for computation, and with the advent of the SVD algorithm in the 1960s, it became a standard tool in numerical linear algebra and statistics.
In the infinite-dimensional setting relevant to imaging, the unboundedness of $A^\dagger$ was the key observation that drove Tikhonov's regularization theory — making the pseudoinverse both the motivation and the target that regularization approximates.
Theorem: SVD Representation of the Pseudoinverse
Let $A: X \to Y$ be a compact operator with singular system $(\sigma_n, u_n, v_n)_{n \in \mathbb{N}}$. Then for $y \in \mathcal{D}(A^\dagger)$,

$$A^\dagger y = \sum_{n=1}^{\infty} \frac{\langle y, u_n \rangle}{\sigma_n}\, v_n.$$
This series converges if and only if the Picard condition holds:

$$\sum_{n=1}^{\infty} \frac{|\langle y, u_n \rangle|^2}{\sigma_n^2} < \infty.$$
The SVD decomposes the action of $A$ into independent channels: $A$ maps $v_n$ to $\sigma_n u_n$. Inversion requires dividing by $\sigma_n$ in each channel. The Picard condition says the data coefficients $\langle y, u_n \rangle$ must decay faster than $\sigma_n$ for the sum to converge. For exact data from a true solution, this holds; for noisy data, it generically fails.
Forward SVD expansion
Write $x = x_0 + \sum_{n} \langle x, v_n \rangle v_n$ where $x_0 \in \mathcal{N}(A)$. Then

$$Ax = \sum_{n=1}^{\infty} \sigma_n \langle x, v_n \rangle\, u_n.$$
The minimum-norm condition forces $x_0 = 0$ (we choose $x \in \mathcal{N}(A)^\perp$).
Inversion
Setting $Ax = Py$ (the projection of $y$ onto $\overline{\mathcal{R}(A)}$) and taking inner products with $u_n$:

$$\sigma_n \langle x, v_n \rangle = \langle y, u_n \rangle.$$
Hence $\langle x, v_n \rangle = \langle y, u_n \rangle / \sigma_n$. Convergence in $X$ requires $\sum_n \sigma_n^{-2} |\langle y, u_n \rangle|^2 < \infty$, which is the Picard condition.
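In finite dimensions the series is a finite sum over the nonzero singular values, and the representation can be checked directly against `np.linalg.pinv`. A minimal sketch, with dimensions and rank chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 5))  # rank 3
y = rng.standard_normal(8)

# Thin SVD: columns of U are the left singular vectors u_n,
# rows of Vt are the right singular vectors v_n.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-12 * s[0]))  # numerical rank

# A+ y = sum over nonzero sigma_n of <y, u_n> / sigma_n * v_n.
x_dagger = sum((U[:, n] @ y) / s[n] * Vt[n, :] for n in range(r))

assert np.allclose(x_dagger, np.linalg.pinv(A) @ y)
print("SVD series agrees with np.linalg.pinv.")
```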
Definition: The Picard Condition
Let $A: X \to Y$ be a compact operator with singular system $(\sigma_n, u_n, v_n)$. A datum $y \in Y$ satisfies the Picard condition if

$$\sum_{n=1}^{\infty} \frac{|\langle y, u_n \rangle|^2}{\sigma_n^2} < \infty.$$
This is equivalent to $y \in \mathcal{D}(A^\dagger)$: the Fourier coefficients of $y$ with respect to the left singular vectors $u_n$ must decay sufficiently fast relative to the singular values $\sigma_n$.
For exact data $y = Ax^\dagger$, the Picard condition is automatically satisfied: the coefficients are $\langle y, u_n \rangle = \sigma_n \langle x^\dagger, v_n \rangle$ and the ratio gives $\sum_n |\langle x^\dagger, v_n \rangle|^2 \le \|x^\dagger\|^2 < \infty$. For noisy data $y^\delta = y + \eta$, the noise coefficients $\langle \eta, u_n \rangle$ do not decay, so the Picard condition fails and $A^\dagger y^\delta$ diverges.
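The contrast is easy to see numerically by comparing the partial sums $\sum_{n \le N} |\langle y, u_n \rangle|^2 / \sigma_n^2$ for exact and noisy coefficients. A sketch, assuming for illustration $\sigma_n = 1/n$, solution coefficients $n^{-2}$, and noise level $\delta = 10^{-3}$:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200
n = np.arange(1, N + 1)

sigma = 1.0 / n               # singular values of a mildly ill-posed operator
x_coeff = 1.0 / n**2          # coefficients <x, v_n> of the true solution
y_coeff = sigma * x_coeff     # exact data coefficients <y, u_n> = sigma_n <x, v_n>
noise = 1e-3 * rng.standard_normal(N)  # noise coefficients at level delta = 1e-3

picard_exact = np.cumsum((y_coeff / sigma) ** 2)
picard_noisy = np.cumsum(((y_coeff + noise) / sigma) ** 2)

print(f"exact partial sum at N={N}: {picard_exact[-1]:.4f}")
print(f"noisy partial sum at N={N}: {picard_noisy[-1]:.4f}")
```

The exact partial sums converge (here to $\|x^\dagger\|^2 = \sum_n n^{-4} = \pi^4/90 \approx 1.082$), while the noisy ones keep growing, roughly like $\delta^2 N^3/3$.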
Theorem: Unboundedness of the Pseudoinverse for Compact Operators
Let $A: X \to Y$ be a compact operator between infinite-dimensional Hilbert spaces with $\mathcal{R}(A)$ infinite-dimensional. Then $A^\dagger$ is unbounded.
Construct an unbounded sequence
Consider $y_n = u_n$ (the left singular vectors). Then $\|y_n\| = 1$ for all $n$, but

$$\|A^\dagger y_n\| = \frac{\|v_n\|}{\sigma_n} = \frac{1}{\sigma_n} \to \infty$$

as $n \to \infty$ (since $\sigma_n \to 0$ for compact operators on infinite-dimensional spaces).
Conclude
Since $\sup_n \|A^\dagger u_n\| / \|u_n\| = \infty$, the operator $A^\dagger$ is unbounded. It cannot be extended to a bounded operator on all of $Y$.
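The proof translates directly into a numerical experiment: build a matrix with prescribed singular values $\sigma_n = 1/n$ (via random orthogonal factors; all sizes here are illustrative) and apply its pseudoinverse to the unit-norm left singular vectors.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 50

# Matrix with prescribed spectrum sigma_n = 1/n, built from random
# orthogonal factors (QR of Gaussian matrices) so the SVD is exact.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((m, m)))
sigma = 1.0 / np.arange(1, m + 1)
A = U @ np.diag(sigma) @ V.T
A_pinv = np.linalg.pinv(A)

for n in [0, 9, 24, 49]:
    # ||u_n|| = 1, yet ||A+ u_n|| = 1/sigma_n grows without bound.
    out = np.linalg.norm(A_pinv @ U[:, n])
    print(f"n={n + 1:2d}  sigma_n={sigma[n]:.3e}  ||A+ u_n||={out:.3e}")
```

The outputs $\|A^+ u_n\| = 1/\sigma_n = n$ grow linearly with $n$, exactly as the proof predicts; in infinite dimensions there is no largest $n$, so no finite operator norm exists.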
Example: Noise Amplification Through the Pseudoinverse
Consider an integral equation on $[0,1]$ whose operator has singular values $\sigma_n = 1/n$ (mildly ill-posed). The true solution has coefficients $\langle x^\dagger, v_n \rangle = n^{-2}$.
The data $y^\delta = Ax^\dagger + \eta$ are corrupted by noise with i.i.d. coefficients $\langle \eta, u_n \rangle \sim \mathcal{N}(0, \delta^2)$. Compute the expected reconstruction error $\mathbb{E}\,\|A^\dagger y^\delta - x^\dagger\|^2$.
Expand the error
Since $x^\dagger \in \mathcal{N}(A)^\perp$, we have $A^\dagger A x^\dagger = x^\dagger$, so

$$A^\dagger y^\delta - x^\dagger = A^\dagger \eta = \sum_{n=1}^{\infty} \frac{\langle \eta, u_n \rangle}{\sigma_n}\, v_n.$$
Compute the expectation
By Parseval's identity and linearity of expectation,

$$\mathbb{E}\,\|A^\dagger y^\delta - x^\dagger\|^2 = \sum_{n=1}^{\infty} \frac{\mathbb{E}\,|\langle \eta, u_n \rangle|^2}{\sigma_n^2} = \delta^2 \sum_{n=1}^{\infty} n^2 = \infty.$$

The series diverges for every positive noise level: even for $\delta = 10^{-10}$, the reconstruction is meaningless.
Physical interpretation
The noise has equal energy $\delta$ across all singular components, but the pseudoinverse amplifies the $n$-th component by $1/\sigma_n = n$. The resulting error components grow as $\delta n$, overwhelming the signal components, which decay as $n^{-2}$. Regularization — which damps the high-frequency amplification — is essential.
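This divergence is easy to confirm by simulation. The sketch below (noise level, truncation points, and trial count are illustrative) applies the pseudoinverse truncated at $N$ components to pure noise and compares the Monte Carlo error with the theoretical value $\delta^2 \sum_{n \le N} n^2 \approx \delta^2 N^3 / 3$; the signal-truncation bias $\sum_{n > N} n^{-4}$ is negligible here.

```python
import numpy as np

rng = np.random.default_rng(4)
delta = 1e-3    # noise level (illustrative)
trials = 200    # Monte Carlo repetitions

for N in [10, 100, 1000]:
    n = np.arange(1, N + 1)
    sigma = 1.0 / n
    # Noise coefficients <eta, u_n> ~ N(0, delta^2), each amplified by 1/sigma_n = n.
    eta = delta * rng.standard_normal((trials, N))
    err2 = np.mean(np.sum((eta / sigma) ** 2, axis=1))
    theory = delta**2 * N**3 / 3
    print(f"N={N:5d}  E||error||^2 ~ {err2:.3e}   theory ~ {theory:.3e}")
```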
Noise Amplification Through the Pseudoinverse
Demonstrates the catastrophic noise amplification inherent in the pseudoinverse. A 1D signal is mapped through a compact forward operator with singular values $\sigma_n = 1/n$, and white noise of level $\delta$ is added.
Left panel: SVD coefficients of the exact data $\langle y, u_n \rangle$ (blue, decaying) and the noisy data $\langle y^\delta, u_n \rangle$ (red, levelling off at $\delta$). The crossover point where noise dominates signal is clearly visible.
Right panel: the pseudoinverse reconstruction using the first $k$ components. When $k$ is too large, noise-dominated components destroy the reconstruction. This motivates truncated SVD and Tikhonov regularization (Section 2.4).
Increase the noise level $\delta$ to see how the safe truncation index decreases. This trade-off between resolution and stability is the fundamental dilemma of ill-posed problems.
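A static version of this demo can be sketched in a few lines of matplotlib; the operator, signal, and noise level below mirror the example above, and the remaining choices are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
N, delta = 200, 1e-3
n = np.arange(1, N + 1)
sigma, x_coeff = 1.0 / n, 1.0 / n**2

y_exact = sigma * x_coeff                            # <y, u_n> = sigma_n <x, v_n>
y_noisy = y_exact + delta * rng.standard_normal(N)   # noisy data coefficients

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: data coefficients; the noisy ones level off at delta.
ax1.loglog(n, np.abs(y_exact), "b.", label="exact")
ax1.loglog(n, np.abs(y_noisy), "r.", label="noisy")
ax1.axhline(delta, color="gray", ls="--", label="delta")
ax1.set(xlabel="n", title="SVD coefficients of the data")
ax1.legend()

# Right panel: error of the truncated pseudoinverse as a function of k.
ks = np.arange(1, N + 1)
rec_err = [np.linalg.norm(np.where(n <= k, y_noisy / sigma, 0.0) - x_coeff)
           for k in ks]
ax2.loglog(ks, rec_err)
ax2.set(xlabel="truncation index k", title="Truncated SVD reconstruction error")

plt.tight_layout()
plt.show()
```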
Common Mistake: The Pseudoinverse Is Not a Reconstruction Method
Mistake:
Computing $A^\dagger y^\delta$ (the pseudoinverse applied to noisy data) as the reconstruction of an ill-conditioned imaging problem, expecting it to give a reasonable image.
Correction:
For ill-conditioned systems, $A^\dagger$ catastrophically amplifies noise: the $n$-th SVD component of the result is $\langle y^\delta, u_n \rangle / \sigma_n$, which grows without bound as $\sigma_n \to 0$. The pseudoinverse is a mathematical concept (the minimum-norm least-squares solution), not a practical algorithm for noisy data. Always apply regularization.
Picard Condition
A datum $y$ satisfies the Picard condition for the operator $A$ if $\sum_n |\langle y, u_n \rangle|^2 / \sigma_n^2 < \infty$, which is necessary and sufficient for $A^\dagger y$ to be well-defined.
Related: Well-Posed Problem, Degree of Ill-Posedness
Key Takeaway
The Moore–Penrose pseudoinverse gives the minimum-norm least-squares solution via $A^\dagger y = \sum_n \sigma_n^{-1} \langle y, u_n \rangle v_n$. For compact operators with infinite-dimensional range, $A^\dagger$ is always unbounded — it cannot be used directly with noisy data. The Picard condition determines when $A^\dagger y$ is well-defined: the data coefficients $\langle y, u_n \rangle$ must decay faster than $\sigma_n$. Noise violates this condition, driving the reconstruction error to infinity. This is the mathematical justification for regularization theory.