OAMP — Orthogonal Approximate Message Passing
Why OAMP?
In Chapter 20 we saw that AMP is beautiful when it works — it tracks a one-dimensional state-evolution recursion and achieves the Bayes-optimal MSE for i.i.d. Gaussian sensing matrices. But the moment we step outside the i.i.d. regime — to partial Fourier matrices, to Kronecker-structured operators from MIMO and imaging, to any matrix with a non-flat spectrum — AMP diverges, sometimes dramatically. Damping helps but does not cure.
The diagnosis is simple. AMP relies on the Onsager correction to decorrelate the residual from the current estimate, so that the pseudo-observation really does look like signal-plus-white-Gaussian-noise. When $\mathbf{A}$ has non-i.i.d. entries, the Onsager correction no longer does this job, and the denoiser is fed a residual with structured correlations.
OAMP (Orthogonal AMP) fixes this by replacing the matched filter with a divergence-free linear estimator, most naturally the LMMSE filter. The orthogonality between the linear estimation error and the denoising error is enforced directly, rather than approximately. This widens the class of matrices for which the state-evolution analysis is correct — specifically, OAMP is provably correct for the right-rotationally-invariant class, a much larger family than i.i.d. Gaussian.
Definition: Right-Rotationally-Invariant Matrix Ensemble
A random matrix $\mathbf{A} \in \mathbb{C}^{M \times N}$ is called right-rotationally-invariant (RRI) if its distribution is unchanged by right-multiplication by any deterministic unitary matrix $\mathbf{Q}$:
$$\mathbf{A}\mathbf{Q} \overset{d}{=} \mathbf{A} \quad \text{for every unitary } \mathbf{Q} \in \mathbb{C}^{N \times N}.$$
Equivalently, writing the SVD as $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\mathsf{H}}$, the right singular basis $\mathbf{V}$ is uniformly distributed on the unitary group, independently of the singular values $\boldsymbol{\Sigma}$ and of $\mathbf{U}$.
This class strictly contains the i.i.d. Gaussian ensemble (for which both $\mathbf{U}$ and $\mathbf{V}$ are Haar and the singular values follow Marchenko-Pastur), but it also contains orthogonal / partial DFT matrices, randomly subsampled unitary matrices, and many designed sensing operators that arise in MIMO and imaging.
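To make the definition concrete, here is one way to sample an RRI matrix with a prescribed spectrum. This is only a sketch; the helper names `haar_unitary` and `rri_matrix` are ours, and the Haar construction follows the standard QR-with-phase-fix recipe.

```python
import numpy as np

def haar_unitary(n, rng):
    # QR of an i.i.d. complex Gaussian matrix, with the phases of the
    # R diagonal fixed, yields a Haar-distributed unitary matrix
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def rri_matrix(m, n, singular_values, rng):
    # A = U Sigma V^H with Haar-distributed V: RRI by construction.
    # U can be any unitary; the i.i.d. Gaussian special case would
    # additionally draw the singular values from Marchenko-Pastur.
    k = min(m, n)
    u = haar_unitary(m, rng)[:, :k]
    v = haar_unitary(n, rng)[:, :k]
    return u @ np.diag(singular_values) @ v.conj().T

rng = np.random.default_rng(0)
sv = np.array([2.0, 1.5, 1.0, 0.5])   # a deliberately non-flat spectrum
A = rri_matrix(4, 8, sv, rng)
```

Any fixed, non-flat choice of `singular_values` gives an ensemble outside the i.i.d. Gaussian class for which OAMP's guarantees still apply.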
Definition: Divergence-Free Linear Estimator
Given a pseudo-observation $\mathbf{r}_t = \mathbf{x} + \mathbf{h}_t$, $\mathbf{h}_t \sim \mathcal{N}(\mathbf{0}, \tau_t^2 \mathbf{I})$, and a linear operator $\mathbf{W}_t$ mapping residuals back to the signal domain, the map $\mathbf{r} \mapsto \mathbf{r} + \mathbf{W}_t(\mathbf{y} - \mathbf{A}\mathbf{r})$ is divergence-free with respect to $\mathbf{h}_t$ when
$$\frac{1}{N}\,\mathrm{tr}\!\left(\mathbf{I} - \mathbf{W}_t\mathbf{A}\right) = 0, \qquad \text{equivalently} \qquad \mathbf{W}_t = \frac{N}{\mathrm{tr}(\hat{\mathbf{W}}_t\mathbf{A})}\,\hat{\mathbf{W}}_t \text{ for a base filter } \hat{\mathbf{W}}_t.$$
The first form gives a zero-mean residual correction; the second is the common unbiased normalization used in OAMP. In both cases the role is the same: the output of the linear step is uncorrelated with the input error, so the effective noise passed to the denoiser is truly fresh.
"Divergence-free" is language borrowed from the state-evolution derivation — it is the condition that cancels the Onsager term that plain AMP inserts by hand.
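The trace-normalization route to a divergence-free filter is easy to check numerically. The sketch below assumes an LMMSE-style base filter; all variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, v2, sigma2 = 30, 60, 1.0, 0.1
A = rng.standard_normal((M, N)) / np.sqrt(N)

# un-normalized LMMSE base filter (illustrative choice)
W_hat = v2 * A.T @ np.linalg.inv(v2 * A @ A.T + sigma2 * np.eye(M))

# trace normalization: scale so that (1/N) tr(W A) = 1
W = (N / np.trace(W_hat @ A)) * W_hat

# divergence of r -> r + W(y - A r) is (1/N) tr(I - W A);
# divergence-free means it vanishes
div = np.trace(np.eye(N) - W @ A) / N
print(abs(div))  # ~ 0 up to floating-point error
```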
Definition: OAMP Iteration
Given the linear model $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$ with $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})$, the OAMP algorithm iterates, for $t = 0, 1, 2, \ldots$:
$$\mathbf{s}_t = \mathbf{r}_t + \mathbf{W}_t(\mathbf{y} - \mathbf{A}\mathbf{r}_t), \qquad \mathbf{r}_{t+1} = C_t\left(\eta_t(\mathbf{s}_t) - \langle \eta_t'(\mathbf{s}_t)\rangle\,\mathbf{s}_t\right),$$
where $\mathbf{W}_t$ is the divergence-free LMMSE filter, $\eta_t$ is a componentwise denoiser matched to the signal prior, $\langle \eta_t'(\mathbf{s}_t)\rangle = \frac{1}{N}\sum_{i=1}^{N}\eta_t'(s_{t,i})$ is the average divergence, and $C_t = \left(1 - \langle \eta_t'(\mathbf{s}_t)\rangle\right)^{-1}$ is a normalization constant that restores unit gain on $\mathbf{x}$.
The subtraction of $\langle \eta_t'(\mathbf{s}_t)\rangle\,\mathbf{s}_t$ from $\eta_t(\mathbf{s}_t)$ is the divergence-free correction on the denoiser side. Together with the divergence-free $\mathbf{W}_t$ on the linear side, it enforces orthogonality between the two error components — hence "Orthogonal" AMP.
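The iteration can be sketched end to end in a few lines. This is a minimal illustration, not a tuned implementation: we assume a soft-threshold denoiser standing in for a prior-matched $\eta_t$, a heuristic threshold of $1.4\sqrt{\tau_t^2}$, and a Stein-type extrinsic variance update; all names and these choices are ours.

```python
import numpy as np

def soft(s, thr):
    # componentwise soft-threshold denoiser (stand-in for a prior-matched eta_t)
    return np.sign(s) * np.maximum(np.abs(s) - thr, 0.0)

def oamp(y, A, sigma2, n_iter=15):
    M, N = A.shape
    r, v2 = np.zeros(N), 1.0   # pseudo-observation and its error variance
    for _ in range(n_iter):
        # --- linear step: trace-normalized (divergence-free) LMMSE filter ---
        W_hat = v2 * A.T @ np.linalg.inv(v2 * A @ A.T + sigma2 * np.eye(M))
        W = (N / np.trace(W_hat @ A)) * W_hat
        s = r + W @ (y - A @ r)
        # effective noise level of s, via the trace formulas of state evolution
        C = np.eye(N) - W @ A
        tau2 = v2 * np.trace(C @ C.T) / N + sigma2 * np.trace(W @ W.T) / N
        # --- nonlinear step: divergence-free denoiser ---
        eta = soft(s, 1.4 * np.sqrt(tau2))
        d = np.clip(np.mean(np.abs(eta) > 0), 1e-6, 0.999)  # average divergence
        r = (eta - d * s) / (1.0 - d)                        # C_t = 1/(1 - <eta'>)
        v2 = max(tau2 * d / (1.0 - d), 1e-9)  # Stein-type extrinsic variance
    return r

# demo: sparse spikes recovered from Gaussian measurements
rng = np.random.default_rng(7)
M, N, k, sigma2 = 128, 256, 16, 1e-4
x = np.zeros(N)
x[rng.choice(N, k, replace=False)] = 3.0 * rng.choice([-1, 1], k)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ x + np.sqrt(sigma2) * rng.standard_normal(M)
x_hat = oamp(y, A, sigma2)
```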
Theorem: OAMP State Evolution for RRI Matrices
Assume $\mathbf{A}$ is right-rotationally-invariant with asymptotic eigenvalue spectrum $\rho$ of $\mathbf{A}^{\mathsf{H}}\mathbf{A}$, and $\mathbf{x}$ has i.i.d. entries with prior $p_X$. Let $\hat{\mathbf{W}}_t$ be the LMMSE filter
$$\hat{\mathbf{W}}_t = v_t^2\,\mathbf{A}^{\mathsf{H}}\left(v_t^2\,\mathbf{A}\mathbf{A}^{\mathsf{H}} + \sigma^2\mathbf{I}\right)^{-1},$$
and let $v_t^2$ be the MSE at iteration $t$. Then, in the large-system limit $M, N \to \infty$ with $\delta = M/N$ fixed, the pseudo-observation satisfies
$$\mathbf{s}_t \overset{d}{=} \mathbf{x} + \tau_t\,\mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),$$
with $\mathbf{z}$ independent of $\mathbf{x}$, and the scalar state evolves according to
$$\tau_t^2 = \mathcal{F}(v_t^2, \sigma^2), \qquad v_{t+1}^2 = \mathcal{E}(\tau_t^2),$$
where $\mathcal{F}$ is the linear-step transfer function (determined by the spectrum of $\mathbf{A}$) and $\mathcal{E}(\tau^2) = \mathbb{E}\left[\left(\eta_t(X + \tau Z) - X\right)^2\right]$ is the MSE of the denoiser at noise level $\tau$.
The theorem says that OAMP's one-dimensional state-evolution description — the same kind of recursion that makes AMP tractable — survives for the entire right-rotationally-invariant class. You trade a matrix-vector product with $\mathbf{A}^{\mathsf{H}}$ for an LMMSE-style solve, and in return you get a much wider regime of validity.
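Once the spectrum is known, the scalar recursion can be run numerically without ever simulating the high-dimensional algorithm. The sketch below assumes the trace-normalized LMMSE filter of the theorem, a Bernoulli-Gaussian prior, and a soft-threshold denoiser estimated by Monte Carlo; these modelling choices and all function names are ours.

```python
import numpy as np

def tau2_from_spectrum(v2, sigma2, lam):
    """Linear-step transfer F: effective noise after the trace-normalized
    LMMSE step, given the spectrum `lam` of A^H A (length N, zeros included
    for the co-range)."""
    g = np.where(lam > 0, v2 * lam / (v2 * lam + sigma2), 0.0)  # per-mode gain
    alpha = lam.size / g.sum()                                  # trace normalization
    bias = np.mean((1.0 - alpha * g) ** 2)                      # residual signal error
    noise = np.sum(np.where(lam > 0,
                            (alpha * g) ** 2 / np.maximum(lam, 1e-30),
                            0.0)) / lam.size
    return v2 * bias + sigma2 * noise

def denoiser_mse(tau2, rho=0.1, n=200000, rng=None):
    """E: Monte-Carlo MSE of a soft-threshold denoiser on a Bernoulli-Gaussian
    signal at noise level tau (illustrative choices of prior and denoiser)."""
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(n) * (rng.random(n) < rho)
    s = x + np.sqrt(tau2) * rng.standard_normal(n)
    thr = 1.4 * np.sqrt(tau2)
    eta = np.sign(s) * np.maximum(np.abs(s) - thr, 0.0)
    return np.mean((eta - x) ** 2)

# run the scalar recursion for a partial-orthogonal spectrum: lam = 1 on M modes
M, N, sigma2 = 128, 256, 1e-3
lam = np.concatenate([np.ones(M), np.zeros(N - M)])
v2 = 0.1
for t in range(10):
    t2 = tau2_from_spectrum(v2, sigma2, lam)
    v2 = denoiser_mse(t2)
```

For the single-mass spectrum used here, `tau2_from_spectrum` reproduces the closed form of the partial-DFT example later in this section.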
Set up the error decomposition
Write $\mathbf{s}_t - \mathbf{x} = (\mathbf{I} - \mathbf{W}_t\mathbf{A})(\mathbf{r}_t - \mathbf{x}) + \mathbf{W}_t\mathbf{n}$. With the normalization $\frac{1}{N}\mathrm{tr}(\mathbf{I} - \mathbf{W}_t\mathbf{A}) = 0$, the first term has mean zero conditional on the previous iterate. The second term is a linear transformation of Gaussian noise.
Exploit rotational invariance of $\mathbf{V}$
Because $\mathbf{V}$ is Haar-distributed, conditioning on the iteration history leaves the residual basis uniformly distributed on the subspace orthogonal to the history. This is the conditional uniformity principle — the same tool used in the AMP state-evolution proof, extended to the RRI class.
Compute the asymptotic variance
Using the SVD $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\mathsf{H}}$ and the Haar property of $\mathbf{V}$, the conditional variance of $\mathbf{s}_t - \mathbf{x}$ reduces to a spectral integral:
$$\tau_t^2 = \frac{v_t^2}{N}\,\mathrm{tr}\!\left[(\mathbf{I} - \mathbf{W}_t\mathbf{A})(\mathbf{I} - \mathbf{W}_t\mathbf{A})^{\mathsf{H}}\right] + \frac{\sigma^2}{N}\,\mathrm{tr}\!\left[\mathbf{W}_t\mathbf{W}_t^{\mathsf{H}}\right],$$
where the right-hand side depends only on the spectrum $\rho$, not on the realization. This is the function $\mathcal{F}(v_t^2, \sigma^2)$.
Close the recursion with the denoiser MSE
For any Lipschitz componentwise denoiser $\eta_t$, the law of large numbers gives $\frac{1}{N}\|\eta_t(\mathbf{s}_t) - \mathbf{x}\|^2 \to \mathbb{E}\left[(\eta_t(X + \tau_t Z) - X)^2\right]$, which is the scalar function $\mathcal{E}(\tau_t^2)$. The divergence-free correction on $\eta_t$ guarantees that $\mathbf{r}_{t+1} - \mathbf{x}$ is orthogonal to the next linear-step error, closing the loop.
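The orthogonality claimed in this step is a Stein-lemma identity and can be checked by simulation. The sketch below assumes a Bernoulli-Gaussian signal and a soft-threshold denoiser (our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
N, tau = 200000, 0.5
x = rng.standard_normal(N) * (rng.random(N) < 0.1)   # Bernoulli-Gaussian signal
h = tau * rng.standard_normal(N)                      # fresh Gaussian noise
s = x + h                                             # pseudo-observation

thr = 1.4 * tau
eta = np.sign(s) * np.maximum(np.abs(s) - thr, 0.0)   # soft-threshold denoiser
d = np.mean(np.abs(s) > thr)                          # average divergence <eta'>
r_next = (eta - d * s) / (1.0 - d)                    # divergence-free output

# orthogonality: the output error is (nearly) uncorrelated with the input noise
corr = np.mean((r_next - x) * h) / (tau ** 2)
print(corr)  # ~ 0
```

Dropping the $-d\,\mathbf{s}$ correction and rerunning makes `corr` jump to roughly $d$, which is exactly the correlation that would poison the next linear step.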
Key Takeaway
OAMP replaces AMP's matched filter with an LMMSE filter and enforces orthogonality between the linear-step error and the denoising error. Orthogonality is the statistical condition that keeps the scalar state-evolution recursion correct, so OAMP generalizes cleanly beyond the i.i.d. Gaussian class to right-rotationally-invariant matrices.
Computational Cost of the LMMSE Step
The LMMSE filter requires solving a linear system of the form $\left(v_t^2\,\mathbf{A}\mathbf{A}^{\mathsf{H}} + \sigma^2\mathbf{I}\right)\mathbf{u} = \mathbf{y} - \mathbf{A}\mathbf{r}_t$. In general this costs $O(M^3)$ per iteration, which is the main practical barrier to deploying OAMP. For structured operators (Kronecker, partial DFT, subsampled unitary), the cost drops dramatically — we treat the Kronecker case in section 21.3.
In imaging pipelines where $N$ can be tens of thousands, the linear step typically dominates the runtime. Practical implementations use Woodbury identities, diagonalization in the SVD basis, or conjugate-gradient (CG) solves with a few inner iterations warm-started from the previous outer step.
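The CG route needs only matrix-vector products with $\mathbf{A}$ and $\mathbf{A}^{\mathsf{H}}$, never the explicit system matrix. A minimal sketch, with our own helper names and a warm start passed through `x0`:

```python
import numpy as np

def cg_solve(matvec, b, x0, n_iter=100, tol=1e-8):
    # plain conjugate gradient on a symmetric positive-definite operator,
    # warm-started from x0 (e.g. the solution of the previous OAMP iteration)
    x = x0.copy()
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        Ap = matvec(p)
        a = rs / (p @ Ap)
        x += a * p
        r -= a * Ap
        rs_new = r @ r
        if rs_new < tol ** 2:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# LMMSE system (v^2 A A^H + sigma^2 I) u = y - A r, solved matrix-free
rng = np.random.default_rng(3)
M, N, v2, sigma2 = 64, 128, 0.5, 0.01
A = rng.standard_normal((M, N)) / np.sqrt(N)
rhs = rng.standard_normal(M)
mv = lambda u: v2 * (A @ (A.T @ u)) + sigma2 * u
u = cg_solve(mv, rhs, np.zeros(M))
```

In a full OAMP loop, `x0` would be the previous outer iteration's solve, so a handful of inner CG steps usually suffices.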
Example: OAMP for a Partial Orthogonal Sensing Matrix
Consider $\mathbf{A} = \mathbf{S}\mathbf{F}$, where $\mathbf{F}$ is an $N \times N$ unitary DFT matrix and $\mathbf{S}$ selects $M$ rows at random. This is a classic compressed-sensing operator. Compute the effective noise variance $\tau_t^2$ after one OAMP iteration when $v_t^2$ is the current MSE and $\sigma^2$ is the measurement noise variance.
Identify the spectrum
For the partial-DFT operator, $\mathbf{A}\mathbf{A}^{\mathsf{H}} = \mathbf{I}_M$ after normalization, so all non-zero singular values equal $1$. The spectrum of $\mathbf{A}^{\mathsf{H}}\mathbf{A}$ is a single mass point at $\lambda = 1$, with a zero eigenvalue of multiplicity $N - M$ on the co-range.
Apply the LMMSE filter
The LMMSE filter becomes $\hat{\mathbf{W}}_t = v_t^2\,\mathbf{A}^{\mathsf{H}}\left(v_t^2\,\mathbf{A}\mathbf{A}^{\mathsf{H}} + \sigma^2\mathbf{I}\right)^{-1}$. With the single-mass spectrum, this is simply a rescaled $\mathbf{A}^{\mathsf{H}}$ with scalar factor $\frac{v_t^2}{v_t^2 + \sigma^2}$.
Compute $\tau_t^2$
After normalization to $\frac{1}{N}\mathrm{tr}(\mathbf{W}_t\mathbf{A}) = 1$, i.e. $\mathbf{W}_t = \frac{N}{M}\mathbf{A}^{\mathsf{H}}$, the effective noise variance works out to
$$\tau_t^2 = \frac{1 - \delta}{\delta}\,v_t^2 + \frac{\sigma^2}{\delta},$$
where $\delta = M/N$. The first term is the aliasing variance from the unmeasured directions (pure signal uncertainty), and the second is the amplified measurement noise. This clean additive decomposition is the hallmark of OAMP state evolution.
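The decomposition can be verified empirically on a single random error realization. A sketch, with all variable names ours:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 1024, 512
delta, v2, sigma2 = M / N, 0.05, 1e-3

# partial DFT: pick M rows of the unitary DFT matrix at random
F = np.fft.fft(np.eye(N)) / np.sqrt(N)
rows = rng.choice(N, size=M, replace=False)
A = F[rows, :]

# for this operator the trace-normalized filter reduces to (N/M) A^H
W = (N / M) * A.conj().T

# one linear step applied to a synthetic error realization
x_err = np.sqrt(v2) * rng.standard_normal(N)      # r - x, variance v2
noise = np.sqrt(sigma2) * rng.standard_normal(M)  # measurement noise
e = x_err - W @ (A @ x_err) + W @ noise           # s - x
tau2_emp = np.mean(np.abs(e) ** 2)

tau2_pred = (1 - delta) / delta * v2 + sigma2 / delta
print(tau2_emp, tau2_pred)   # close, up to finite-N fluctuation
```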
OAMP vs AMP on Structured Matrices
Compare the MSE trajectories of AMP (matched-filter linear step) and OAMP (LMMSE linear step) on three sensing ensembles: i.i.d. Gaussian, partial DFT, and column-correlated Gaussian. AMP tracks state evolution only for i.i.d. Gaussian; OAMP tracks it for all three.
Parameters: the undersampling ratio $\delta = M/N$ (ratio of measurements to signal dimension) and the sparsity level (fraction of non-zero entries in $\mathbf{x}$).
OAMP with Bayes-Optimal Denoiser
Complexity: Dominated by the LMMSE solve: $O(M^3)$ per iteration in general, $O(N \log N)$ for partial-DFT sensing, or $O(N(N_1 + N_2))$ per product for separable Kronecker sensing with $\mathbf{A} = \mathbf{A}_1 \otimes \mathbf{A}_2$, $N = N_1 N_2$ (section 21.3). The denominator $1 - \langle \eta_t'(\mathbf{s}_t) \rangle$ in $C_t$ is exactly the "Onsager-free" rescaling that makes the denoised signal orthogonal to its input pseudo-observation.
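The Kronecker saving comes from the identity $(\mathbf{A}_1 \otimes \mathbf{A}_2)\,\mathrm{vec}(\mathbf{X}) = \mathrm{vec}(\mathbf{A}_1 \mathbf{X} \mathbf{A}_2^{\mathsf{T}})$ (stated here in row-major vec convention), which avoids ever materializing the product. A quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(5)
N1, N2 = 24, 32
A1 = rng.standard_normal((N1, N1))
A2 = rng.standard_normal((N2, N2))
x = rng.standard_normal(N1 * N2)

# naive: materialize the Kronecker product -- O((N1 N2)^2) memory and time
y_naive = np.kron(A1, A2) @ x

# structured: with row-major reshape X[j1, j2] = x[j1*N2 + j2],
# (A1 kron A2) x equals vec(A1 @ X @ A2.T), costing O(N1 N2 (N1 + N2))
X = x.reshape(N1, N2)
y_fast = (A1 @ X @ A2.T).reshape(-1)

print(np.allclose(y_naive, y_fast))  # True
```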
Common Mistake: Forgetting the Divergence-Free Normalization
Mistake:
Implementing the LMMSE step as $\mathbf{s}_t = \mathbf{r}_t + \hat{\mathbf{W}}_t(\mathbf{y} - \mathbf{A}\mathbf{r}_t)$ without the scalar $\frac{N}{\mathrm{tr}(\hat{\mathbf{W}}_t\mathbf{A})}$, or forgetting the analogous $\frac{1}{1 - \langle \eta_t' \rangle}$ scaling on the denoiser side.
Correction:
Both normalizations are essential. Without the trace normalization on the linear side, the signal passes through the filter with a gain $\frac{1}{N}\mathrm{tr}(\hat{\mathbf{W}}_t\mathbf{A}) < 1$, so the denoiser is matched to the wrong effective prior. Without the scaling on the denoiser side, the orthogonality condition fails and state evolution no longer tracks the empirical dynamics. In practice both missing factors manifest as a state-evolution curve that diverges from the simulated MSE.
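The missing-gain failure mode is easy to see directly: the average gain the un-normalized LMMSE filter applies to the signal is strictly below one, and the trace normalization restores it exactly. A small check (names ours):

```python
import numpy as np

rng = np.random.default_rng(6)
M, N, v2, sigma2 = 60, 120, 1.0, 0.1
A = rng.standard_normal((M, N)) / np.sqrt(N)
W_hat = v2 * A.T @ np.linalg.inv(v2 * A @ A.T + sigma2 * np.eye(M))

# average gain that W_hat A applies to the signal component: (1/N) tr(W_hat A)
gain_raw = np.trace(W_hat @ A) / N      # strictly below 1: LMMSE shrinkage
W = (N / np.trace(W_hat @ A)) * W_hat   # trace normalization
gain_fixed = np.trace(W @ A) / N        # exactly 1 by construction
print(gain_raw, gain_fixed)
```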
OAMP as AMP with a Better Linear Step
Comparing AMP and OAMP side by side makes the difference crisp. AMP uses $\mathbf{W}_t \propto \mathbf{A}^{\mathsf{H}}$ (the matched filter, or "adjoint"), and pays for its bias with the Onsager term. OAMP uses the LMMSE filter $\hat{\mathbf{W}}_t$, which is already approximately unbiased, and the residual bias is removed by the divergence-free normalization.
When $\mathbf{A}$ is i.i.d. Gaussian, the two filters are essentially equivalent in the large-system limit, which is why AMP works in that regime. When $\mathbf{A}$ has a non-flat spectrum, the matched filter introduces a coloring that AMP cannot undo, while OAMP's LMMSE filter whitens the residual. This is the full story.
Historical Note: From Turbo Equalization to OAMP
The divergence-free-plus-denoiser architecture has deep roots in turbo equalization (Douillard, Jézéquel, Berrou, 1995) and in the concept of extrinsic information from iterative decoding: the output of each module must be statistically independent of its input, otherwise beliefs are reinforced rather than refreshed.
Junjie Ma and Li Ping's 2017 paper Orthogonal AMP carried this principle over to compressed sensing. Independently, Schniter, Rangan, and Fletcher arrived at essentially the same algorithm by a different route — expectation consistency — and called it VAMP (section 21.2). The two derivations are now understood as complementary views of a single algorithm. The name "OAMP" emphasizes orthogonality; "VAMP" emphasizes the message-passing factor structure. Both are correct and both are used.
OAMP (Orthogonal AMP)
A modification of AMP that replaces the matched-filter linear step with an LMMSE filter and enforces the divergence-free condition on both the linear estimator and the denoiser. OAMP converges and admits state evolution for the right-rotationally-invariant class of sensing matrices.
Related: Divergence-free estimator, Right-rotationally-invariant matrix
Divergence-free estimator
A linear or nonlinear estimator whose derivative (divergence) with respect to its input vanishes on average, ensuring that the output error is orthogonal to the input noise. OAMP enforces this by trace-normalizing the LMMSE filter and subtracting $\langle \eta_t'(\mathbf{s}_t)\rangle\,\mathbf{s}_t$ from the denoiser output.
Related: OAMP (Orthogonal AMP)
Right-rotationally-invariant matrix
A random matrix whose distribution is invariant under right-multiplication by any unitary matrix. Equivalently, its right singular basis is Haar-distributed. This class includes i.i.d. Gaussian matrices, random partial unitary / partial DFT matrices, and many other structured ensembles relevant to imaging.
Related: OAMP (Orthogonal AMP)
Quick Check
Which of the following statements about right-rotationally-invariant (RRI) sensing matrices is TRUE?
All i.i.d. Gaussian matrices are RRI, and the converse is also true.
RRI means the left singular basis is Haar-distributed.
Partial DFT matrices are RRI because their right singular basis is uniformly random on the unitary group.
If $\mathbf{A}$ is RRI, then $\mathbf{A}^{\mathsf{H}}$ is also RRI.
Yes — the random row selection makes the right singular basis $\mathbf{V}$ effectively Haar-distributed on the subspace of interest, which is the defining property of the RRI class.
Why This Matters: OAMP for RF Imaging
Imaging pipelines in the CommIT group build sensing matrices from physical propagation: Kronecker products of steering-vector dictionaries, subsampled DFT rows from OFDM pilots, and near-field operators with slowly decaying singular-value spectra. None of these are i.i.d. Gaussian. Running plain AMP on such operators is numerically unstable, regardless of how many damping heuristics are stacked on top.
OAMP is the natural algorithmic home for these problems: it tolerates the non-flat spectrum, it admits principled state-evolution analysis (so we can predict reconstruction quality from the operator's singular values alone), and its LMMSE step can be computed efficiently by exploiting the Kronecker structure. This connection is developed in section 21.3 and reappears in Book 2 Chapter 27 where unrolled OAMP becomes the backbone of the RF imaging network.
See full treatment in Chapter 27, Section sec-lista-imaging