Phase Transitions in Sparse Recovery

Sharp Thresholds in Sparse Recovery

In Chapter 14 we proved that the LASSO recovers a sparse signal $\mathbf{x}^\star$ from $\mathbf{y} = \mathbf{A}\mathbf{x}^\star + \mathbf{w}$ provided the sensing matrix satisfies the restricted isometry property (RIP) and the number of measurements satisfies $M \gtrsim s \log(N/s)$.

That bound is sufficient, and it captures the correct scaling. But for a system designer, the multiplicative constants are what matter: does one need $M = 4s\log(N/s)$ or $M = 20s\log(N/s)$? The point of this section is that for Gaussian $\mathbf{A}$, the gap between success and failure is not gradual: it is a phase transition with a sharp threshold that can be computed exactly. The Donoho-Tanner curve tells the designer whether their $(M, N, s)$ triple lies on the good side or the bad side of the cliff.

Definition: Phase Transition Regime

Consider the noiseless sparse recovery problem
$$\hat{\mathbf{x}} = \arg\min \|\mathbf{x}\|_1 \quad \text{s.t.} \quad \mathbf{A}\mathbf{x} = \mathbf{y},$$
where $\mathbf{A} \in \mathbb{R}^{M \times N}$ has i.i.d. $\mathcal{N}(0, 1/M)$ entries and $\mathbf{x}^\star$ is $s$-sparse. Define the undersampling ratio $\delta = M/N \in (0, 1]$ and the sparsity ratio $\rho = s/M$. The phase transition regime is the double asymptotic $N \to \infty$ with $(\delta, \rho)$ fixed.

$\delta$ measures how aggressively we undersample relative to the ambient dimension. $\rho$ measures how many of the measurements are "spent" per nonzero entry. The regime $\delta \to 0$ with $\rho$ bounded is the high-compression limit.
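Mapping a concrete design to these coordinates is a one-liner; the sketch below does it for an illustrative triple (the specific numbers are examples, not from any particular system):

```python
# Map a concrete design (M, N, s) to phase-diagram coordinates (delta, rho).
def phase_coordinates(M: int, N: int, s: int) -> tuple[float, float]:
    """Return (undersampling ratio delta = M/N, sparsity ratio rho = s/M)."""
    return M / N, s / M

# Illustrative design: 1,000 measurements of a 10,000-dim signal, 200 nonzeros.
delta, rho = phase_coordinates(M=1000, N=10_000, s=200)
print(delta, rho)  # 0.1 0.2
```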

Theorem: Donoho-Tanner Phase Transition

For $\mathbf{A}$ with i.i.d. Gaussian entries, there exists a deterministic curve $\rho^*(\delta): (0, 1] \to (0, 1]$ such that in the double limit $N \to \infty$:
$$\Pr(\hat{\mathbf{x}} = \mathbf{x}^\star) \to \begin{cases} 1 & \rho < \rho^*(\delta) \\ 0 & \rho > \rho^*(\delta) \end{cases}$$
The curve $\rho^*(\delta)$ is characterized implicitly by the solution of a geometric face-counting problem involving random convex polytopes (Donoho, 2005). Specifically, $\rho^*(\delta)$ marks the threshold at which the fraction of $(s-1)$-faces of the cross-polytope $\mathbb{B}_1^N$ that survive random projection to $\mathbb{R}^M$ drops from one.

The transition is sharp: for any $\epsilon > 0$, the probability of recovery jumps from below $\epsilon$ to above $1 - \epsilon$ as $\rho$ crosses $\rho^*(\delta)$ in a band of width $O(1/\sqrt{N})$. Below the curve, $\ell_1$ minimization recovers $\mathbf{x}^\star$ exactly with probability one in the limit; above it, recovery fails. The curve captures the best possible tradeoff between measurements and sparsity for any convex-relaxation recovery method over Gaussian matrices.
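The success side of the theorem is easy to check numerically: the $\ell_1$ program is a linear program under the standard split $\mathbf{x} = \mathbf{u} - \mathbf{v}$ with $\mathbf{u}, \mathbf{v} \geq 0$. A minimal sketch using `scipy.optimize.linprog` (the problem sizes and seed here are arbitrary choices, deep in the success region):

```python
import numpy as np
from scipy.optimize import linprog

# Solve min ||x||_1 s.t. Ax = y as an LP with x = u - v, u, v >= 0.
rng = np.random.default_rng(0)
N, M, s = 60, 30, 3                    # delta = 0.5, rho = 0.1: well below the curve
A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))
x_star = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
x_star[support] = rng.normal(size=s)
y = A @ x_star

c = np.ones(2 * N)                     # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])              # A(u - v) = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]
print(np.max(np.abs(x_hat - x_star)))  # small when recovery is exact, as is typical here
```

Repeating this over a grid of $(\delta, \rho)$ values and seeds produces the empirical phase diagram discussed below.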


The Donoho-Tanner Phase Transition Curve

Phase transition curve $\rho^*(\delta)$ for the noiseless sparse recovery problem with Gaussian $\mathbf{A}$. Below the curve, $\ell_1$ minimization succeeds almost surely; above, it fails almost surely. The band around the curve is $O(N^{-1/2})$ wide.

Donoho-Tanner Curve and Empirical Phase Diagram

Interactive figure: the theoretical curve $\rho^*(\delta)$ overlaid with simulated success probability for finite $N$; the empirical transition sharpens as $N$ grows.

Example: Sizing a Sparse Recovery System

You need to reconstruct a signal with $N = 10{,}000$ coefficients of which $s = 200$ are nonzero. How many Gaussian measurements $M$ does the Donoho-Tanner theory predict you need, assuming a weak-threshold operating point?
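One way to get a back-of-envelope answer is to stand in the small-$\delta$ asymptotic $\rho^*(\delta) \approx 1/(2\ln(1/\delta))$ for the exact curve (that substitution is the assumption here, and it is slightly optimistic at moderate $\delta$). At the threshold $s/M = \rho^*(M/N)$ this gives $M = 2s\ln(N/M)$, which a damped fixed-point iteration solves:

```python
import math

# Back-of-envelope sizing. Assumption: the small-delta asymptotic
# rho*(delta) ~ 1/(2 ln(1/delta)) stands in for the exact curve, so this
# slightly underestimates M; it pins down the order of magnitude only.
def measurements_needed(N: int, s: int, iters: int = 100) -> int:
    M = 2.0 * s                                       # crude starting point
    for _ in range(iters):
        M = 0.5 * (M + 2.0 * s * math.log(N / M))     # damped fixed point
    return math.ceil(M)

print(measurements_needed(N=10_000, s=200))  # on the order of 950
```

In practice the exact curve (and a safety margin, see the engineering note below) pushes this somewhat higher, toward $M \approx 1000$.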

Sparse Recovery: Theoretical vs. Practical Thresholds

| Method | Weak threshold $\rho^*(\delta)$ | Noise robust? | Complexity |
|---|---|---|---|
| $\ell_1$ / LASSO | Donoho-Tanner curve (tight) | Yes (LASSO) | $O(N^3)$ |
| Orthogonal Matching Pursuit | Slightly below DT curve | Moderate | $O(s \cdot MN)$ |
| Approximate Message Passing (AMP) | Matches DT curve asymptotically | Yes (state evolution) | $O(MN)$ per iter |
| CoSaMP / IHT | Below DT curve | Yes | $O(MN)$ per iter |
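For concreteness, the greedy entry in the table is only a few lines of numpy. A minimal sketch of Orthogonal Matching Pursuit (the demo data is arbitrary; this is not a tuned implementation):

```python
import numpy as np

# Orthogonal Matching Pursuit: greedily pick the column most correlated
# with the residual, then re-fit by least squares on the selected support.
def omp(A: np.ndarray, y: np.ndarray, s: int) -> np.ndarray:
    M, N = A.shape
    support: list[int] = []
    r = y.copy()
    x = np.zeros(N)
    for _ in range(s):
        j = int(np.argmax(np.abs(A.T @ r)))        # most correlated column
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(N)
        x[support] = coef
        r = y - A @ x                              # shrink the residual
    return x

# Tiny demo on synthetic data.
rng = np.random.default_rng(1)
M, N, s = 30, 50, 3
A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))
x_star = np.zeros(N); x_star[[2, 10, 40]] = [1.5, -2.0, 1.0]
y = A @ x_star
x_hat = omp(A, y, s)
```

Each iteration costs one correlation ($O(MN)$) plus a small least-squares solve, giving the $O(s \cdot MN)$ total in the table.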

AMP Saturates the Curve

Approximate Message Passing (Chapter 20) achieves the Donoho-Tanner curve with $O(MN)$ per-iteration complexity, dramatically faster than interior-point solvers for LASSO. This is possible because AMP's state evolution recursion provably tracks the LASSO solution in the high-dimensional limit. The practical implication: Donoho-Tanner is not just a theoretical benchmark; it is operationally achievable on problems with $N = 10^5$ or larger.
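The iteration itself is short. A minimal sketch (not the derivation from Chapter 20): soft thresholding, plus the Onsager correction term that distinguishes AMP from plain iterative thresholding. The threshold rule $\theta_t = \alpha \|z\|/\sqrt{M}$ and the knob $\alpha$ are common practical choices assumed here, not uniquely prescribed:

```python
import numpy as np

def soft(u: np.ndarray, t: float) -> np.ndarray:
    """Soft-thresholding denoiser eta(u; t)."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

# Minimal AMP sketch for noiseless l1 recovery; alpha is a tuning knob.
def amp(A: np.ndarray, y: np.ndarray, iters: int = 30, alpha: float = 2.0) -> np.ndarray:
    M, N = A.shape
    x = np.zeros(N)
    z = y.copy()
    for _ in range(iters):
        theta = alpha * np.linalg.norm(z) / np.sqrt(M)
        x_new = soft(x + A.T @ z, theta)
        onsager = (np.count_nonzero(x_new) / M) * z   # Onsager correction
        z = y - A @ x_new + onsager
        x = x_new
    return x

# Tiny demo on synthetic data.
rng = np.random.default_rng(2)
M, N, s = 200, 400, 20
A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))
x_star = np.zeros(N); x_star[rng.choice(N, s, replace=False)] = rng.normal(size=s)
y = A @ x_star
x_hat = amp(A, y)
```

Without the Onsager term this reduces to iterative soft thresholding, whose phase transition sits well below the DT curve; the correction is what makes the state-evolution analysis (and the curve-matching claim above) go through.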

Common Mistake: Non-Gaussian Sensing Matrices

Mistake:

Applying the Donoho-Tanner curve to structured sensing matrices (partial Fourier, Bernoulli, binary) without adjustment.

Correction:

The curve was derived for i.i.d. Gaussian $\mathbf{A}$. Partial Fourier matrices (used in MRI) exhibit a different, slightly more conservative phase transition. Bernoulli $\pm 1/\sqrt{M}$ matrices are very close to Gaussian but not identical. Sub-Gaussian concentration ensures the curve is a good approximation, but constants should be verified by simulation before committing to a design.

Phase transition (in sparse recovery)

A sharp threshold in the $(\delta, \rho) = (M/N, s/M)$ plane separating regimes of successful and failed recovery. The width of the transition band shrinks as $O(1/\sqrt{N})$, so in the high-dimensional limit it is a cliff.

Related: Donoho-Tanner curve, LASSO Estimator, Restricted isometry property

Quick Check

A sensing system uses $M = 500$ random Gaussian measurements on an $N = 5000$-dimensional signal. The Donoho-Tanner curve gives $\rho^*(0.1) \approx 0.19$. What is the largest sparsity $s$ the system can reliably recover?

$s \leq 95$

$s \leq 500$

$s \leq 950$

$s \leq 19$

🔧 Engineering Note

Safety Margins in Practice

In deployed systems, operate with $\rho$ at roughly 60-70% of $\rho^*(\delta)$ to tolerate noise, model mismatch, and finite-$N$ effects. AMP-based recovery with online noise-variance estimation is the standard choice when $N > 10^4$.

📋 Ref: Donoho-Maleki-Montanari 2009

Key Takeaway

The Donoho-Tanner phase transition is a sharp cliff in the $(\delta, \rho)$ plane: below the curve, $\ell_1$ recovery succeeds with probability one; above, it fails with probability one. For Gaussian sensing matrices the curve can be computed exactly. It is the gold-standard benchmark for any practical sparse-recovery algorithm.