Convex Relaxation: PhaseLift

PhaseLift β€” Convex Relaxation via Semidefinite Programming

The first rigorous approach to phase retrieval with provable guarantees is PhaseLift (Candes, Strohmer, and Voroninski, 2013). The key idea is lifting: reformulate the quadratic measurements as linear measurements of a rank-1 matrix, then relax the rank constraint to a semidefinite program (SDP). This converts the non-linear, non-convex phase retrieval problem into a convex optimization problem.

The point is that convexity buys us guaranteed global optimality β€” no local minima, no initialization sensitivity. The price is computational: we move from an NN-dimensional signal to an NΓ—NN \times N matrix variable.

Definition:

The Lifting Trick

The intensity measurement yi=∣⟨ai,x⟩∣2y_i = |\langle \mathbf{a}_i, \mathbf{x}\rangle|^2 can be rewritten as:

yi=aiHxxHai=tr(aiaiHX)=tr(AiX),y_i = \mathbf{a}_i^H \mathbf{x}\mathbf{x}^H \mathbf{a}_i = \text{tr}(\mathbf{a}_i\mathbf{a}_i^H \mathbf{X}) = \text{tr}(\mathbf{A}_i \mathbf{X}),

where X=xxH∈CNΓ—N\mathbf{X} = \mathbf{x}\mathbf{x}^H \in \mathbb{C}^{N \times N} and Ai=aiaiH\mathbf{A}_i = \mathbf{a}_i\mathbf{a}_i^H.

The measurements are now linear in X\mathbf{X}:

yi=tr(AiX),i=1,…,M.y_i = \text{tr}(\mathbf{A}_i \mathbf{X}), \quad i = 1, \ldots, M.

The constraint that X=xxH\mathbf{X} = \mathbf{x}\mathbf{x}^H is equivalent to: Xβͺ°0\mathbf{X} \succeq 0 and rank(X)=1\text{rank}(\mathbf{X}) = 1. The rank constraint is what makes the problem hard.

Lifting (Phase Retrieval)

A technique that replaces the unknown signal x\mathbf{x} by the rank-1 matrix X=xxH\mathbf{X} = \mathbf{x}\mathbf{x}^H, converting quadratic measurements into linear constraints on X\mathbf{X}. The rank-1 constraint is then relaxed to a semidefinite constraint Xβͺ°0\mathbf{X} \succeq 0.

Related: The Phase Retrieval Problem

Theorem: PhaseLift Recovery Guarantee

Consider the semidefinite program (PhaseLift):

min⁑Xβͺ°0β€…β€Štr(X)s.t.tr(AiX)=yi,i=1,…,M,\min_{\mathbf{X} \succeq 0} \; \text{tr}(\mathbf{X}) \quad \text{s.t.} \quad \text{tr}(\mathbf{A}_i\mathbf{X}) = y_i, \quad i = 1, \ldots, M,

where Ai=aiaiH\mathbf{A}_i = \mathbf{a}_i\mathbf{a}_i^H and yi=∣⟨ai,x0⟩∣2y_i = |\langle \mathbf{a}_i, \mathbf{x}_0\rangle|^2.

If ai∼CN(0,I)\mathbf{a}_i \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}) independently and Mβ‰₯CNlog⁑NM \geq CN\log N for a universal constant CC, then with high probability the unique solution X^\hat{\mathbf{X}} is rank-1, and x^=Ξ»1v1\hat{\mathbf{x}} = \sqrt{\lambda_1}\mathbf{v}_1 (the leading eigenvector scaled by the leading eigenvalue) recovers x0\mathbf{x}_0 up to global phase.

Trace minimization for PSD matrices is the convex envelope of rank minimization β€” analogous to how the β„“1\ell_1 norm is the convex envelope of the β„“0\ell_0 "norm" in compressed sensing. When the measurements are sufficiently diverse (random Gaussian), the trace-minimizing solution naturally has rank 1.

Example: PhaseLift for a Small Problem

Setup: N=16N = 16 (complex signal), M=128M = 128 Gaussian measurements, noiseless.

The SDP has optimization variable X∈C16Γ—16\mathbf{X} \in \mathbb{C}^{16 \times 16} (Hermitian PSD) with 128 linear equality constraints. Solve using an interior-point SDP solver and examine the eigenvalue distribution of X^\hat{\mathbf{X}}.

Definition:

Noisy PhaseLift

For noisy measurements yi=∣⟨ai,x⟩∣2+ηiy_i = |\langle \mathbf{a}_i, \mathbf{x}\rangle|^2 + \eta_i, the equality constraints become infeasible. The noisy PhaseLift formulation uses a relaxed constraint:

min⁑Xβͺ°0β€…β€Štr(X)s.t.βˆ‘i=1M(tr(AiX)βˆ’yi)2≀ϡ,\min_{\mathbf{X} \succeq 0} \; \text{tr}(\mathbf{X}) \quad \text{s.t.} \quad \sum_{i=1}^M (\text{tr}(\mathbf{A}_i\mathbf{X}) - y_i)^2 \leq \epsilon,

where Ο΅\epsilon bounds the total noise energy. Alternatively, a penalized form:

min⁑Xβͺ°0β€…β€Šβˆ‘i=1M(tr(AiX)βˆ’yi)2+λ tr(X).\min_{\mathbf{X} \succeq 0} \; \sum_{i=1}^M (\text{tr}(\mathbf{A}_i\mathbf{X}) - y_i)^2 + \lambda\,\text{tr}(\mathbf{X}).

The recovery error degrades gracefully: βˆ₯X^βˆ’X0βˆ₯F≀CΟ΅/M\|\hat{\mathbf{X}} - \mathbf{X}_0\|_F \leq C\epsilon/\sqrt{M} with high probability.

Definition:

Scalable Variants of PhaseLift

Several approaches address PhaseLift's computational cost:

1. PhaseCut (Waldspurger, d'Aspremont, and Mallat, 2015): Instead of lifting to NΓ—NN \times N, optimize over the phases directly:

maxβ‘Ο•β€…β€Šβˆ‘iyi Re(ejΟ•i⟨ai,x⟩),\max_{\boldsymbol{\phi}} \; \sum_{i} \sqrt{y_i}\,\text{Re}(e^{j\phi_i} \langle \mathbf{a}_i, \mathbf{x}\rangle),

which is a max-cut-like SDP of smaller dimension.

2. Sketched PhaseLift (Yurtsever et al., 2017): Apply random projections to reduce the SDP dimension from N2N^2 to O(N)O(N), maintaining theoretical guarantees while reducing cost to O(N2)O(N^2) per iteration.

3. Low-rank SDP solvers (Burer--Monteiro): Set X=VVH\mathbf{X} = \mathbf{V}\mathbf{V}^H with V∈CNΓ—r\mathbf{V} \in \mathbb{C}^{N \times r} (rβ‰ͺNr \ll N) and solve the non-convex problem over V\mathbf{V}. For rank-1 recovery, r=2r = 2 or 33 suffices.

Common Mistake: PhaseLift Does Not Scale to Real Imaging Problems

Mistake:

PhaseLift's elegant theory tempts one to use it for practical imaging. However, the computational cost is prohibitive:

NN (signal dim) SDP variable size Typical solve time
16 16Γ—16=25616 \times 16 = 256 1 s
64 64Γ—64=4,09664 \times 64 = 4{,}096 30 s
256 256Γ—256=65,536256 \times 256 = 65{,}536 hours
1024 1024Γ—1024=1061024 \times 1024 = 10^6 intractable

The O(N2)O(N^2) memory and O(N6)O(N^6) time of interior-point SDP solvers make PhaseLift impractical beyond N∼256N \sim 256.

Correction:

PhaseLift is a theoretical milestone β€” it proves that phase retrieval is solvable in polynomial time. For practical imaging-scale problems (N>1000N > 1000), use the non-convex methods of Section 16.3 (Wirtinger flow and variants), which operate directly on the NN-dimensional signal.

Convex (PhaseLift) vs. Non-Convex (Wirtinger Flow) Phase Retrieval

PropertyPhaseLift (Convex)Wirtinger Flow (Non-Convex)
Variable dimensionNΓ—NN \times N matrixNN-vector
Per-iteration costO(N4.5)O(N^{4.5}) (SDP interior point)O(MN)O(MN)
MemoryO(N2)O(N^2)O(MN)O(MN)
Measurement requirementM=O(Nlog⁑N)M = O(N\log N)M=O(Nlog⁑N)M = O(N\log N); O(N)O(N) with truncation
InitializationNot needed (convex)Critical (spectral initialization)
Convergence guaranteeGlobal optimum (convex)Global optimum w.h.p. (spectral init + RIP)
Practical for N>256N > 256?NoYes

Quick Check

In PhaseLift, a signal x∈C64\mathbf{x} \in \mathbb{C}^{64} is "lifted" to a matrix X=xxH\mathbf{X} = \mathbf{x}\mathbf{x}^H. How many real-valued unknowns does X\mathbf{X} have (exploiting Hermitian symmetry)?

64

4096

8192

2016

πŸ”§Engineering Note

SDP Solver Choices for PhaseLift

Interior-point SDP solvers (SeDuMi, MOSEK, SDPT3) provide guaranteed convergence but require O(N2)O(N^2) memory and O(N4.5)O(N^{4.5}) per iteration. For PhaseLift with N>100N > 100:

  • MOSEK is the fastest commercial solver; handles N≀500N \leq 500 in minutes.
  • SCS (splitting conic solver) uses first-order methods; lower per-iteration cost but slower convergence β€” suitable for NN up to ∼\sim2000 at moderate accuracy.
  • Burer--Monteiro factorization reduces to a non-convex problem in CNΓ—r\mathbb{C}^{N \times r} β€” practical for N>1000N > 1000 but loses the convexity guarantee.

In research prototypes, CVXPY with MOSEK backend is the recommended interface for PhaseLift experiments.

Practical Constraints
  • β€’

    Interior-point SDP solvers require O(N2)O(N^2) memory, limiting N≀500N \leq 500 on typical workstations

  • β€’

    First-order SDP solvers (SCS) trade accuracy for scalability: convergence to 10βˆ’310^{-3} relative gap in O(N2)O(N^2) iterations

  • β€’

    GPU-accelerated SDP solvers exist but provide only 2–5Γ—2\text{--}5\times speedup for the matrix sizes in phase retrieval

Key Takeaway

Lifting converts the quadratic phase retrieval problem into a linear problem over NΓ—NN \times N PSD matrices: yi=tr(AiX)y_i = \text{tr}(\mathbf{A}_i \mathbf{X}). PhaseLift minimizes tr(X)\text{tr}(\mathbf{X}) subject to measurement constraints β€” a convex SDP that provably recovers rank-1 solutions with M=O(Nlog⁑N)M = O(N\log N) Gaussian measurements. PhaseLift provides the first polynomial-time guarantee for phase retrieval, establishing theoretical feasibility. However, O(N2)O(N^2) memory and O(N6)O(N^6) computation make it impractical beyond N∼256N \sim 256.