Expectation Propagation
When BP Is Not Enough
Belief propagation is tractable when the messages have closed forms: discrete distributions, Gaussians, or mixtures with small support. When variables are continuous and the factors are non-conjugate (for example, a discrete-constellation prior combined with Gaussian likelihood factors, as in MIMO detection with 64-QAM), sum-product messages become intractable mixtures whose number of components grows with each iteration.
Expectation Propagation (EP), introduced by Minka in 2001, fixes this by forcing every message to lie in a tractable exponential family (usually Gaussian) via moment matching. EP is to Gaussian BP what the extended Kalman filter is to nonlinear filtering: it projects non-Gaussian quantities onto their closest Gaussian by matching the first two moments. In MIMO detection, EP delivers near-ML performance at LMMSE complexity and has become the state-of-the-art iterative detector for large-MIMO with high-order constellations.
Definition: KL-Projection onto the Gaussian Family
Let $\mathcal{G}$ be the exponential family of Gaussian distributions (natural parameters: precision and precision-weighted mean). For any probability density $p$, define $\operatorname{proj}[p] = \arg\min_{q \in \mathcal{G}} \mathrm{KL}(p \,\|\, q)$. The minimizer is the Gaussian whose mean and variance match those of $p$: $\mu_q = \mathbb{E}_p[x]$ and $\sigma_q^2 = \operatorname{Var}_p[x]$. This is the unique moment-matching Gaussian approximation.
The KL direction matters: minimizing $\mathrm{KL}(p \,\|\, q)$ (moment matching) gives a Gaussian that covers the support of $p$, while the reverse direction $\mathrm{KL}(q \,\|\, p)$ (variational/mean-field) tends to under-cover. EP uses the moment-matching direction.
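The moment-matching projection is easy to check numerically. A minimal sketch (the mixture, the grid, and the function name `project_to_gaussian` are our illustrative choices, not library API):

```python
import numpy as np

# Sketch: KL-project a non-Gaussian density onto the Gaussian family by
# moment matching. The target is a two-component Gaussian mixture, an
# illustrative stand-in for an intractable EP message.

def project_to_gaussian(xs, px):
    """Return (mean, var) of the (possibly unnormalized) density px on grid xs."""
    dx = xs[1] - xs[0]
    z = px.sum() * dx                      # normalizing constant
    mean = (xs * px).sum() * dx / z
    var = ((xs - mean) ** 2 * px).sum() * dx / z
    return mean, var

xs = np.linspace(-8.0, 8.0, 4001)
# mixture 0.5*N(-2,1) + 0.5*N(+2,1): mean 0, variance 1 + 2^2 = 5
mix = 0.5 * np.exp(-0.5 * (xs + 2) ** 2) + 0.5 * np.exp(-0.5 * (xs - 2) ** 2)
m, v = project_to_gaussian(xs, mix)
print(m, v)   # approximately 0.0 and 5.0
```

Note that the projected Gaussian is much wider than either mixture component: the moment-matching direction covers the support, as stated above.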
Theorem: KL Minimization Equals Moment Matching
Let $p$ be any density with finite first two moments and let $q = \mathcal{N}(\mu, \sigma^2)$ belong to the Gaussian family with mean $\mu$ and variance $\sigma^2$. Then $\mathrm{KL}(p \,\|\, q)$ is minimized uniquely at $\mu = \mathbb{E}_p[x]$, $\sigma^2 = \operatorname{Var}_p[x]$.
For any exponential family, minimizing $\mathrm{KL}(p \,\|\, q)$ over the family reduces to matching sufficient-statistic expectations; this is the standard exponential-family duality.
KL expansion for Gaussian $q$
Write $\mathrm{KL}(p \,\|\, q) = -H(p) - \mathbb{E}_p[\log q(x)]$, where $H(p)$ is the differential entropy of $p$. For Gaussian $q$, $\mathbb{E}_p[\log q(x)] = -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{\mathbb{E}_p[(x-\mu)^2]}{2\sigma^2}$.
Minimize in $\mu$
The only $\mu$-dependence is through $\mathbb{E}_p[(x-\mu)^2] = \operatorname{Var}_p[x] + (\mathbb{E}_p[x] - \mu)^2$, minimized at $\mu = \mathbb{E}_p[x]$.
Minimize in $\sigma^2$
Setting the derivative to zero: $\frac{\partial}{\partial \sigma^2}\left[\tfrac{1}{2}\log\sigma^2 + \frac{\operatorname{Var}_p[x]}{2\sigma^2}\right] = 0 \;\Rightarrow\; \sigma^2 = \operatorname{Var}_p[x]$.
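The theorem can be sanity-checked numerically: for a skewed mixture $p$, the KL divergence to a Gaussian is smallest exactly at the mean and variance of $p$. A sketch with our own choice of test density:

```python
import numpy as np

# Numerical check of the theorem: KL(p || N(mu, var)) over candidate
# (mu, var) is minimized at the moments of p. Test density is ours.

xs = np.linspace(-10.0, 10.0, 8001)
dx = xs[1] - xs[0]
# skewed two-component mixture (weights 0.7 / 0.3 after normalization)
p = 0.7 * np.exp(-0.5 * (xs - 1) ** 2) + 0.15 * np.exp(-0.125 * (xs + 3) ** 2)
p /= p.sum() * dx
mp = (xs * p).sum() * dx                   # E_p[x]
vp = ((xs - mp) ** 2 * p).sum() * dx       # Var_p[x]

def kl_to_gauss(mu, var):
    q = np.exp(-0.5 * (xs - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return (p * np.log(p / q)).sum() * dx

best = kl_to_gauss(mp, vp)
# perturbing either moment can only increase the divergence
perturbed = [kl_to_gauss(mp + d, vp) for d in (-0.5, 0.5)]
perturbed += [kl_to_gauss(mp, vp * s) for s in (0.7, 1.4)]
print(best < min(perturbed))   # moment matching wins
```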
Definition: EP Messages and the Cavity Distribution
Suppose the posterior factorizes as $p(x) \propto \prod_a f_a(x)$ and we approximate it by $q(x) \propto \prod_a \tilde{f}_a(x)$ with each $\tilde{f}_a$ chosen from a tractable (Gaussian) family. For each factor $a$ define the cavity $q^{\setminus a}(x) \propto q(x) / \tilde{f}_a(x)$, i.e., the current approximation with factor $a$ removed. The EP update for $\tilde{f}_a$ is: $\tilde{f}_a^{\mathrm{new}}(x) \propto \operatorname{proj}\!\left[f_a(x)\, q^{\setminus a}(x)\right] / q^{\setminus a}(x)$. In words: replace the approximate factor by the true factor, project the product back to the Gaussian family by moment matching, then divide out the cavity to get the new approximate factor.
Division of Gaussians is well-defined in natural-parameter (precision and precision-weighted mean) form; the result is a Gaussian-shaped function with possibly negative precision. EP becomes unstable when precisions turn negative; see the pitfalls below.
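Gaussian division in natural parameters is a two-line operation. A sketch (function names are ours) showing both a valid division and one that produces a negative precision:

```python
# Sketch: Gaussian multiplication/division in natural parameters
# (precision r = 1/var, precision-weighted mean h = mean/var).
# Division subtracts natural parameters and can legitimately yield a
# negative precision, which is why EP needs safeguards.

def to_natural(mean, var):
    return mean / var, 1.0 / var          # (h, r)

def divide(h1, r1, h2, r2):
    """(h1, r1) / (h2, r2): subtract natural parameters."""
    return h1 - h2, r1 - r2

# N(0.5, 0.25) divided by a *broader* factor N(0.0, 1.0): still valid
h, r = divide(*to_natural(0.5, 0.25), *to_natural(0.0, 1.0))
print(r)      # 3.0 -> variance 1/3, a proper Gaussian
# N(0.5, 1.0) divided by a *narrower* factor N(0.0, 0.25): invalid
h2, r2 = divide(*to_natural(0.5, 1.0), *to_natural(0.0, 0.25))
print(r2)     # -3.0 -> negative precision, not a density
```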
Expectation Propagation for MIMO Detection
Complexity is dominated by the matrix inversion in the Gaussian pass, cubic in the number of streams per iteration. The moment matching in step 6 is a closed-form sum over constellation points for discrete priors: the same Gaussian-weighted constellation average that appears in every symbol demapper. Damping is standard for MIMO detection.
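The update loop described above can be sketched for real-valued BPSK (the simplest discrete prior). Variable names, the damping value, and the precision floor are our illustrative choices, not a reference implementation:

```python
import numpy as np

# Sketch of an EP detector for real-valued BPSK MIMO: Gaussian pass,
# cavity extraction, moment matching against the discrete prior, and a
# damped natural-parameter factor update.

def ep_detect_bpsk(H, y, noise_var, n_iter=8, damp=0.5, eps=1e-8):
    K = H.shape[1]
    lam = np.ones(K)             # factor precisions (init: 1/E_s with E_s = 1)
    gam = np.zeros(K)            # factor precision-weighted means
    HtH = H.T @ H / noise_var
    Hty = H.T @ y / noise_var
    for _ in range(n_iter):
        # Gaussian pass: the O(K^3) inversion dominates each iteration
        Sigma = np.linalg.inv(HtH + np.diag(lam))
        mu = Sigma @ (Hty + gam)
        s = np.diag(Sigma)
        # cavity marginals: remove each factor's own contribution
        r_cav = np.maximum(1.0 / s - lam, eps)
        h_cav = mu / s - gam
        m_cav, v_cav = h_cav / r_cav, 1.0 / r_cav
        # moment matching against the BPSK prior {-1, +1}
        m_tilt = np.tanh(m_cav / v_cav)
        v_tilt = np.maximum(1.0 - m_tilt ** 2, eps)
        # new factor = tilted / cavity, damped in natural parameters
        lam_new = np.maximum(1.0 / v_tilt - r_cav, eps)
        gam_new = m_tilt / v_tilt - h_cav
        lam = damp * lam_new + (1 - damp) * lam
        gam = damp * gam_new + (1 - damp) * gam
    mu = np.linalg.inv(HtH + np.diag(lam)) @ (Hty + gam)
    return np.sign(mu)

# high-SNR toy instance (our own random draw)
rng = np.random.default_rng(0)
K = 8
H = rng.standard_normal((2 * K, K)) / np.sqrt(2 * K)
x = rng.choice([-1.0, 1.0], size=K)
y = H @ x + 0.05 * rng.standard_normal(2 * K)
x_hat = ep_detect_bpsk(H, y, noise_var=0.05 ** 2)
print(np.all(x_hat == x))
```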
Theorem: EP Fixed Points Are Stationary Points of the Bethe Free Energy
Any fixed point of the EP iteration is a stationary point of a specific "expectation-constrained" free-energy functional, subject to moment-matching constraints between the factor approximations and the global approximation. Under these moment constraints the functional reduces (for tree-structured graphs) to the exact log-partition function. For loopy graphs, EP fixed points coincide with stationary points of this Bethe-like free energy.
Just as loopy BP is a fixed-point method for the Bethe free energy, EP is a fixed-point method for an exponential-family generalization. This is why EP produces meaningful marginals even on loopy graphs; it is not just heuristic projection.
Free-energy formulation
Each EP factor update minimizes $\mathrm{KL}\!\left(f_a\, q^{\setminus a} \,\big\|\, q\right)$ over Gaussian $q$, which is the moment-matching step. The corresponding Lagrangian enforces moment consistency between the cavity-tilted distribution and the global approximation.
Stationarity
At a fixed point, no local moment-matching update changes $q$; the Lagrangian's gradient in the natural parameters vanishes. This is exactly the stationarity condition for the expectation-constrained Bethe free energy.
Tree case
On a tree, cavities are exact and the Bethe free energy equals the exact free energy, so EP returns the exact projected (Gaussian) moments of the true posterior.
EP vs LMMSE vs ML: BER for Large MIMO Detection
Compare the BER of EP-based detection to LMMSE detection, MMSE-SIC, and the ML lower bound for a MIMO system with 64-QAM. Observe the EP gain at high SNR, where LMMSE suffers from ill-conditioning.
Parameters
Example: Hand Computation: EP for BPSK
Let $y = Hx + n$ with $n \sim \mathcal{N}(0, \sigma^2 I)$, and BPSK symbols $x_i \in \{-1, +1\}$ with uniform prior. Given the observation $y$, perform one EP update starting from flat cavities ($\lambda_i = 0$, $\gamma_i = 0$).
Gaussian pass
With flat cavities, the Gaussian part of the posterior uses the likelihood alone: $\Sigma = \sigma^2 (H^\top H)^{-1}$ and $\mu = (H^\top H)^{-1} H^\top y$. The Gaussian marginal for $x_1$ has mean $\mu_1$ and variance $\Sigma_{11}$.
Cavity at $i=1$
With a flat cavity ($\lambda_1 = \gamma_1 = 0$), the cavity marginal is just the Gaussian marginal from the posterior with the BPSK constraint ignored: $q^{\setminus 1}(x_1) = \mathcal{N}(x_1;\, \mu_1, \Sigma_{11})$.
Moment match against BPSK prior
Tilted density: $\hat{p}(x_1) \propto \mathcal{N}(x_1;\, \mu_1, \Sigma_{11}) \cdot \tfrac{1}{2}\left[\delta(x_1 - 1) + \delta(x_1 + 1)\right]$. Compute $\mathbb{E}[x_1] = \tanh(\mu_1 / \Sigma_{11})$ and $\operatorname{Var}[x_1] = 1 - \tanh^2(\mu_1 / \Sigma_{11})$.
Update cavity parameters
Set $\lambda_1^{\mathrm{new}} = 1/\operatorname{Var}[x_1] - 1/\Sigma_{11}$ and $\gamma_1^{\mathrm{new}} = \mathbb{E}[x_1]/\operatorname{Var}[x_1] - \mu_1/\Sigma_{11}$. Apply damping, repeat for the remaining indices, and iterate. EP converges within 3-5 iterations to posterior means near the transmitted symbols $\pm 1$.
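The one-update hand computation can be reproduced numerically for any concrete instance. A sketch with an assumed $2 \times 2$ channel and noise draw (all numbers are our illustrative choices):

```python
import numpy as np

# One EP update for the 2x2 BPSK example; channel, noise variance,
# transmitted symbols, and noise draw are illustrative assumptions.
H = np.array([[1.0, 0.5], [0.3, 1.0]])
sigma2 = 0.25
x_true = np.array([1.0, -1.0])
y = H @ x_true + np.array([0.1, -0.05])      # fixed "noise" draw

# Gaussian pass with flat cavities (lambda = gamma = 0)
HtH_inv = np.linalg.inv(H.T @ H)
Sigma = sigma2 * HtH_inv
mu = HtH_inv @ H.T @ y
m1, v1 = mu[0], Sigma[0, 0]                  # cavity marginal at i = 1

# moment match against the BPSK prior
m_tilt = np.tanh(m1 / v1)
v_tilt = 1.0 - m_tilt ** 2

# updated factor parameters (natural-parameter division)
lam1 = 1.0 / v_tilt - 1.0 / v1
gam1 = m_tilt / v_tilt - m1 / v1
print(m_tilt, lam1)   # tilted mean near +1, positive new precision
```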
Common Mistake: Negative Precisions Break EP
Mistake:
An over-confident cavity (very small cavity variance $v_i$) combined with a weakly informative factor can produce a tilted variance $\operatorname{Var}[x_i]$ larger than $v_i$, which yields $\lambda_i^{\mathrm{new}} = 1/\operatorname{Var}[x_i] - 1/v_i < 0$: the Gaussian approximation's precision is negative, meaning it is not a valid density.
Correction:
Use damping and, if necessary, clip precisions to a small positive floor $\epsilon$. Heavy damping always fixes divergence but slows convergence. Many production EP detectors fix the damping factor at design time.
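A minimal sketch of the safeguard (the damping and floor values are illustrative, not prescriptive):

```python
import numpy as np

# Sketch: damped EP factor update with a precision floor. `damp` and
# `eps` are our illustrative choices.

def safe_update(lam_old, gam_old, lam_prop, gam_prop, damp=0.5, eps=1e-6):
    """Damp the natural-parameter update and clip negative precisions."""
    lam_prop = np.maximum(lam_prop, eps)           # precision floor
    lam = damp * lam_prop + (1.0 - damp) * lam_old
    gam = damp * gam_prop + (1.0 - damp) * gam_old
    return lam, gam

# a proposed update with one negative precision gets repaired
lam, gam = safe_update(np.array([2.0, 2.0]), np.array([0.5, -0.5]),
                       np.array([4.0, -3.0]), np.array([1.0, -1.0]))
print(lam)    # both precisions stay positive
```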
Common Mistake: Initializing EP Cavities Matters
Mistake:
Initializing all cavity precisions $\lambda_i$ and precision-weighted means $\gamma_i$ to zero can lead to a badly conditioned Gaussian pass and slow early convergence.
Correction:
A standard initialization is $\lambda_i = 1/\mathbb{E}[|x_i|^2]$ (the reciprocal of the prior second moment) and $\gamma_i = 0$. This means "start from the LMMSE solution," after which EP refines the marginals. It also guarantees positive precisions at iteration 1.
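The "start from the LMMSE solution" claim is easy to verify: with $\lambda_i = 1/E_s$ and $\gamma_i = 0$, the first Gaussian pass reproduces the classical LMMSE estimate. A sketch with unit-energy symbols on our own random instance:

```python
import numpy as np

# Check: EP's first Gaussian pass under the standard initialization
# equals the LMMSE estimate. Instance values (K, N, sigma2) are ours.

rng = np.random.default_rng(1)
K, N, sigma2, Es = 4, 8, 0.1, 1.0
H = rng.standard_normal((N, K))
y = rng.standard_normal(N)

lam = np.full(K, 1.0 / Es)                  # prior-second-moment reciprocal
gam = np.zeros(K)
Sigma = np.linalg.inv(H.T @ H / sigma2 + np.diag(lam))
mu_ep0 = Sigma @ (H.T @ y / sigma2 + gam)   # first EP Gaussian pass

# classical LMMSE: (H^T H + (sigma^2/Es) I)^{-1} H^T y
mu_lmmse = np.linalg.solve(H.T @ H + (sigma2 / Es) * np.eye(K), H.T @ y)
print(np.allclose(mu_ep0, mu_lmmse))   # True
```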
EP as Refined Belief Propagation
EP reduces to loopy belief propagation when the approximating family is the fully factorized distribution over the variables (no projection is needed). It reduces to Gaussian BP (GaBP) when all factors are already Gaussian. The distinctive contribution of EP appears in the mixed case of discrete priors with Gaussian factors, where it produces Gaussian approximate messages that are strictly better than those used by GaBP applied to the relaxed continuous problem.
EP Detectors in Advanced Massive MIMO Receivers
EP-based detection has emerged as the favored algorithm for large-MIMO uplink with high-order constellations (64-QAM, 256-QAM) in 5G-Advanced and 6G prototypes. Unlike ML-based sphere decoding, EP has polynomial per-iteration cost and fixed runtime; unlike LMMSE, it exploits the discrete constellation prior. Real-time FPGA implementations with 64-QAM and a small fixed number of EP iterations have been reported with throughput matching LMMSE-SIC, while attaining performance within a fraction of a dB of ML.
- Iterations fixed at design time (4-8 typical)
- Damping factor often frozen at a fixed value on silicon
EP Receivers for RIS-Assisted Massive MIMO
The CommIT group has applied EP-based iterative detection to reconfigurable intelligent surface (RIS) assisted MIMO uplink, where the effective channel is the sum of a direct-path term and an RIS-reflected term with phase-configurable columns. EP naturally handles the joint uncertainty in the RIS phases and the data streams, matching the performance of joint ML estimation at a tiny fraction of the cost. This line of work connects the iterative receivers of this chapter to the RIS book (Book ris).
Expectation Propagation
An iterative message-passing algorithm that approximates a complex posterior by a product of exponential-family factors. Each factor is updated by projecting the product of the true factor and the cavity distribution onto the tractable family via moment matching.
Cavity Distribution
The distribution obtained from a factorized approximation by removing one of its factors. In EP, the cavity represents "everything the model knows about a variable except what factor $a$ says."
Related: Expectation Propagation
Moment Matching
The procedure of choosing an exponential-family distribution whose first few moments (typically mean and variance) agree with those of a target distribution. This is the KL-projection of the target onto the family.
Iterative MIMO Detectors β Key Properties
| Detector | Complexity per iter | Prior used | Typical gap to ML |
|---|---|---|---|
| LMMSE (non-iterative) | $O(K^3)$ (one inversion) | Gaussian relaxation | 3-6 dB (high-SNR, high-order QAM) |
| MMSE-SIC (soft) | $O(K^3)$ | Gaussian soft symbol | 1-2 dB |
| Gaussian BP | $O(K^2)$ | Gaussian relaxation | 2-4 dB |
| Expectation Propagation | $O(K^3)$ (matrix inversion) | Discrete constellation prior | 0.3-0.7 dB |
| Sphere Decoder (ML) | Exponential (worst case) | Discrete constellation | 0 dB (reference) |
Why This Matters: From EP to AMP and OAMP
When the MIMO channel matrix is i.i.d. Gaussian and dimensions grow, EP's expensive per-iteration matrix inversion becomes unnecessary: the Onsager-corrected AMP algorithm (Chapter 20) achieves the same state-evolution fixed point using only matrix-vector products per iteration. For structured channels (sparse, Kronecker, Fourier), OAMP/VAMP (Chapter 21) are the finite-dimensional generalizations. EP, AMP, and VAMP form a family of message-passing detectors that unify the iterative receiver design space.
See full treatment in Chapter 20