Regularized Zero-Forcing (MMSE Precoding)

Bridging MRT and ZF

MRT maximises signal power but ignores interference. ZF eliminates interference but amplifies noise. Is there a middle ground? Regularized zero-forcing (RZF), also known as MMSE precoding, adds a regularization term $\alpha \mathbf{I}$ to the channel Gram matrix before inversion. By tuning $\alpha$, we smoothly interpolate between MRT ($\alpha \to \infty$) and ZF ($\alpha \to 0$), achieving the best SINR tradeoff at any operating point.

Definition:

Regularized Zero-Forcing (RZF) Precoding

The RZF precoding matrix is

$$\mathbf{W}^{\text{RZF}} = \mathbf{H}^{H} (\mathbf{H}\mathbf{H}^{H} + \alpha \mathbf{I})^{-1} \mathbf{D}_{\text{RZF}}$$

where $\alpha > 0$ is the regularization parameter and $\mathbf{D}_{\text{RZF}}$ is a diagonal normalisation matrix ensuring unit-norm columns.

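Equivalently, the per-user (unnormalised) precoding vector is

$$\tilde{\mathbf{v}}_k^{\text{RZF}} = (\mathbf{H}^{H} \mathbf{H} + \alpha \mathbf{I})^{-1} \mathbf{h}_k$$

where $\mathbf{h}_k$ is the $k$-th column of $\mathbf{H}^{H}$. The equivalence follows from the matrix inversion lemma in its push-through form, $\mathbf{H}^{H}(\mathbf{H}\mathbf{H}^{H} + \alpha \mathbf{I}_{K})^{-1} = (\mathbf{H}^{H}\mathbf{H} + \alpha \mathbf{I}_{N_t})^{-1}\mathbf{H}^{H}$. Note that the two identity matrices have different sizes.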

When $\alpha = 0$, RZF reduces to ZF. When $\alpha \to \infty$, the inverse approaches $(1/\alpha) \mathbf{I}$ and RZF reduces to MRT (up to scaling). The name "MMSE precoding" comes from the fact that the optimal $\alpha$ minimises the mean squared error between the transmitted and intended signals.
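The two forms of the precoder and its limiting behaviour can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the function names (`rzf_unnormalised`, `rzf_unnormalised_nt`, `unit_columns`), the random test channel, and the tolerances are all assumptions made for illustration. It takes the rows of $\mathbf{H}$ to be $\mathbf{h}_k^H$, so that the columns of $\mathbf{H}^{H}$ are the $\mathbf{h}_k$.

```python
import numpy as np

def rzf_unnormalised(H, alpha):
    """K x K form: H^H (H H^H + alpha I_K)^{-1}; column k is the vector for user k."""
    K = H.shape[0]
    G = H @ H.conj().T + alpha * np.eye(K)
    # G is Hermitian, so (G^{-1} H)^H = H^H G^{-1}.
    return np.linalg.solve(G, H).conj().T

def rzf_unnormalised_nt(H, alpha):
    """N_t x N_t form: (H^H H + alpha I_Nt)^{-1} H^H."""
    Nt = H.shape[1]
    A = H.conj().T @ H + alpha * np.eye(Nt)
    return np.linalg.solve(A, H.conj().T)

def unit_columns(W):
    """Scale every column to unit Euclidean norm (the role of D_RZF)."""
    return W / np.linalg.norm(W, axis=0, keepdims=True)

# Hypothetical test channel: i.i.d. CN(0, 1) entries, K = 4 users, N_t = 8 antennas.
rng = np.random.default_rng(0)
K, Nt = 4, 8
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

# Both forms give the same matrix (push-through identity).
same_form = np.allclose(rzf_unnormalised(H, 0.5), rzf_unnormalised_nt(H, 0.5))

# Small alpha approaches ZF, large alpha approaches MRT (after normalisation).
W_zf = unit_columns(H.conj().T @ np.linalg.inv(H @ H.conj().T))
W_mrt = unit_columns(H.conj().T)
near_zf = np.allclose(unit_columns(rzf_unnormalised(H, 1e-9)), W_zf, atol=1e-6)
near_mrt = np.allclose(unit_columns(rzf_unnormalised(H, 1e9)), W_mrt, atol=1e-6)
print(same_form, near_zf, near_mrt)
```

The $K \times K$ form is the cheaper one whenever $K < N_t$, which is why it is used as the default here.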

Theorem: Optimal Regularization Parameter

For i.i.d. Rayleigh fading with equal power allocation, the regularization parameter that maximises the asymptotic (large $N_t$) sum rate is

$$\alpha^{\star} = \frac{K\, \sigma^2}{P_t}.$$

This is the ratio of total noise power (across all users) to the transmit power.

The optimal $\alpha$ balances two costs: too small an $\alpha$ causes noise amplification (like ZF), while too large an $\alpha$ permits too much interference (like MRT). The sweet spot is where the regularization equals the "noise per degree of freedom," which is $K\sigma^2/P_t$.

At high SNR ($P_t/\sigma^2 \to \infty$), $\alpha^{\star} \to 0$ and RZF converges to ZF. At low SNR, $\alpha^{\star}$ is large and RZF behaves like MRT.

Theorem: RZF SINR Expression

With RZF precoding, regularization $\alpha$, and equal power allocation $p_k = P_t/K$, the SINR at user $k$ is

$$\text{SINR}_k^{\text{RZF}} = \frac{\frac{P_t}{K} \left|\mathbf{h}_k^H (\mathbf{H}^{H} \mathbf{H} + \alpha \mathbf{I})^{-1} \mathbf{h}_k\right|^2}{\frac{P_t}{K} \sum_{j \neq k} \left|\mathbf{h}_k^H (\mathbf{H}^{H} \mathbf{H} + \alpha \mathbf{I})^{-1} \mathbf{h}_j\right|^2 + \sigma^2 \sum_{j=1}^{K} \left\|(\mathbf{H}^{H} \mathbf{H} + \alpha \mathbf{I})^{-1} \mathbf{h}_j\right\|^2 \cdot c_k}$$

where $c_k$ is a normalisation constant. In the large-system limit, this converges to a deterministic equivalent depending on $\alpha$, $K/N_t$, and $P_t/\sigma^2$.

The expression is complex but the message is simple: RZF trades off residual interference (nonzero for $\alpha > 0$) against reduced noise amplification (better conditioned inverse). At $\alpha = \alpha^{\star}$, the total "interference plus amplified noise" is minimised.
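To make the tradeoff concrete, the per-user SINR can be evaluated directly for one channel realisation. The sketch below is illustrative only and rests on stated assumptions: it uses unit-norm precoding vectors with equal power $P_t/K$ (the per-column normalisation from the definition, rather than the $c_k$ form of the theorem), rows of $\mathbf{H}$ equal to $\mathbf{h}_k^H$, and a hypothetical helper name `rzf_sinr`.

```python
import numpy as np

def rzf_sinr(H, alpha, Pt, sigma2):
    """Per-user SINR for unit-norm RZF precoders with equal power Pt / K."""
    K = H.shape[0]
    G = H @ H.conj().T + alpha * np.eye(K)
    W = np.linalg.solve(G, H).conj().T                 # H^H G^{-1}, shape (N_t, K)
    V = W / np.linalg.norm(W, axis=0, keepdims=True)   # unit-norm columns
    gains = np.abs(H @ V) ** 2                         # gains[k, j] = |h_k^H v_j|^2
    p = Pt / K
    signal = p * np.diag(gains)
    interference = p * (gains.sum(axis=1) - np.diag(gains))
    return signal / (interference + sigma2)

# Hypothetical setup: K = 4, N_t = 8, unit noise variance, Pt = 10.
rng = np.random.default_rng(0)
K, Nt, Pt, sigma2 = 4, 8, 10.0, 1.0
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

sinr_rzf = rzf_sinr(H, K * sigma2 / Pt, Pt, sigma2)    # alpha = K sigma^2 / Pt
sinr_zf_limit = rzf_sinr(H, 1e-9, Pt, sigma2)          # alpha -> 0

# Closed-form ZF SINR for unit-norm columns: p / (sigma^2 [(H H^H)^{-1}]_kk).
zf_closed_form = (Pt / K) / (sigma2 * np.diag(np.linalg.inv(H @ H.conj().T)).real)
print(sinr_rzf, sinr_zf_limit)
```

As $\alpha \to 0$ the interference term vanishes and the result approaches the closed-form ZF value, which gives a simple sanity check on the implementation.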

Example: Effect of Regularization on Sum Rate

For $N_t = 16$, $K = 8$, and $P_t/\sigma^2 = 15$ dB, compute the sum rate $R_{\text{sum}} = \sum_{k=1}^{K} \log_2(1 + \text{SINR}_k)$ for $\alpha \in \{0, 0.1, 0.25, 0.5, 1, 10\}$ via Monte Carlo simulation. Verify that the optimum is near $\alpha^{\star} = K\sigma^2/P_t = 8/31.6 \approx 0.25$.
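One possible way to set up this Monte Carlo experiment is sketched below. It is a hedged illustration, not the simulation behind the figures: the helper `sum_rate`, the seed, the trial count, the choice $\sigma^2 = 1$, and the per-column normalisation with equal power are assumptions. No numerical results are quoted here, since where the peak lands can depend on the normalisation and on the number of trials.

```python
import numpy as np

def sum_rate(H, alpha, Pt, sigma2):
    """Sum rate (bits/s/Hz) for unit-norm RZF precoders with equal power Pt / K."""
    K = H.shape[0]
    G = H @ H.conj().T + alpha * np.eye(K)
    W = np.linalg.solve(G, H).conj().T
    V = W / np.linalg.norm(W, axis=0, keepdims=True)
    gains = np.abs(H @ V) ** 2                 # gains[k, j] = |h_k^H v_j|^2
    p = Pt / K
    sig = p * np.diag(gains)
    intf = p * (gains.sum(axis=1) - np.diag(gains))
    return np.sum(np.log2(1.0 + sig / (intf + sigma2)))

Nt, K, snr_db, sigma2 = 16, 8, 15.0, 1.0
Pt = sigma2 * 10 ** (snr_db / 10)              # 15 dB -> about 31.6
alphas = [0.0, 0.1, 0.25, 0.5, 1.0, 10.0]
n_trials = 500                                  # assumed; increase for smoother averages

rng = np.random.default_rng(1)
totals = np.zeros(len(alphas))
for _ in range(n_trials):
    # i.i.d. Rayleigh channel: CN(0, 1) entries; the same draw is reused for every alpha.
    H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)
    totals += [sum_rate(H, a, Pt, sigma2) for a in alphas]

avg_rate = dict(zip(alphas, totals / n_trials))
alpha_star = K * sigma2 / Pt
for a, r in avg_rate.items():
    print(f"alpha = {a:5.2f}: average sum rate = {r:.2f} bits/s/Hz")
print(f"alpha_star = {alpha_star:.3f}")
```

Reusing each channel draw across all values of $\alpha$ makes the comparison paired, which reduces the Monte Carlo noise in the differences between curves.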

RZF Sum Rate vs Regularization $\alpha$

Sweep the regularization parameter $\alpha$ and observe the sum rate. The vertical dashed line marks the optimal $\alpha^{\star} = K\sigma^2/P_t$. Compare the sum rate at $\alpha = 0$ (ZF) and $\alpha \to \infty$ (MRT).

Sum Rate vs $N_t$: MRT, ZF, RZF

Compare the sum rate of MRT, ZF, and RZF (with optimal $\alpha$) as the number of antennas grows. Observe that all three converge in the massive regime but differ significantly at moderate antenna counts.

MRT vs ZF vs RZF — Summary

| Property | MRT | ZF | RZF (MMSE) |
|---|---|---|---|
| Precoding vector | $\mathbf{h}_k/\lVert\mathbf{h}_k\rVert$ | $[\mathbf{H}^{H}(\mathbf{H}\mathbf{H}^{H})^{-1}]_{:,k}$ (normalised) | $[\mathbf{H}^{H}(\mathbf{H}\mathbf{H}^{H} + \alpha\mathbf{I})^{-1}]_{:,k}$ (normalised) |
| Interference | Nonzero (ignored) | Zero | Small (controlled) |
| Noise amplification | None | Severe when $K \to N_t$ | Moderate (regularized) |
| Complexity | $O(N_t K)$ | $O(N_t K^2 + K^3)$ | $O(N_t K^2 + K^3)$ |
| Best regime | $N_t \gg K$, low SNR | $N_t \gg K$, high SNR | All regimes |
| Requires | Channel vectors | Full CSI + inversion | Full CSI + inversion + $\alpha$ |

Efficient RZF Precoder Computation

Input: Channel matrix $\mathbf{H} \in \mathbb{C}^{K \times N_t}$, regularization $\alpha > 0$, power budget $P_t$
1. Compute Gram matrix: $\mathbf{G} = \mathbf{H}\mathbf{H}^{H} + \alpha \mathbf{I}_{K}$  // $O(N_t K^{2})$
2. Cholesky factorisation: $\mathbf{G} = \mathbf{L}\mathbf{L}^H$  // $O(K^{3})$
3. Solve $\mathbf{L}\mathbf{L}^H \mathbf{B} = \mathbf{I}_{K}$ for $\mathbf{B} = \mathbf{G}^{-1}$  // $O(K^{3})$ via forward and back substitution
4. Form unnormalised precoders: $\tilde{\mathbf{W}} = \mathbf{H}^{H} \mathbf{B}$  // $O(N_t K^{2})$
5. Normalise: $\mathbf{v}_{k} = \tilde{\mathbf{v}}_k / \|\tilde{\mathbf{v}}_k\|$ for $k = 1, \ldots, K$
6. Allocate power: $p_k = P_t/K$ (equal allocation)
Output: Precoding vectors $\mathbf{v}_{1}, \ldots, \mathbf{v}_{K}$ and powers $p_1, \ldots, p_{K}$

Complexity: $O(N_t K^{2} + K^{3})$, dominated by the matrix-matrix product in step 1 and the Cholesky factorisation in step 2. This is feasible for real-time operation with $K \leq 64$ and $N_t \leq 256$ on modern DSP hardware.

Using the matrix inversion lemma, one can equivalently compute via the $N_t \times N_t$ matrix $\mathbf{H}^{H}\mathbf{H} + \alpha \mathbf{I}_{N_t}$, which is preferred when $K > N_t$ (rare in practice).
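The Cholesky-based procedure might look as follows in NumPy. This is a sketch under assumptions, not a tuned implementation: the function name `rzf_precoder_cholesky` and the test values are hypothetical, and the code solves $\mathbf{G}\mathbf{X} = \mathbf{H}$ with the two triangular factors instead of forming $\mathbf{B} = \mathbf{G}^{-1}$ explicitly, which fuses steps 3 and 4. `np.linalg.solve` does not exploit the triangular structure; a dedicated triangular solver such as `scipy.linalg.solve_triangular` would be the natural substitute where SciPy is available.

```python
import numpy as np

def rzf_precoder_cholesky(H, alpha, Pt):
    """Unit-norm RZF precoders (columns of V) and equal powers p, via Cholesky."""
    K, Nt = H.shape
    G = H @ H.conj().T + alpha * np.eye(K)    # step 1: regularized Gram matrix
    L = np.linalg.cholesky(G)                 # step 2: G = L L^H, L lower triangular
    Y = np.linalg.solve(L, H)                 # forward solve: L Y = H
    X = np.linalg.solve(L.conj().T, Y)        # backward solve: L^H X = Y, so X = G^{-1} H
    W = X.conj().T                            # steps 3-4 fused: W = H^H G^{-1}
    V = W / np.linalg.norm(W, axis=0, keepdims=True)   # step 5: normalise columns
    p = np.full(K, Pt / K)                    # step 6: equal power allocation
    return V, p

# Hypothetical usage: K = 8 users, N_t = 16 antennas, 15 dB SNR with sigma^2 = 1.
rng = np.random.default_rng(0)
K, Nt = 8, 16
Pt = 10 ** 1.5
alpha = K * 1.0 / Pt
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)
V, p = rzf_precoder_cholesky(H, alpha, Pt)

# Cross-check against the explicit inverse.
W_ref = H.conj().T @ np.linalg.inv(H @ H.conj().T + alpha * np.eye(K))
V_ref = W_ref / np.linalg.norm(W_ref, axis=0, keepdims=True)
print(np.allclose(V, V_ref))
```

Because $\alpha > 0$ makes $\mathbf{G}$ Hermitian positive definite, the Cholesky factorisation exists even when $\mathbf{H}\mathbf{H}^{H}$ itself is close to singular.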

⚠️Engineering Note

Estimating $\alpha$ in Practice

The theoretical optimum $\alpha^{\star} = K\sigma^2/P_t$ assumes i.i.d. Rayleigh fading with perfect CSI. In practice:

  • Noise variance estimation: $\sigma^2$ is estimated from noise-only subcarriers or the off-diagonal elements of the received signal covariance. A 1–2 dB error in $\hat{\sigma}^2$ shifts $\alpha$ by the same factor.

  • Correlated channels: With spatial correlation, the optimal $\alpha$ depends on the eigenvalue spread of $\mathbf{H}\mathbf{H}^{H}$. A practical rule is to use $\alpha = \text{tr}(\mathbf{H}\mathbf{H}^{H}) \cdot \sigma^2/(K\,P_t)$.

  • Imperfect CSI: When the channel is estimated with error variance $\sigma_e^2$, the effective regularization should be increased: $\alpha_{\text{eff}} = K(\sigma^2 + P_t\sigma_e^2)/P_t$.
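The three rules above translate into one-line helpers. The sketch below simply transcribes the formulas as stated; the function names (`alpha_iid`, `alpha_trace_rule`, `alpha_imperfect_csi`) and the example values ($K = 8$, 15 dB SNR, $\sigma_e^2 = 0.01$) are assumptions chosen for illustration.

```python
import numpy as np

def alpha_iid(K, sigma2, Pt):
    """Theoretical optimum for i.i.d. Rayleigh fading with perfect CSI."""
    return K * sigma2 / Pt

def alpha_trace_rule(H, sigma2, Pt):
    """Trace-based rule for correlated channels, as stated in the text."""
    K = H.shape[0]
    return np.trace(H @ H.conj().T).real * sigma2 / (K * Pt)

def alpha_imperfect_csi(K, sigma2, Pt, sigma_e2):
    """Increased regularization when the channel estimate has error variance sigma_e2."""
    return K * (sigma2 + Pt * sigma_e2) / Pt

K, sigma2 = 8, 1.0
Pt = sigma2 * 10 ** (15 / 10)                            # 15 dB SNR

a_nominal = alpha_iid(K, sigma2, Pt)
a_biased = alpha_iid(K, sigma2 * 10 ** (1 / 10), Pt)     # noise variance overestimated by 1 dB
shift_db = 10 * np.log10(a_biased / a_nominal)           # alpha shifts by the same 1 dB

a_csi = alpha_imperfect_csi(K, sigma2, Pt, sigma_e2=0.01)
print(a_nominal, shift_db, a_csi)
```

Since $\alpha$ is linear in $\sigma^2$, an estimation error of $x$ dB in the noise variance moves $\alpha$ by exactly $x$ dB, which is what the `shift_db` line illustrates.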

Historical Note: The MMSE Precoding Lineage

2003–2012

The idea of regularized channel inversion appeared independently in several groups around 2003–2005. Joham, Utschick, and Nossek (2005) derived it from the MMSE criterion for the transmit signal. Peel, Hochwald, and Swindlehurst (2005) approached it from the "vector perturbation" perspective, showing that linear regularized inversion is the first step toward nonlinear precoding. The large-system analysis by Wagner, Couillet, Debbah, and Slock (2012) provided the deterministic equivalent that made the optimal $\alpha$ analytically tractable in the massive MIMO regime.

Quick Check

What happens to the RZF precoding matrix as $\alpha \to \infty$?

It converges to the ZF precoder

It converges to scaled MRT (conjugate beamforming)

It converges to the identity matrix

It diverges

Regularized Zero-Forcing (RZF)

Linear precoding with regularization: $\mathbf{W} = \mathbf{H}^{H}(\mathbf{H}\mathbf{H}^{H} + \alpha\mathbf{I})^{-1}$. Bridges MRT ($\alpha \to \infty$) and ZF ($\alpha = 0$). Also called MMSE precoding. Optimal $\alpha = K\sigma^2/P_t$.

Related: Maximum Ratio Transmission (MRT), Zero-Forcing (ZF) Precoding

Regularization Parameter

A positive scalar $\alpha$ added to the diagonal of a matrix before inversion to improve numerical conditioning and balance noise amplification against residual interference. In RZF precoding, $\alpha$ controls the MRT–ZF tradeoff.

Key Takeaway

RZF/MMSE precoding is the practical workhorse of MU-MIMO. With the regularization $\alpha^{\star} = K\sigma^2/P_t$, which is asymptotically optimal under i.i.d. Rayleigh fading with equal power allocation, it performs at least as well as MRT and ZF at any SNR and loading. It dominates MRT at high SNR, dominates ZF at high loading, and matches each in its own optimal regime.