Achievable Rates with ZF and MMSE

Beyond MRC: Interference Suppression

MRC treats each user independently, ignoring multi-user interference. When $K$ is not negligible compared to $N_t$, this interference dominates and the MRC rate saturates. Two alternatives exploit the excess spatial degrees of freedom to suppress interference:

  • Zero Forcing (ZF): Projects each user's signal onto the subspace orthogonal to all other users' estimated channels. This eliminates interference completely (given perfect estimates) but amplifies noise because the projection reduces the effective array dimension.

  • MMSE (Regularized ZF): Adds a regularization term that balances interference suppression against noise amplification. This is optimal among linear receivers in the MSE sense.

Both require inverting a $K \times K$ matrix, but the payoff is dramatic: the per-user rate with ZF/MMSE can be substantially higher than MRC, especially when $K$ approaches $N_t$.


Definition:

Zero-Forcing Combining Vector

Let $\hat{\mathbf{H}} = [\hat{\mathbf{H}}_1, \ldots, \hat{\mathbf{H}}_{K}] \in \mathbb{C}^{N_t \times K}$ be the matrix of channel estimates. The ZF combining vector for user $k$ is the $k$-th column of

$$\mathbf{v}^{\text{ZF}} = \hat{\mathbf{H}} \left(\hat{\mathbf{H}}^H \hat{\mathbf{H}}\right)^{-1},$$

i.e., $\mathbf{v}_{k}^{\text{ZF}} = \hat{\mathbf{H}} \left(\hat{\mathbf{H}}^H \hat{\mathbf{H}}\right)^{-1} \mathbf{e}_k$, where $\mathbf{e}_k$ is the $k$-th standard basis vector.

ZF requires $N_t \geq K$ for the pseudo-inverse to exist. In massive MIMO, $N_t \gg K$ is the typical operating regime, so this condition is easily satisfied.
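A minimal NumPy sketch of this definition, assuming a randomly drawn matrix stands in for real channel estimates. The function name `zf_combiner` and the sizes are illustrative choices, not part of the text. It checks the defining property that the combiner applied to the estimated channels gives the identity, so each user sees no interference from the estimated part of the other users' channels.

```python
import numpy as np

def zf_combiner(H_hat: np.ndarray) -> np.ndarray:
    """Return V = H_hat (H_hat^H H_hat)^{-1}; column k is the ZF vector of user k."""
    gram = H_hat.conj().T @ H_hat                      # K x K Gram matrix
    # The Gram matrix is Hermitian, so V^H = gram^{-1} H_hat^H.
    # Solving a linear system avoids forming the explicit inverse.
    return np.linalg.solve(gram, H_hat.conj().T).conj().T

rng = np.random.default_rng(0)
n_t, k_users = 64, 8                                   # illustrative sizes (assumption)
H_hat = (rng.standard_normal((n_t, k_users))
         + 1j * rng.standard_normal((n_t, k_users))) / np.sqrt(2)

V = zf_combiner(H_hat)
response = V.conj().T @ H_hat                          # ideally the K x K identity
print("max deviation from identity:", np.abs(response - np.eye(k_users)).max())
```

Solving against the Gram matrix rather than inverting it is a numerical convenience only; the result is the same matrix as in the definition.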

Definition:

MMSE (Regularized ZF) Combining Vector

The MMSE combining vector for user kk is

$$\mathbf{v}_{k}^{\text{MMSE}} = \left(\sum_{j=1}^{K} {P_t}_{j} \hat{\mathbf{H}}_j \hat{\mathbf{H}}_j^H + \mathbf{Z}\right)^{-1} \sqrt{{P_t}_{k}} \, \hat{\mathbf{H}}_k,$$

where $\mathbf{Z} = \sum_{j=1}^{K} {P_t}_{j} (\beta_{j} - \gamma_j) \mathbf{I} + \sigma^2 \mathbf{I}$ accounts for the estimation error variance and noise.

In the limit $\sigma^2 \to 0$ with perfect estimation ($\gamma_k = \beta_{k}$), the MMSE combiner reduces to ZF up to a per-user scaling, which does not change the SINR. The regularization term $\mathbf{Z}$ limits noise amplification by keeping the matrix inversion well-conditioned.
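The definition and its ZF limit can be checked numerically. The sketch below is an illustration under stated assumptions: random channel estimates, equal powers, $\gamma_k = \beta_k$, and a very small $\sigma^2$; the helper names `mmse_combiner` and `cosine` are hypothetical. It measures how closely each MMSE vector aligns with the corresponding ZF vector.

```python
import numpy as np

def mmse_combiner(H_hat, p, beta, gamma, sigma2):
    """Columns are v_k = (sum_j p_j h_j h_j^H + Z)^{-1} sqrt(p_k) h_k."""
    n_t = H_hat.shape[0]
    z = np.sum(p * (beta - gamma)) + sigma2            # Z = z * I (error variance + noise)
    A = (H_hat * p) @ H_hat.conj().T + z * np.eye(n_t) # sum_j p_j h_j h_j^H + Z
    return np.linalg.solve(A, H_hat * np.sqrt(p))

def cosine(a, b):
    """Magnitude of the normalized inner product between two complex vectors."""
    return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(1)
n_t, k_users = 32, 4                                   # illustrative sizes (assumption)
H_hat = (rng.standard_normal((n_t, k_users))
         + 1j * rng.standard_normal((n_t, k_users))) / np.sqrt(2)
p = np.ones(k_users)
beta = np.ones(k_users)

# Perfect estimation (gamma = beta) and nearly vanishing noise.
V_mmse = mmse_combiner(H_hat, p, beta, gamma=beta, sigma2=1e-6)
V_zf = H_hat @ np.linalg.inv(H_hat.conj().T @ H_hat)

alignment = [cosine(V_mmse[:, k], V_zf[:, k]) for k in range(k_users)]
print("alignment with ZF per user:", alignment)
```

An alignment close to 1 for every user indicates that the two combiners point in the same direction, differing only by a scale factor.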


Theorem: Closed-Form UatF Rate with ZF Combining

Under i.i.d. Rayleigh fading with MMSE estimation and no pilot contamination, the UatF achievable rate for user $k$ with ZF combining is

$$R_k^{\text{ZF}} = \log_2\!\left(1 + \frac{(N_t - K) \, {P_t}_{k} \, \gamma_k}{\sum_{j=1}^{K} {P_t}_{j} (\beta_{j} - \gamma_j) + \sigma^2}\right),$$

valid for $N_t > K$.

Comparing with the MRC formula, two changes stand out:

  1. The array gain is $N_t - K$ instead of $N_t$. ZF spends $K - 1$ degrees of freedom on nulling the other users, which leaves an $(N_t - K + 1)$-dimensional subspace for coherent combining; averaging over the fading in the UatF bound then gives an effective gain of $N_t - K$. When $N_t \gg K$, the loss is negligible.

  2. The denominator contains $\beta_{j} - \gamma_j$ (the estimation error variance) instead of $\beta_{j}$ (the full channel power). ZF eliminates the component of interference along the estimated channels, so only the part due to estimation error remains. With perfect estimation ($\gamma_j = \beta_{j}$), the interference vanishes entirely and the denominator is just $\sigma^2$.
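The two changes can be made concrete with a small calculation. This is a sketch only: the ZF function follows the theorem, while the MRC function assumes the counterpart expression implied by the comparison (array gain $N_t$, full channel power $\beta_j$ in the denominator). The function names and parameter values are illustrative.

```python
import math

def zf_rate(k, n_t, p, beta, gamma, sigma2):
    """UatF rate (bit/s/Hz) of user k with ZF combining, per the closed form."""
    num_users = len(p)
    if n_t <= num_users:
        raise ValueError("the ZF closed form requires n_t > K")
    # Only the estimation-error part of each channel remains as interference.
    denom = sum(pj * (bj - gj) for pj, bj, gj in zip(p, beta, gamma)) + sigma2
    return math.log2(1.0 + (n_t - num_users) * p[k] * gamma[k] / denom)

def mrc_rate(k, n_t, p, beta, gamma, sigma2):
    """Assumed MRC counterpart: gain n_t, full channel power in the denominator."""
    denom = sum(pj * bj for pj, bj in zip(p, beta)) + sigma2
    return math.log2(1.0 + n_t * p[k] * gamma[k] / denom)

# Illustrative values (assumptions): 64 antennas, 8 users, unit power and path loss.
num_users = 8
p = [1.0] * num_users
beta = [1.0] * num_users
gamma = [0.9] * num_users

r_zf = zf_rate(0, 64, p, beta, gamma, sigma2=1.0)
r_mrc = mrc_rate(0, 64, p, beta, gamma, sigma2=1.0)
print(f"ZF: {r_zf:.3f} bit/s/Hz, MRC: {r_mrc:.3f} bit/s/Hz")
```

With these values the ZF SINR is $56 \cdot 0.9 / 1.8 = 28$ and the MRC SINR is $64 \cdot 0.9 / 9 = 6.4$: the smaller array gain of ZF is more than offset by the smaller denominator.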


Theorem: UatF Rate with MMSE Combining (Large-System Approximation)

Under i.i.d. Rayleigh fading in the large-system limit ($N_t, K \to \infty$ with $K/N_t \to \alpha \in (0,1)$), the UatF achievable rate with MMSE combining converges to

$$R_k^{\text{MMSE}} \to \log_2\!\left(1 + {P_t}_{k} \gamma_k \, m_k(-\sigma^2)\right),$$

where $m_k(z)$ is the Stieltjes transform of the asymptotic eigenvalue distribution of the interference-plus-noise covariance matrix, evaluated at $z = -\sigma^2$. For equal power and equal path loss, this simplifies to

$$R_k^{\text{MMSE}} \approx \log_2\!\left(1 + \frac{N_t \, P_t \, \gamma}{K \, P_t (\beta - \gamma) + \sigma^2} \cdot \frac{1}{1 + \alpha \cdot \delta}\right),$$

where $\delta$ is the unique positive solution to $\delta = P_t \gamma / (\sigma^2 + K P_t(\beta - \gamma) / N_t + P_t \gamma \, \alpha \, \delta / (1 + \delta))$.

The MMSE rate interpolates between MRC (when $\sigma^2$ dominates) and ZF (when interference dominates). The random matrix theory machinery is needed because the MMSE combiner couples all users through the matrix inversion, making the per-user SINR depend on the joint statistics of all channels.

For practical purposes, when $N_t / K \geq 5$, the MMSE rate is very close to the ZF rate, and the simpler ZF formula suffices for system design.


Historical Note: Random Matrix Theory Enters Wireless Communications

1996-2013

The application of random matrix theory (RMT) to wireless communications began with the landmark papers of Telatar (1999) and Foschini (1996), who showed that MIMO capacity scales linearly with the minimum of $N_t$ and $N_r$. The Marchenko-Pastur law (1967), a cornerstone of the random matrix theory that first arose in nuclear physics, became an essential tool for analyzing large MIMO systems.

Hoydis, ten Brink, and Debbah (2013) brought RMT-based analysis to massive MIMO, deriving deterministic equivalents for the SINR under MMSE processing. Their results showed that RMT predictions are accurate even for modest system sizes ($N_t = 64$, $K = 16$), vindicating the large-system approach for practical 5G design.


Historical Note: From Interference Cancellation to Zero Forcing

1990s-2010s

Zero-forcing detectors have a long history in communications, dating back to the equalization of inter-symbol interference in single-antenna channels. The extension to multi-user MIMO was developed in the early 2000s as part of the BLAST architecture at Bell Labs. The key realization for massive MIMO was that ZF (previously considered impractical due to the matrix inversion) becomes nearly optimal and computationally feasible when $N_t \gg K$, because the Gram matrix $\hat{\mathbf{H}}^H \hat{\mathbf{H}} / N_t$ converges to a well-conditioned diagonal matrix.

MRC vs. ZF vs. MMSE Combining

| Property | MRC | ZF | MMSE |
|---|---|---|---|
| Combining vector | $\hat{\mathbf{H}}_k$ | $\hat{\mathbf{H}}(\hat{\mathbf{H}}^H \hat{\mathbf{H}})^{-1} \mathbf{e}_k$ | $(\hat{\mathbf{H}}\hat{\mathbf{H}}^H + \alpha \mathbf{I})^{-1} \hat{\mathbf{H}}_k$ |
| Array gain | $N_t$ | $N_t - K$ | $\approx N_t - K$ (large system) |
| Interference handling | None (interference-limited) | Fully eliminated (estimated part) | Optimally balanced |
| SINR denominator | $\sum_j {P_t}_{j} \beta_{j} + \sigma^2$ | $\sum_j {P_t}_{j}(\beta_{j} - \gamma_j) + \sigma^2$ | Implicitly via Stieltjes transform |
| Complexity per symbol | $O(N_t K)$ | $O(N_t K + K^{3})$ | $O(N_t K + K^{3})$ |
| Best regime | $N_t \gg K$ | $N_t > 2K$ | All regimes |

ZF and MMSE SINR vs. Number of Antennas

Compare ZF and MMSE achievable rates as a function of $N_t$. Notice how ZF loses degrees of freedom (the curve starts at $N_t = K + 1$) while MMSE gracefully handles the transition.


Example: When Does ZF Outperform MRC?

For equal power, equal path loss, and perfect estimation ($\gamma_k = \beta$), find the condition on $N_t$, $K$, and $\text{SNR} = P_t \beta / \sigma^2$ under which $R_k^{\text{ZF}} > R_k^{\text{MRC}}$.
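One way to work this out is sketched below. It assumes the UatF MRC expression summarized in the comparison table (array gain $N_t$, denominator $\sum_j {P_t}_{j} \beta_{j} + \sigma^2$) together with the ZF closed form, both specialized to equal power, equal path loss, and $\gamma = \beta$.

```latex
\begin{align*}
\text{SINR}_k^{\text{ZF}} &= \frac{(N_t - K)\, P_t\, \beta}{\sigma^2} = (N_t - K)\,\text{SNR}, \\
\text{SINR}_k^{\text{MRC}} &= \frac{N_t\, P_t\, \beta}{K P_t \beta + \sigma^2} = \frac{N_t\,\text{SNR}}{K\,\text{SNR} + 1}, \\
R_k^{\text{ZF}} > R_k^{\text{MRC}}
  &\iff (N_t - K)(K\,\text{SNR} + 1) > N_t \\
  &\iff K\,(N_t - K)\,\text{SNR} > K \\
  &\iff \text{SNR} > \frac{1}{N_t - K}.
\end{align*}
```

For instance, with $N_t = 64$ and $K = 8$ the threshold is $1/56$, roughly $-17.5$ dB, so under these assumptions ZF is preferable except at very low SNR.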

Sum Rate Comparison: MRC vs. ZF vs. MMSE

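Compare the sum spectral efficiency of all three combining schemes as a function of the number of users $K$ for a fixed $N_t$. MRC saturates at high $K$, while ZF and MMSE keep growing over a much wider range of $K$; the ZF sum rate eventually drops as $K$ approaches $N_t$, because its array gain $N_t - K$ vanishes.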
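A rough numerical version of this comparison for MRC and ZF, using the closed-form SINRs specialized to equal power, equal path loss, and perfect estimation. The array size, the SNR of $-10$ dB, and the function name `sum_rates` are assumptions chosen for illustration; MMSE is omitted because its rate has no simple closed form here.

```python
import math

def sum_rates(n_t, k_users, snr):
    """Return (MRC, ZF) sum rates in bit/s/Hz for equal power and perfect estimation."""
    sinr_mrc = n_t * snr / (k_users * snr + 1.0)   # interference grows with K
    sinr_zf = (n_t - k_users) * snr                # array gain shrinks with K (needs n_t > K)
    return (k_users * math.log2(1.0 + sinr_mrc),
            k_users * math.log2(1.0 + sinr_zf))

n_t = 128                    # assumed number of antennas
snr = 10 ** (-10 / 10)       # assumed per-user SNR of -10 dB

for k_users in (1, 10, 40, 80, 120, 127):
    mrc, zf = sum_rates(n_t, k_users, snr)
    print(f"K={k_users:3d}  MRC={mrc:7.2f}  ZF={zf:7.2f}")
```

Sweeping $K$ this way shows where the two curves cross: ZF leads while $\text{SNR} > 1/(N_t - K)$ and falls behind MRC once $K$ gets close enough to $N_t$.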


Common Mistake: Forgetting the Degrees-of-Freedom Loss in ZF

Mistake:

Writing the ZF SINR with array gain $N_t$ instead of $N_t - K$. This overestimates the ZF rate, especially when $K$ is not negligible compared to $N_t$.

Correction:

ZF projects onto the $(N_t - K + 1)$-dimensional orthogonal complement of the other users' channels. The effective array gain is $N_t - K$, not $N_t$. Always use the correct formula:

$$\text{SINR}_k^{\text{ZF}} = \frac{(N_t - K) \, {P_t}_{k} \, \gamma_k}{\sum_j {P_t}_{j}(\beta_{j} - \gamma_j) + \sigma^2}.$$

Regularized Zero Forcing (RZF)

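A linear combining/precoding scheme that adds a regularization (Tikhonov) term to the ZF pseudo-inverse: $\mathbf{v}_{k} = (\hat{\mathbf{H}} \hat{\mathbf{H}}^H + \alpha \mathbf{I})^{-1} \hat{\mathbf{H}}_k$, where $\alpha > 0$ is the regularization parameter (not the load $K/N_t$). With perfect estimation and a total power $P_t$ split equally among the $K$ users, the optimal regularization is $\alpha = K \sigma^2 / P_t$, which recovers the MMSE combiner; with equal per-user power $P_t$ and equal path loss, as in the MMSE definition, the matching choice is $\alpha = \sigma^2 / P_t + K(\beta - \gamma)$. Also called MMSE combining.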

Related: Effective SINR

Inverse Wishart Distribution

If $\mathbf{X} \in \mathbb{C}^{n \times p}$ has i.i.d. $\mathcal{CN}(0,1)$ entries with $n > p$, then $(\mathbf{X}^H \mathbf{X})^{-1}$ follows an inverse Wishart distribution. The diagonal entries have mean $1/(n - p)$, which determines the ZF combining norm and hence the degrees-of-freedom loss.

Related: Regularized Zero Forcing (RZF)
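The mean $1/(n - p)$ stated in the Inverse Wishart entry can be checked with a quick Monte Carlo experiment. This is a sketch under assumed sizes ($n = 32$, $p = 8$) and an assumed trial count; it averages the diagonal entries of $(\mathbf{X}^H \mathbf{X})^{-1}$ over many independent draws.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, trials = 32, 8, 2000                     # illustrative sizes (assumption)

# Stack of independent n x p matrices with i.i.d. CN(0, 1) entries.
X = (rng.standard_normal((trials, n, p))
     + 1j * rng.standard_normal((trials, n, p))) / np.sqrt(2)

gram = np.swapaxes(X, 1, 2).conj() @ X         # trials x p x p Gram matrices
inv = np.linalg.inv(gram)                      # batched inverse

# Average of all diagonal entries across trials (real-valued up to rounding).
diag_mean = np.real(np.trace(inv, axis1=1, axis2=2)).mean() / p
print("empirical mean:", diag_mean, " theory 1/(n - p):", 1.0 / (n - p))
```

The empirical average should land close to $1/(n - p)$ rather than $1/n$, which is the origin of the $N_t - K$ factor in the ZF rate.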

Quick Check

With perfect channel estimation ($\gamma_k = \beta_{k}$ for all $k$), what does the ZF denominator reduce to?

  • $\sum_j {P_t}_{j} \beta_{j} + \sigma^2$

  • $\sigma^2$ only

  • $\sum_j {P_t}_{j} \gamma_j + \sigma^2$

  • Zero: ZF achieves infinite rate with perfect CSI

Key Takeaway

ZF combining eliminates inter-user interference at the cost of spatial degrees of freedom, yielding an effective array gain of $N_t - K$. MMSE combining optimally balances interference suppression and noise enhancement. Under equal power, equal path loss, and perfect estimation, both achieve strictly higher rates than MRC whenever $\text{SNR} > 1/(N_t - K)$. The gap between ZF and MMSE is small when $N_t / K \geq 5$.