ML Detection: The Benchmark

Why Start With ML

Maximum likelihood (ML) detection is the optimal detector in the sense that it minimizes the symbol error probability. It is also computationally infeasible for an OTFS frame of realistic size: the search space is ∣X∣MN|\mathcal{X}|^{MN}, where ∣X∣|\mathcal{X}| is the QAM constellation size and MNMN is the number of data symbols. For 4-QAM on an MN=1000MN = 1000 frame, this is 410004^{1000} β€” astronomical.

So why study ML? Because it defines the performance benchmark against which all practical detectors are measured. It also motivates the structure of the suboptimal schemes: MMSE (Section 2) linearizes ML; message-passing (Section 3) approximates ML with a factor-graph decomposition; LCD (Section 4) exploits the sparse structure to reduce complexity.

Definition:

ML Detection of the OTFS Frame

Given the received DD vector yDD∈CMN\mathbf{y}_{DD} \in \mathbb{C}^{MN}, the channel matrix HDD∈CMNΓ—MN\mathbf{H}_{DD} \in \mathbb{C}^{MN \times MN}, and the QAM constellation X\mathcal{X}, the maximum-likelihood detector is x^MLβ€…β€Š=β€…β€Šarg⁑min⁑x∈XMNβˆ₯yDDβˆ’HDD xβˆ₯2.\hat{\mathbf{x}}_{\text{ML}} \;=\; \arg\min_{\mathbf{x} \in \mathcal{X}^{MN}} \left\|\mathbf{y}_{DD} - \mathbf{H}_{DD}\,\mathbf{x}\right\|^2. The objective minimizes the Euclidean distance to the received signal over all possible transmit DD frames. Under i.i.d. Gaussian noise, ML coincides with MAP detection.

Theorem: ML Complexity Is Exponential in the Frame Size

The ML problem x^=arg⁑min⁑x∈XMNβˆ₯yβˆ’HDDxβˆ₯2\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathcal{X}^{MN}} \|\mathbf{y} - \mathbf{H}_{DD}\mathbf{x}\|^2 has complexity O(∣X∣MN)O(|\mathcal{X}|^{MN}) in the worst case. For ∣X∣=4|\mathcal{X}| = 4 (QPSK) and MN=100MN = 100, this is 4100β‰ˆ10604^{100} \approx 10^{60} β€” infeasible on any computer.

The data vector has MNMN entries, each from a size-∣X∣|\mathcal{X}| alphabet. Brute-force enumeration is therefore exponential. Sphere decoding can reduce average complexity (often called O(McN)O(M^{cN}) for moderate grids) but remains exponential in the worst case.

Key Takeaway

ML is a benchmark, not a practical detector. The MNMN-dimensional joint-constellation search makes ML prohibitively expensive for any realistic OTFS frame. Practical detectors β€” MMSE, MP, LCD β€” are designed as computationally feasible approximations. The challenge of this chapter is how closely these approximations can track the ML benchmark.

Sparsity Is the Enabling Structure

The ML objective βˆ₯yβˆ’HDDxβˆ₯2\|\mathbf{y} - \mathbf{H}_{DD}\mathbf{x}\|^2 involves the channel matrix HDD\mathbf{H}_{DD}, which is sparse (Chapter 4). Each DD cell yDD[β„“,k]y_{DD}[\ell, k] couples to at most PP transmit cells β€” the PP paths shifted to (β„“,k)(\ell, k). This sparse coupling is what makes message-passing (Section 3) tractable: the factor graph degree is bounded by PP, which is small (≀20\leq 20) even though the graph has MNMN nodes.

In contrast, OFDM under high Doppler has a dense TF channel matrix, with each subcarrier coupled to all others via ICI. Message-passing on a dense factor graph is intractable. The DD domain's sparsity is the structural reason OTFS detection is practical.

Example: ML Detection for M=N=2M = N = 2

A 2Γ—2 OTFS frame uses QPSK symbols (∣X∣=4|\mathcal{X}| = 4). The channel has 1 LOS path (perfect identity). Compute the ML search space size and the received signal structure.

ML Complexity as MNMN Grows

Plot log⁑10∣X∣MN\log_{10}|\mathcal{X}|^{MN} against MNMN for QPSK, 16-QAM, 64-QAM. At MN=100MN = 100, QPSK needs 106010^{60} candidates; 64-QAM needs 1018010^{180}. Compare with practical computer limits (~101810^{18}) to see where ML becomes infeasible. The cutoff is around MN=30MN = 30 for QPSK, MN=15MN = 15 for 64-QAM.

Parameters
100

The OTFS Detector Landscape

The OTFS Detector Landscape
The space of OTFS detectors organized by complexity and performance. ML (top-right): optimal but exponential complexity. Sphere decoding: sub-exponential expected, exponential worst-case. Message passing (MP-OTFS): near-ML at O(P2MN)O(P^2 MN) per iteration. LMMSE: linear, O(MNlog⁑(MN))O(MN\log(MN)) via the 2D DFT diagonalization. LCD: linear with clever cross-domain structure, O(MNlog⁑(MN))O(MN\log(MN)). Practical OTFS receivers sit on the efficient frontier of this landscape.
⚠️Engineering Note

Realtime Complexity Budget

For a 5G NR-aligned OTFS frame with MN=8000MN = 8000 and 100 frames/sec, the per-frame detection budget on commodity silicon is roughly 10810^8 floating-point operations. This translates to:

  • MMSE: O(MNlog⁑(MN))∼105O(MN \log(MN)) \sim 10^5 ops. Feasible.
  • MP: O(P2β‹…MNβ‹…Titer)∼107O(P^2 \cdot MN \cdot T_{\text{iter}}) \sim 10^7 ops. Feasible.
  • ML: O(4MN)O(4^{MN}) β€” completely infeasible.

The gap between MMSE and ML in BER is small (typically 11–33 dB) for moderate constellations. The gap closes to <1< 1 dB with message-passing. The remaining gap is bridged only by MAP-like lattice search, which is impractical.

Detection in OTFS is a complexity-limited, not a BER-limited, problem. Practical deployments use MMSE or MP; sphere decoding is reserved for small frames (e.g., pilot-only regions in superimposed designs).

Practical Constraints
  • β€’

    MMSE: O(MNlog⁑(MN))O(MN \log(MN)) via 2D FFT β€” ∼10βˆ’5\sim 10^{-5} ops per data symbol

  • β€’

    MP (Chapter 8.3): O(P2MNTiter)O(P^2 MN T_{\text{iter}}) per frame

  • β€’

    Realtime target: ∼108\sim 10^8 ops per frame on GPU-accelerated hardware

πŸ“‹ Ref: O-RAN WG5

Historical Note: Sphere Decoding: An Honest Detour

2000s

Sphere decoding, introduced in the late 1990s by Pohst and adapted to communications by Viterbo, Boutros, and others, gave the illusion of practical ML detection for MIMO systems in the 2000s. Average complexity was claimed to be polynomial for moderate matrix sizes. Subsequent work by JaldΓ©n and Ottersten (2005) clarified that expected complexity is still exponential in the problem dimension, with the polynomial regime valid only at high SNR and small dimensions.

For OTFS, sphere decoding is possible but inefficient: the matrix is MNΓ—MNMN \times MN, large, and the path count PP does not directly translate to the lattice-search speed. The MP-OTFS detector (Section 3) achieves better BER-complexity trade-offs because it exploits the sparse factor-graph structure, not the lattice geometry. This is why OTFS detection literature has converged on MP rather than sphere decoding since Raviteja et al. (2018).

Why This Matters: Link to MIMO Detection

The OTFS detection problem is mathematically similar to MIMO detection: a linear equation y=Hx+w\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w} with a discrete constellation constraint on x\mathbf{x}. MIMO detection algorithms (V-BLAST, sphere decoding, MMSE) can all be adapted to OTFS. The key distinction is that OTFS's channel matrix has far more structure β€” sparse, block-circulant, diagonalizable by the 2D DFT. This structure makes MMSE efficient in OTFS (O(MNlog⁑(MN))O(MN\log(MN))) where it would be O((MN)3)O((MN)^3) for generic MIMO.