Cross-Domain and Iterative Detection

Why Integrate Detection With Decoding

The detectors of Sections 2-4 output hard or soft estimates of the data symbols. In practice, a channel code (LDPC in 5G NR, turbo in LTE) sits on top of these symbols. Iterative detection and decoding (IDD) loops information between the detector and the decoder, so that each pass refines the other. For OTFS, this structure approaches ML performance at LCD-like complexity.

The detector provides bit-level LLRs (log-likelihood ratios) to the decoder, which returns extrinsic LLRs that the detector uses as priors in the next iteration. A handful of outer iterations is typically sufficient. The combined scheme is the standard operating point of 5G NR modems, and it ports directly to OTFS.

Definition:

Iterative Detection and Decoding Loop

The IDD loop for OTFS:

  1. Detection pass: detector produces soft LLRs $\lambda_i$ for each bit of each symbol, given received $\mathbf{y}_{DD}$ and the channel estimate.
  2. Decoding pass: channel decoder (LDPC/Turbo) processes the LLRs and produces extrinsic LLRs $\lambda_i^{\text{ext}}$ (information from the decoder about each bit, excluding the detector's input).
  3. Feedback: the detector uses $\lambda_i^{\text{ext}}$ as a prior for the next iteration.
  4. Stopping: loop until convergence (LLRs stable) or until a maximum iteration count is reached (typically $T = 3$ to $5$).

The final output is the decoded bits after the last decoding pass.
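
The bookkeeping behind steps 1 to 3 can be written compactly. The following is a sketch under an assumed sign convention (a positive LLR favors bit 0) and with assumed symbol names $\lambda_i^{\text{post}}$ and $\lambda_i^{\text{apr}}$ for a block's output a posteriori LLR and its input a priori LLR; neither name appears in the definition above.

```latex
\lambda_i^{\text{post}}
  = \ln\frac{P(b_i = 0 \mid \text{block's observations and priors})}
            {P(b_i = 1 \mid \text{block's observations and priors})},
\qquad
\lambda_i^{\text{ext}} = \lambda_i^{\text{post}} - \lambda_i^{\text{apr}}
```

Each block forwards only what it has learned beyond its own input, so a belief is never fed back to the block that produced it.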

Theorem: The Turbo Principle Applied to OTFS

Let the detector produce LLRs $\lambda_i$ and the decoder produce extrinsic LLRs $\lambda_i^{\text{ext}}$ per bit. Under mild regularity conditions (independent errors after detection, detector-decoder mutual information separation), the iterative scheme converges to a fixed point whose BER matches that of the joint ML detector-decoder within an $\varepsilon$-neighborhood, where $\varepsilon$ shrinks exponentially in the iteration count.

EXIT chart analysis predicts convergence: plot the mutual-information transfer of each block (extrinsic output versus a priori input), with the decoder's axes swapped; the iterated scheme converges iff the detector curve lies above the decoder curve in the relevant region.

The turbo principle is the standard iterative decoding framework from the 1990s (turbo codes, iterative LDPC decoding). It applies directly to OTFS because the detector-decoder split is analogous to the turbo encoder-interleaver-decoder structure. The DD-domain detector is just one of the component decoders in a turbo-like loop.
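
The mutual information plotted on an EXIT chart can be estimated from simulated LLRs. The sketch below is a minimal estimator, assuming the convention that a positive LLR favors bit 0 and that the LLRs are consistent (correctly scaled); the function name `llr_mutual_information` is hypothetical.

```python
import numpy as np

def llr_mutual_information(llr, bits):
    """Estimate I(B; L) in bits from LLRs and the transmitted bits.

    Assumes LLR > 0 favors bit 0 and that the LLRs are consistent,
    as EXIT analysis usually does.
    """
    x = 1.0 - 2.0 * np.asarray(bits, dtype=float)   # bit 0 -> +1, bit 1 -> -1
    llr = np.asarray(llr, dtype=float)
    # log2(1 + exp(-x * llr)), computed stably via logaddexp
    penalty = np.logaddexp(0.0, -x * llr) / np.log(2.0)
    return 1.0 - penalty.mean()

# Usage: uninformative LLRs carry no information, confident correct ones carry ~1 bit.
bits = np.array([0, 1, 0, 1])
print(llr_mutual_information(np.zeros(4), bits))
print(llr_mutual_information(np.array([50.0, -50.0, 50.0, -50.0]), bits))
```

Sweeping the a priori information fed to a block and recording this estimate at its output traces one curve of the chart.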


Key Takeaway

IDD closes the ML gap at near-linear complexity. LDPC + LCD with iterative feedback achieves BER within 0.5 dB of ML at complexity $O(T_{\text{iter}}\,MN\log(MN))$. For the 5G NR physical layer with $T_{\text{iter}} = 3$, this is $\sim 3 \times 10^5$ ops per frame, well within real-time budgets. The full OTFS receiver (CP removal, Wigner, SFFT, IDD) is a $\sim 10^6$-ops-per-frame system, deployable on standard silicon.

OTFS Iterative Detection-Decoding (IDD)

Complexity: $O(T_{\text{outer}} \cdot MN \log(MN))$
Input: Received DD grid $Y_{DD}$, channel estimate $\mathbf{H}_{DD}$, outer iterations $T_{\text{outer}} = 3$
Output: Decoded bits $\hat{\mathbf{b}}$

  1. Initialize prior LLRs $\lambda_i^{\text{ext}} = 0$ for all bits.
  2. for outer iteration $t = 1, \ldots, T_{\text{outer}}$ do
  3.     Detection: compute soft LLRs $\lambda_i$ for each bit, given $Y_{DD}$, $\mathbf{H}_{DD}$, and prior $\lambda_i^{\text{ext}}$ (LCD with soft priors, or MP with prior integration).
  4.     Decoding: LDPC decoder processes all $\lambda_i$ and produces new extrinsic LLRs $\lambda_i^{\text{ext}}$.
  5. end for
  6. Hard-quantize final extrinsic LLRs: $\hat{b}_i = \mathrm{sign}(\lambda_i^{\text{ext}})$.
  7. Return decoded bits $\hat{\mathbf{b}}$.

The inner LDPC decoder itself runs for several iterations (5-10) per outer pass. Total inner-loop complexity: $\sim 10^5$ ops per pass for standard 5G NR code rates. Over $T_{\text{outer}} = 3$ passes, the decoder total is $\sim 3 \times 10^5$ ops, within the $\sim 10^6$ ops-per-frame receiver budget.
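
The loop structure can be exercised end to end on a toy problem. The sketch below is not the LCD or MP detector and not an LDPC decoder: as stand-ins it assumes real BPSK, a soft parallel-interference-cancellation detector, a rate-1/R repetition code, and a hypothetical two-tap circulant channel. All function names are illustrative, a positive LLR is taken to favor bit 0, and the final decision uses the decoder's total (a posteriori) LLR per information bit.

```python
import numpy as np

def soft_pic_detect(y, H, noise_var, prior_llr):
    """Soft parallel interference cancellation for real BPSK.

    Returns LLRs that are extrinsic to each bit's own prior
    (convention: LLR > 0 favors bit 0, mapped to symbol +1).
    """
    x_mean = np.tanh(prior_llr / 2.0)            # soft symbols implied by the priors
    x_var = 1.0 - x_mean ** 2                    # remaining symbol uncertainty
    G = H.T @ H                                  # column correlations
    g = np.diag(G)                               # per-symbol energy ||h_i||^2
    z = H.T @ (y - H @ x_mean) + g * x_mean      # matched filter, other symbols cancelled
    interf = (G ** 2) @ x_var - g ** 2 * x_var   # residual interference from j != i
    return 2.0 * g * z / (interf + g * noise_var)

def repetition_siso(llr_in, K, R):
    """Soft-in/soft-out decoder for a rate-1/R repetition code (LDPC stand-in)."""
    grouped = llr_in.reshape(K, R)
    total = grouped.sum(axis=1, keepdims=True)   # a posteriori LLR per info bit
    extrinsic = total - grouped                  # exclude each bit's own input
    return extrinsic.reshape(-1), total[:, 0]

def idd_decode(y, H, noise_var, perm, K, R, n_outer=3):
    """Run n_outer >= 1 detector/decoder passes and return hard bit decisions."""
    prior = np.zeros(K * R)                      # no prior knowledge at the start
    for _ in range(n_outer):
        det_llr = soft_pic_detect(y, H, noise_var, prior)   # detection pass
        dec_in = np.empty_like(det_llr)
        dec_in[perm] = det_llr                   # deinterleave to code order
        dec_ext, info_llr = repetition_siso(dec_in, K, R)   # decoding pass
        prior = dec_ext[perm]                    # interleave and feed back as prior
    return (info_llr < 0).astype(int)

def transmit(bits, H, perm, R, noise_std, rng):
    """Repeat, interleave, BPSK-map, and pass through the toy channel."""
    coded = np.repeat(bits, R)                   # code order: b0 repeated R times, then b1, ...
    x = 1.0 - 2.0 * coded[perm]                  # channel position p carries coded[perm[p]]
    return H @ x + noise_std * rng.standard_normal(len(x))

rng = np.random.default_rng(0)
K, R = 8, 3
N = K * R
H = np.eye(N) + 0.5 * np.roll(np.eye(N), 1, axis=0)   # hypothetical two-tap circulant channel
perm = rng.permutation(N)
bits = rng.integers(0, 2, K)
y = transmit(bits, H, perm, R, noise_std=0.3, rng=rng)
decoded = idd_decode(y, H, noise_var=0.09, perm=perm, K=K, R=R)
print("bit errors:", int(np.sum(decoded != bits)))
```

The interleaver `perm` spreads the repetitions of each bit across the channel so that the decoder's extrinsic LLRs are only weakly correlated with the interference seen by the detector, which is the same role the interleaver plays in a turbo-style loop.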

IDD vs Non-Iterative: Coded BER Comparison

Compare BER curves for (a) non-iterative: LCD + LDPC, and (b) iterative: LCD + LDPC with 3 outer iterations. At low SNR, IDD shows a gain of $\sim 2$ to $3$ dB; at high SNR, the gap closes as both reach the error floor. This illustrates the turbo-principle gain for OTFS.

Definition:

Cross-Domain Detection

Cross-domain detection leverages complementary information from the TF and DD domains simultaneously. The receiver computes detection results in both domains and combines them:

  1. TF-domain detection: apply OFDM-like per-subcarrier MMSE on the TF grid. Accurate when $|H(f, t)|$ is large; poor when it is small.
  2. DD-domain detection: apply LCD or MP on the DD grid. Accurate when the channel sparsity is effectively exploited.
  3. Fusion: combine TF and DD LLRs as a weighted sum, with weights proportional to per-cell reliability.

The cross-domain fusion is a generalization of the iterative scheme: it lets each domain compensate for the other's weaknesses (TF handles well-conditioned cells; DD handles multipath-averaged cells).
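
The fusion step can be sketched as a normalized weighted sum. This is a minimal illustration, assuming both detectors have already produced LLRs for the same set of bits and that some non-negative reliability score per bit is available for each domain; how those scores are derived is not specified here, and `fuse_llrs` is a hypothetical helper.

```python
import numpy as np

def fuse_llrs(llr_tf, llr_dd, rel_tf, rel_dd, eps=1e-12):
    """Reliability-weighted sum of two LLR estimates of the same bits."""
    rel_tf = np.asarray(rel_tf, dtype=float)
    rel_dd = np.asarray(rel_dd, dtype=float)
    w_tf = rel_tf / (rel_tf + rel_dd + eps)   # weight proportional to TF reliability
    w_dd = 1.0 - w_tf                         # weights sum to one
    return w_tf * np.asarray(llr_tf, dtype=float) + w_dd * np.asarray(llr_dd, dtype=float)

# Usage: a deep fade in TF (reliability 0) hands the decision to the DD estimate.
print(fuse_llrs([2.0, 2.0], [4.0, 4.0], rel_tf=[1.0, 0.0], rel_dd=[1.0, 1.0]))
```

When both reliabilities are zero, this particular sketch falls back to the DD LLR, which is one possible design choice among several.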

When Cross-Domain Matters

Pure TF and pure DD detection have complementary failure modes:

  • TF fails at deep fades where $|H(f, t)|$ is small.
  • DD fails at low SNR where the MP-OTFS algorithm has trouble distinguishing paths from noise.

Cross-domain detection wins when:

  • The channel has both multipath (favoring DD) and deep fades (hurting both — cross-domain LLR averaging helps).
  • The SNR is variable across cells (TF diversity + DD sparsity both needed).

In practice, pure LCD or MP with IDD outperforms cross-domain fusion for most deployment scenarios. Cross-domain is a research topic for rich-scattering environments (e.g., indoor mmWave) where the DD sparsity is less pronounced.

🔧 Engineering Note

Complete OTFS Receiver Compute Budget

A complete OTFS receiver (5G NR-aligned, $MN = 4096$, $P = 8$):

  • CP removal + Wigner (OFDM demod): $O(MN\log M) \sim 10^5$ ops.
  • SFFT: $O(MN\log(MN)) \sim 10^5$ ops.
  • Channel estimation (Chapter 7): $O(P\,MN) \sim 10^4$ ops.
  • LCD detection (3 iter): $O(T_{\text{iter}}\,MN\log(MN)) \sim 3 \times 10^5$ ops.
  • IDD feedback to LDPC (3 iter): $\sim 10^5$ ops.
  • Total: $\sim 10^6$ ops per frame.

At 100 frames/sec, this is $\sim 10^8$ ops/sec, easily handled by OFDM-class modem silicon. The implementation is primarily 2D FFTs and element-wise operations; no exotic hardware is required.

This confirms the deployment argument: OTFS can run on existing 5G modem silicon with firmware changes only.
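
The budget can be checked with a few lines of arithmetic. The sketch below evaluates each big-O term with its hidden constant set to one, so it gives only a rough scale; the split $M = N = 64$ of $MN = 4096$ is an assumption, and the LDPC figure is copied from the note rather than derived.

```python
import math

M, N, P = 64, 64, 8          # M = N = 64 is an assumed split of MN = 4096
T_ITER = 3
MN = M * N

budget = {
    "cp_removal_wigner": MN * math.log2(M),
    "sfft": MN * math.log2(MN),
    "channel_estimation": P * MN,
    "lcd_detection": T_ITER * MN * math.log2(MN),
    "ldpc_feedback": 1e5,    # taken directly from the note, not derived here
}
total = sum(budget.values())
frames_per_sec = 100

for stage, ops in budget.items():
    print(f"{stage:>20s}: {ops:10.0f}")
print(f"{'total per frame':>20s}: {total:10.0f}")
print(f"{'total per second':>20s}: {total * frames_per_sec:10.0f}")
```

With unit constants the per-frame sum stays below the $\sim 10^6$ figure quoted above, which leaves headroom for the constant factors that the big-O terms hide.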

Practical Constraints
  • Per-frame: $\sim 10^6$ ops

  • Per-second (100 frames/s): $\sim 10^8$ ops (modest)

  • Implementable in software on existing OFDM modems

📋 Ref: O-RAN compute budget for eMBB
🎓 CommIT Contribution (2023)

Distributed Message Passing for Cell-Free OTFS

M. Mohammadi, H. Q. Ngo, M. Matthaiou, G. Caire, IEEE Trans. Wireless Communications

The MP-OTFS detector of Raviteja-Viterbo (2018) was originally a single-link algorithm. The CommIT extension — developed by Mohammadi, Ngo, Matthaiou, and Caire for cell-free massive MIMO — is distributed MP-OTFS: the message-passing algorithm runs partly at the distributed access points (near the observations) and partly at the central processing unit (for global consensus across multiple APs and UEs).

The key insight is that the DD factor graph generalizes naturally to multi-AP scenarios: each AP contributes its own set of factor nodes representing its local DD observations, and the CPU aggregates beliefs via a global message-passing step. Per-AP computation is $O(P \cdot MN)$ (same as single-link); CPU aggregation is $O(L \cdot MN)$ for $L$ APs. Total: $O(L \cdot P \cdot MN)$, linear in the system scale.

This distributed MP framework is the receiver-side half of the cell-free OTFS contribution; the pilot-design half (superimposed pilots) is in Chapter 7. Together they deliver the 25-35% throughput gain over OFDM cell-free at vehicular mobility. Full treatment is in Chapter 17.


Why This Matters: Diversity Analysis Makes This Rigorous

The claim that MP and LCD achieve "full diversity $P$" is made rigorous in Chapter 9. There, Surabhi-Chockalingam show that the pairwise error probability of any ML-like detector in OTFS decays as $\mathrm{SNR}^{-d}$, where $d$ equals the product of resolvable delay and Doppler bins occupied by the channel. For an integer-Doppler channel with $P$ distinct $(\ell_i, k_i)$ pairs, $d = P$, matching the MP / LCD performance claims. The diversity-order result is the information-theoretic justification of the detectors in this chapter.
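
To make the decay rate concrete, the scaling law can be read as a slope on a log-log BER plot. The following is a short worked example, valid only in the high-SNR regime where the $\mathrm{SNR}^{-d}$ law holds, using $P = 8$ from the engineering note above as an illustrative value.

```latex
\mathrm{PEP} \propto \mathrm{SNR}^{-d}
\;\Longrightarrow\;
\log_{10}\mathrm{PEP} = -d\,\log_{10}\mathrm{SNR} + \text{const},
\qquad
\frac{\mathrm{PEP}(10\,\mathrm{SNR})}{\mathrm{PEP}(\mathrm{SNR})} = 10^{-d}
\;\overset{d = P = 8}{=}\; 10^{-8}
```

Each additional 10 dB of SNR therefore lowers the pairwise error probability by $d$ decades, compared with a single decade for a diversity-1 link.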