Ferkans — Interactive Telecom Tutor

Why Integrate Detection With Decoding

The detectors of Sections 2-4 output hard or soft estimates of data symbols. In practice, a channel code (LDPC in 5G NR, turbo in LTE) sits on top of these symbols. Iterative detection and decoding (IDD) loops information between the detector and the decoder, so that each pass refines the other. For OTFS, this structure comes within a fraction of a dB of ML performance at LCD-like complexity.

The key point is that the detector provides per-bit LLRs (log-likelihood ratios) to the decoder, which returns extrinsic LLRs that the detector uses as priors in the next iteration. A handful of outer iterations is typically sufficient. The combined scheme is the standard operating point of 5G NR modems, and it ports directly to OTFS.

Definition:
Iterative Detection and Decoding Loop

The IDD loop for OTFS:

Detection pass: detector produces soft LLRs $\lambda_i$ for each bit of each symbol, given received $\mathbf{y}_{DD}$ and the channel estimate.
Decoding pass: channel decoder (LDPC/Turbo) processes the LLRs and produces extrinsic LLRs $\lambda_i^{\text{ext}}$ (information from the decoder about each bit, excluding the detector's input).
Feedback: the detector uses $\lambda_i^{\text{ext}}$ as a prior for the next iteration.
Stopping: loop until convergence (LLRs stable) or until a maximum iteration count is reached ($T = 3$ to $5$ typically).

The final output is the decoded bits after the last decoding pass.
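The idea of extrinsic information in the decoding pass can be illustrated with a toy decoder. The sketch below assumes a rate-1/3 repetition code as a stand-in for LDPC (chosen only for brevity) and the convention that a positive LLR favors bit 0; the function name `repetition_extrinsic` is hypothetical. For each coded bit, the extrinsic LLR is the a posteriori LLR minus that bit's own input.

```python
import numpy as np

def repetition_extrinsic(llr_in, reps=3):
    """Toy decoder for a repetition code (assumed stand-in for LDPC).

    llr_in holds detector LLRs with the `reps` copies of each info bit adjacent.
    Returns (extrinsic LLR per coded bit, a posteriori LLR per info bit).
    """
    llr = np.asarray(llr_in, dtype=float).reshape(-1, reps)
    posterior = llr.sum(axis=1, keepdims=True)  # all copies vote on the info bit
    extrinsic = posterior - llr                 # drop each copy's own contribution
    return extrinsic.reshape(-1), posterior.reshape(-1)

# Two info bits, three noisy copies each
llr_det = np.array([2.0, -0.5, 1.0, -3.0, -1.0, 0.5])
ext, post = repetition_extrinsic(llr_det)
print(ext)   # what the decoder feeds back to the detector
print(post)  # what the final hard decision would use
```

The second copy of the first bit arrives with the wrong sign (-0.5), but its extrinsic value is positive because the other two copies outvote it. This is the correction that the feedback step passes to the detector.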

Theorem: The Turbo Principle Applied to OTFS

Let the detector produce LLRs $\lambda_i$ and the decoder produce extrinsic LLRs $\lambda_i^{\text{ext}}$ per bit. Under mild regularity conditions (independent errors after detection, detector-decoder mutual information separation), the iterative scheme converges to a fixed point whose BER matches that of the joint ML detector-decoder within an $\varepsilon$-neighborhood, where $\varepsilon$ shrinks exponentially in the iteration count.

EXIT chart analysis predicts convergence: plot the a priori-to-extrinsic mutual-information transfer curve of each block; the iterated scheme converges iff the detector curve lies above the (axis-swapped) decoder curve in the relevant region, leaving an open tunnel between the two.

The turbo principle is the standard iterative decoding framework that emerged in the 1990s (turbo codes, iterative LDPC decoding). It applies directly to OTFS because the detector-decoder split is analogous to the turbo encoder-interleaver-decoder structure. The DD-domain detector is just one of the component decoders in a turbo-like loop.

Proof

EXIT analysis

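Let $I_A$ denote the mutual information between the decoder's extrinsic LLR (the detector's a priori input) and the true bit. Let $I_E(I_A)$ be the mutual information between the detector's extrinsic LLR and the true bit, given input $I_A$. Convergence requires the detector curve to stay above the axis-swapped decoder curve; in the simplified form used here, $I_E(I_A) > I_A$ for all $I_A < I_{\text{max}}$.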
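The convergence condition can be explored numerically by tracing the exchange of mutual information between the two blocks. The sketch below is illustrative only: the transfer curves `decoder_curve`, `detector_high_snr` and `detector_low_snr` are hypothetical shapes chosen for the example, not measured OTFS or LDPC curves.

```python
def exit_trajectory(detector, decoder, max_iters=50, tol=1e-4):
    """Return the (detector I_E, decoder I_E) pairs visited by the iteration."""
    i_prior = 0.0                     # the first detection pass has no prior
    path = []
    for _ in range(max_iters):
        i_det = detector(i_prior)     # detector: a priori -> extrinsic
        i_dec = decoder(i_det)        # decoder: a priori -> extrinsic
        path.append((i_det, i_dec))
        if i_dec - i_prior < tol:     # no further progress: converged or stuck
            break
        i_prior = i_dec
    return path

# Hypothetical transfer curves (assumptions for illustration only)
def decoder_curve(i_a):
    return min(1.0, (i_a / 0.8) ** 3)

def detector_high_snr(i_a):
    return 0.60 + 0.35 * i_a

def detector_low_snr(i_a):
    return 0.30 + 0.30 * i_a

path_high = exit_trajectory(detector_high_snr, decoder_curve)
path_low = exit_trajectory(detector_low_snr, decoder_curve)
print("high SNR ends at", path_high[-1])
print("low SNR ends at", path_low[-1])
```

With these curves, the high-SNR trajectory climbs to full decoder information within a few exchanges. The low-SNR trajectory stalls at a low fixed point, because the sagging detector curve crosses the decoder curve and closes the tunnel.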

OTFS-specific EXIT

The OTFS detector's $I_E(I_A)$ curve is determined by the channel sparsity and SNR. At high SNR, it is close to 1 for $I_A = 0$ (detector can resolve most symbols from noise). At low SNR, the curve sags, and convergence requires outer coding to lift $I_A$ above the detector's floor.

Practical convergence

In typical OTFS systems with LDPC outer coding, convergence takes 2-3 outer iterations. Further iterations yield diminishing returns.

BER achievability

At convergence, the BER is within 0.5 dB of the joint ML scheme for well-designed EXIT profiles. This is the operational OTFS receiver. $\blacksquare$


Key Takeaway

IDD closes most of the ML gap at near-linear complexity. LDPC + LCD with iterative feedback achieves BER within 0.5 dB of ML at complexity $O(T_{\text{iter}}\,MN\log(MN))$. For the 5G NR physical layer with $T_{\text{iter}} = 3$, this is $\sim 3 \times 10^5$ ops per frame, well within real-time budgets. The full OTFS receiver (CP removal, Wigner, SFFT, IDD) is a $\sim 10^6$-ops-per-frame system, deployable on standard silicon.

OTFS Iterative Detection-Decoding (IDD)

Complexity: $O(T_{\text{outer}} \cdot MN \log(MN))$

Input: Received DD grid $Y_{DD}$, channel estimate $\mathbf{H}_{DD}$, outer iterations $T_{\text{outer}} = 3$
Output: Decoded bits $\hat{\mathbf{b}}$

1. Initialize prior LLRs $\lambda_i^{\text{ext}} = 0$ for all bits.
2. for outer iteration $t = 1, \ldots, T_{\text{outer}}$ do
3.     Detection: compute soft LLRs $\lambda_i$ for each bit, given $Y_{DD}$, $\mathbf{H}_{DD}$, and prior $\lambda_i^{\text{ext}}$. (LCD with soft priors, or MP with prior integration)
4.     Decoding: LDPC decoder processes all $\lambda_i$ and produces new extrinsic LLRs $\lambda_i^{\text{ext}}$.
5. end for
6. Hard-quantize final extrinsic LLRs: $\hat{b}_i = \mathrm{sign}(\lambda_i^{\text{ext}})$.
7. Return decoded bits $\hat{\mathbf{b}}$.

The inner LDPC decoder itself runs for several iterations (5-10) per outer pass. Total inner-loop complexity: $\sim 10^5$ ops per pass for standard 5G NR code rates. Over $T_{\text{outer}} = 3$ passes, the decoder total is a few times $10^5$ ops, which fits within the $\sim 10^6$-ops-per-frame receiver budget.
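The loop structure can be sketched end to end on a toy problem. Everything below is an illustrative stand-in rather than the LCD/LDPC receiver itself:

- A 1-D circular convolution replaces the 2-D DD channel.
- A soft interference-cancelling matched filter replaces LCD or MP.
- A rate-1/3 repetition code replaces LDPC.
- Symbols are BPSK, and a positive LLR favors bit 0.

All function names are hypothetical. The final decision here is taken from the decoder's a posteriori LLRs, a common choice in turbo-style receivers.

```python
import numpy as np

def circulant(h, n):
    """n x n circulant matrix: 1-D stand-in for the circular DD channel."""
    col = np.zeros(n)
    col[:len(h)] = h
    return np.column_stack([np.roll(col, k) for k in range(n)])

def transmit(bits, h, perm, reps=3):
    """Repetition-encode, interleave, BPSK-map, apply the (noise-free) channel."""
    coded = np.repeat(bits, reps)
    x = 1.0 - 2.0 * coded[perm]
    return circulant(h, x.size) @ x

def detect(y, H, sigma2, prior_llr):
    """Soft interference cancellation + matched filter; returns extrinsic LLRs."""
    G = H.T @ H                        # channel Gram matrix
    E = G[0, 0]                        # per-symbol energy (equal for all columns)
    x_bar = np.tanh(prior_llr / 2.0)   # soft symbols implied by the priors
    var_x = 1.0 - x_bar ** 2           # remaining uncertainty per symbol
    z = H.T @ (y - H @ x_bar) + E * x_bar               # cancel others, keep own symbol
    v = E * sigma2 + (G ** 2) @ var_x - E ** 2 * var_x  # noise + residual interference
    return 2.0 * E * z / v

def decode_repetition(llr, reps):
    """Toy decoder: (extrinsic per coded bit, a posteriori per info bit)."""
    llr = llr.reshape(-1, reps)
    post = llr.sum(axis=1, keepdims=True)
    return (post - llr).reshape(-1), post.reshape(-1)

def idd_receive(y, h, sigma2, perm, reps=3, n_outer=3):
    H = circulant(h, y.size)
    prior = np.zeros(y.size)                    # step 1: zero priors
    for _ in range(n_outer):                    # step 2: outer loop
        llr_det = detect(y, H, sigma2, prior)   # step 3: detection
        llr_coded = np.empty_like(llr_det)
        llr_coded[perm] = llr_det               # de-interleave
        ext, post = decode_repetition(llr_coded, reps)  # step 4: decoding
        prior = ext[perm]                       # feedback, re-interleaved
    return (post < 0).astype(int)               # steps 6-7: hard decision

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=64)
perm = rng.permutation(bits.size * 3)
h = np.array([1.0, 0.3, 0.1])                   # assumed toy channel taps
sigma2 = 0.05
y = transmit(bits, h, perm) + rng.normal(0.0, np.sqrt(sigma2), size=perm.size)
bits_hat = idd_receive(y, h, sigma2, perm)
print("bit errors:", int(np.sum(bits_hat != bits)))
```

The detector adds each symbol's own soft estimate back after cancellation, so its output does not reuse that symbol's own prior. This keeps the exchanged LLRs extrinsic in both directions.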

IDD vs Non-Iterative: Coded BER Comparison

Compare BER curves for (a) non-iterative: LCD + LDPC, and (b) iterative: LCD + LDPC with 3 outer iterations. At low SNR, IDD shows $\sim 2$ – $3$ dB gain; at high SNR, the gap closes as both reach the error floor. This illustrates the turbo-principle gain for OTFS.

Plot parameters: $P = 4$; outer code rate; outer iterations: 3

Definition:
Cross-Domain Detection

Cross-domain detection leverages complementary information from the TF and DD domains simultaneously. The receiver computes detection results in both domains and combines them:

TF-domain detection: apply OFDM-like per-subcarrier MMSE on the TF grid. Accurate when $|H(f, t)|$ is large; poor when it is small.
DD-domain detection: apply LCD or MP on the DD grid. Accurate when the channel sparsity is effectively exploited.
Fusion: combine TF and DD LLRs as a weighted sum, with weights proportional to per-cell reliability.

The cross-domain fusion is a generalization of the iterative scheme: it lets each domain compensate for the other's weaknesses (TF handles well-conditioned cells; DD handles multipath-averaged cells).
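The fusion step can be sketched as a reliability-weighted sum. The example assumes that both detectors deliver LLRs for the same bits and that a positive reliability score is available per bit from each domain (for instance a post-equalization SINR estimate). The function name `fuse_llrs` and the normalization of the weights to sum to one are assumptions made for this illustration.

```python
import numpy as np

def fuse_llrs(llr_tf, llr_dd, rel_tf, rel_dd):
    """Combine TF- and DD-domain LLRs with weights proportional to reliability."""
    llr_tf = np.asarray(llr_tf, dtype=float)
    llr_dd = np.asarray(llr_dd, dtype=float)
    rel_tf = np.asarray(rel_tf, dtype=float)
    rel_dd = np.asarray(rel_dd, dtype=float)
    w_tf = rel_tf / (rel_tf + rel_dd)   # weight of the TF estimate, in (0, 1)
    return w_tf * llr_tf + (1.0 - w_tf) * llr_dd

# Bit 0: TF cell is strong, so the TF LLR dominates.
# Bit 1: TF cell is in a fade, so the DD LLR dominates.
fused = fuse_llrs(llr_tf=[4.0, -0.2], llr_dd=[1.0, -3.0],
                  rel_tf=[3.0, 1.0], rel_dd=[1.0, 3.0])
print(fused)
```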

When Cross-Domain Matters

Pure TF and pure DD detection have complementary failure modes:

TF fails at deep fades where $|H(f, t)|$ is small.
DD fails at low SNR where the MP-OTFS algorithm has trouble distinguishing paths from noise.

Cross-domain detection wins when:

The channel has both multipath (favoring DD) and deep fades (hurting both — cross-domain LLR averaging helps).
The SNR is variable across cells (TF diversity + DD sparsity both needed).

In practice, pure LCD or MP with IDD outperforms cross-domain fusion for most deployment scenarios. Cross-domain is a research topic for rich-scattering environments (e.g., indoor mmWave) where the DD sparsity is less pronounced.

🔧 Engineering Note

Complete OTFS Receiver Compute Budget

A complete OTFS receiver (5G NR-aligned, $MN = 4096$ , $P = 8$ ):

CP removal + Wigner (OFDM demod): $O(MN\log M) \sim 10^5$ ops.
SFFT: $O(MN\log(MN)) \sim 10^5$ ops.
Channel estimation (Chapter 7): $O(P\,MN) \sim 10^4$ ops.
LCD detection (3 iter): $O(T_{\text{iter}}\,MN\log(MN)) \sim 3 \times 10^5$ ops.
IDD feedback to LDPC (3 iter): $\sim 10^5$ ops.
Total: $\sim 10^6$ ops per frame.

At 100 frames/sec, this is $\sim 10^8$ ops/sec, easily handled by OFDM-class modem silicon. The implementation is primarily 2D FFTs and element-wise operations; no exotic hardware is required.
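The budget arithmetic can be checked by summing the order-of-magnitude figures listed above. The values are copied from the list and are rough estimates, so the result is only meaningful to within an order of magnitude; the variable names are hypothetical.

```python
import math

# Order-of-magnitude op counts per frame, taken from the budget above
budget_ops = {
    "CP removal + Wigner": 1e5,
    "SFFT": 1e5,
    "Channel estimation": 1e4,
    "LCD detection (3 iter)": 3e5,
    "IDD feedback to LDPC (3 iter)": 1e5,
}
frames_per_second = 100

total_per_frame = sum(budget_ops.values())
total_per_second = total_per_frame * frames_per_second

print(f"per frame:  {total_per_frame:.1e} ops "
      f"(order 10^{round(math.log10(total_per_frame))})")
print(f"per second: {total_per_second:.1e} ops "
      f"(order 10^{round(math.log10(total_per_second))})")
```

Detection is the largest single term, which is why the iteration count $T_{\text{iter}}$ is the main lever on the overall budget.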

This supports the deployment argument: OTFS can, in principle, run on existing 5G modem silicon with firmware changes only.

Practical Constraints

• Per-frame: $\sim 10^6$ ops
• Per-second (100 frames/s): $\sim 10^8$ ops, modest
• Implementable in software on existing OFDM modems

📋 Ref: O-RAN compute budget for eMBB

🎓 CommIT Contribution (2023)

Distributed Message Passing for Cell-Free OTFS

M. Mohammadi, H. Q. Ngo, M. Matthaiou, G. Caire — IEEE Trans. Wireless Communications

The MP-OTFS detector of Raviteja-Viterbo (2018) was originally a single-link algorithm. The CommIT extension — developed by Mohammadi, Ngo, Matthaiou, and Caire for cell-free massive MIMO — is distributed MP-OTFS: the message-passing algorithm runs partly at the distributed access points (near the observations) and partly at the central processing unit (for global consensus across multiple APs and UEs).

The key insight is that the DD factor graph generalizes naturally to multi-AP scenarios: each AP contributes its own set of factor nodes representing its local DD observations, and the CPU aggregates beliefs via a global message-passing step. Per-AP computation is $O(P \cdot MN)$ (same as single-link); CPU aggregation is $O(L \cdot MN)$ for $L$ APs. Total: $O(L \cdot P \cdot MN)$ — linear in the system scale.
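The linear scaling can be checked with a small operation count. The sketch assumes unit constants in the $O(\cdot)$ expressions and uses a hypothetical deployment of $L = 16$ APs with $P = 8$ and $MN = 4096$; the function name is illustrative.

```python
def distributed_mp_ops(num_aps, num_paths, grid_size):
    """Rough op count for distributed MP-OTFS, assuming unit constants."""
    per_ap = num_paths * grid_size         # O(P * MN) at each access point
    cpu = num_aps * grid_size              # O(L * MN) aggregation at the CPU
    return num_aps * per_ap + cpu          # O(L * P * MN) overall

ops_16 = distributed_mp_ops(num_aps=16, num_paths=8, grid_size=4096)
ops_32 = distributed_mp_ops(num_aps=32, num_paths=8, grid_size=4096)
print(ops_16, ops_32, ops_32 / ops_16)
```

Doubling the number of APs doubles the total count, which is the linear behavior stated above.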

This distributed MP framework is the receiver-side half of the cell-free OTFS contribution; the pilot-design half (superimposed pilots) is in Chapter 7. Together they deliver the 25-35% throughput gain over OFDM cell-free at vehicular mobility. Full treatment is in Chapter 17.


Why This Matters: Diversity Analysis Makes This Rigorous

The claim that MP and LCD achieve "full diversity $P$" is made rigorous in Chapter 9. There, Surabhi-Chockalingam show that the pairwise error probability of any ML-like detector in OTFS decays as $\mathrm{SNR}^{-d}$, where $d$ equals the product of resolvable delay and Doppler bins occupied by the channel. For an integer-Doppler channel with $P$ distinct $(\ell_i, k_i)$ pairs, $d = P$, matching the MP / LCD performance claims. The diversity-order result is the theoretical justification of the detectors in this chapter.
