Ferkans — Interactive Telecom Tutor

Exploiting the Sparse Factor Graph

The DD-domain input-output relation is sparse: each received cell $y_{DD}[\ell, k]$ couples to at most $P$ transmit cells (one per path). This sparsity induces a sparse bipartite factor graph, and message-passing (MP) detection is the natural algorithm for such structures.

The point is that MP exchanges only compact summaries between graph nodes: each message is a Gaussian approximation, described by a mean and a variance. Under the sparse factor-graph topology, this Gaussian message passing converges quickly and achieves near-ML performance. The algorithmic structure is exactly that of belief propagation (BP) from the FSI book, applied to the DD factor graph.

Definition:
The DD-Domain Factor Graph

The DD factor graph has two sets of nodes:

Variable nodes: one per transmitted DD cell, $x[\ell, k]$ for $\ell = 0, \ldots, M-1$ and $k = 0, \ldots, N-1$ . Total: $MN$ .
Factor nodes: one per received DD cell, representing the likelihood constraint $y[\ell, k] = \sum_{i=1}^P h_i\,e^{j\alpha_i}\,x[(\ell - \ell_i)\bmod M, (k - k_i)\bmod N] + w$ . Total: $MN$ .

When the $P$ paths have distinct integer shifts $(\ell_i, k_i)$, each factor node has exactly $P$ incident variable nodes (the $P$ shifted cells that contribute, one per path). The graph is bipartite and sparse, with degree exactly $P$ on both sides.
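The adjacency described above can be built directly from the path shifts. The following is a minimal sketch, assuming integer shifts and a flattened cell index `l * N + k`; the function name `dd_factor_graph` and the example taps are illustrative choices, not part of the original text.

```python
def dd_factor_graph(M, N, taps):
    """Return (factor_nbrs, var_nbrs) for an M x N delay-Doppler grid.

    taps is a list of integer (delay, Doppler) shifts (l_i, k_i), one per path.
    Cells are flattened as index = l * N + k (an assumed convention).
    """
    factor_nbrs = [[] for _ in range(M * N)]
    var_nbrs = [[] for _ in range(M * N)]
    for l in range(M):
        for k in range(N):
            f = l * N + k
            for li, ki in taps:
                # Transmit cell that reaches received cell (l, k) through this path.
                v = ((l - li) % M) * N + ((k - ki) % N)
                factor_nbrs[f].append(v)
                var_nbrs[v].append(f)
    return factor_nbrs, var_nbrs


# Example: 4 x 4 frame with three distinct (hypothetical) path shifts.
factor_nbrs, var_nbrs = dd_factor_graph(4, 4, [(0, 0), (1, 1), (2, 3)])
print(len(factor_nbrs[0]), len(var_nbrs[0]))  # degree on each side
```

With distinct shifts, every list has length $P$, and the edge count is $P \cdot MN$.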

Theorem: Convergence of Gaussian Message Passing

For a DD channel with $P$ paths and integer-Doppler alignment, the Gaussian sum-product algorithm on the DD factor graph converges within $T_{\text{iter}} = O(\log(MN))$ iterations to a fixed point whose per-cell SNR matches the ML asymptote. In the high-SNR regime, the MP detector achieves BER that scales as $\mathrm{SNR}^{-P}$ — full diversity $P$ .

The factor graph is sparse with degree $P$ . Gaussian messages (just mean and variance) propagate through the graph; at each iteration, each variable node aggregates messages from its $P$ neighbors and updates. Because the graph has no short cycles at typical scales, BP converges. The sparsity guarantees the per-iteration complexity is $O(P\,MN)$ , and the total is $O(P\,MN\,T_{\text{iter}})$ — roughly $10^5$ - $10^6$ ops for typical frame sizes.

Proof

Message updates

At iteration $t$ :
Variable-to-factor: $\mu_{x \to y}^{(t)}$ encodes the posterior marginal belief about $x$ from all adjacent factors except $y$ .
Factor-to-variable: $\mu_{y \to x}^{(t)}$ encodes the likelihood contribution from factor $y$ to variable $x$ given the messages from $y$ 's other neighbors.

Gaussian approximation

Messages are approximated by $\mathcal{CN}(\text{mean}, \text{variance})$ . This gives a tractable update: each message is two parameters.
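One common way to obtain the two parameters is moment matching: replace a discrete belief over the alphabet by the complex Gaussian with the same mean and variance. The sketch below assumes a unit-energy QPSK alphabet; `moment_match` is an illustrative helper name.

```python
import numpy as np


def moment_match(alphabet, pmf):
    """Mean and variance of a discrete belief over a complex alphabet."""
    alphabet = np.asarray(alphabet, dtype=complex)
    pmf = np.asarray(pmf, dtype=float)
    mean = np.sum(pmf * alphabet)
    var = np.sum(pmf * np.abs(alphabet) ** 2) - np.abs(mean) ** 2
    return mean, float(var)


qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

# Uniform belief: no information, so zero mean and full symbol energy as variance.
mean_u, var_u = moment_match(qpsk, [0.25, 0.25, 0.25, 0.25])

# Certain belief: mean equals the symbol and the variance collapses to zero.
mean_c, var_c = moment_match(qpsk, [1.0, 0.0, 0.0, 0.0])
```

As beliefs sharpen over the iterations, the variance shrinks and the interference terms built from these messages become more reliable.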

Convergence

Under the sparse-graph structure with no short cycles, each iteration improves the mean estimates. Convergence in $O(\log(MN))$ iterations is guaranteed by standard BP theory on tree-like graphs (the DD factor graph is tree-like at realistic scales).

Diversity

At convergence, each variable node's posterior aggregates evidence from all $P$ paths. The effective SNR is $\sum_i |h_i|^2 \mathrm{SNR}_i$ , which for i.i.d. Rayleigh path gains with equal per-path SNR follows a chi-squared distribution with $2P$ degrees of freedom, giving diversity $P$ . $\blacksquare$
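The step from the effective SNR to the $\mathrm{SNR}^{-P}$ slope can be made explicit with a standard Chernoff-bound argument. The derivation below is a sketch under assumptions not stated in the text: independent path gains $h_i \sim \mathcal{CN}(0, 1)$ and a pairwise error probability of the form $Q(\sqrt{2\gamma})$.

```latex
\begin{align*}
\gamma &= \sum_{i=1}^{P} |h_i|^2\,\mathrm{SNR}_i,
  \qquad h_i \sim \mathcal{CN}(0,1)\ \text{independent (assumed)} \\
P_e(\gamma) &= Q\!\left(\sqrt{2\gamma}\right) \le \tfrac{1}{2}\,e^{-\gamma} \\
\mathbb{E}\!\left[e^{-|h_i|^2\,\mathrm{SNR}_i}\right] &= \frac{1}{1+\mathrm{SNR}_i} \\
\mathbb{E}\left[P_e\right] &\le \frac{1}{2}\prod_{i=1}^{P}\frac{1}{1+\mathrm{SNR}_i}
  \;\approx\; \frac{1}{2}\prod_{i=1}^{P}\mathrm{SNR}_i^{-1}
  \qquad (\mathrm{SNR}_i \gg 1)
\end{align*}
```

If every $\mathrm{SNR}_i$ is proportional to a common $\mathrm{SNR}$, the product decays as $\mathrm{SNR}^{-P}$, which is the diversity order claimed above.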

MP-OTFS Detector

Complexity: $O(P \cdot MN \cdot T_{\text{iter}})$

Input: $\mathbf{y}_{DD}$ , channel $\{(h_i, \ell_i, k_i)\}$ , noise variance $\sigma^2$ , QAM alphabet $\mathcal{X}$ , iterations $T_{\text{iter}}$

Output: Detected symbols $\hat{X}_{DD}$

1. Initialize: For each variable $(\ell, k)$ , set prior $p_{\ell, k}^{(0)}(x) = 1/|\mathcal{X}|$ (uniform over $\mathcal{X}$ ).
2. for iteration $t = 1, \ldots, T_{\text{iter}}$ do
3.     for each factor node $y[\ell, k]$ do
4.         Gather messages from the $P$ incident variables.
5.         Compute the factor-to-variable message for each incident variable $v_i$ : $\mu_{y \to v_i}(x) \propto \int p(y|\{v_j\})\,\prod_{j \neq i} \mu_{v_j \to y}^{(t-1)}(x_j)\,dx_j$
6.     for each variable node $x[\ell, k]$ do
7.         Update posterior: $p_{\ell, k}^{(t)}(x) \propto \prod_{y \in \mathcal{N}(x)} \mu_{y \to x}^{(t)}(x)$ .
8. end for
9. Compute final estimates: $\hat{x}[\ell, k] = \arg\max_{x \in \mathcal{X}} p_{\ell, k}^{(T_{\text{iter}})}(x)$ .
10. Return $\hat{X}_{DD}$ .

Gaussian simplification: messages are $(\mu, \sigma^2)$ pairs. The integral in step 5 reduces to a closed-form Gaussian update. Typical $T_{\text{iter}} = 5$ – $10$ ; no significant gain beyond. Overall complexity: $O(P\,MN\,T_{\text{iter}})$ , which for $P = 10, MN = 10^4, T_{\text{iter}} = 10$ is $10^6$ ops — feasible.
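The steps of the detector can be sketched in a few lines of NumPy. This is one possible realization, not a reference implementation: it assumes integer shifts, one complex gain per path that is constant over the frame, discrete variable-to-factor beliefs that are moment-matched to a mean and variance inside each factor, and damping applied to the beliefs. The name `mp_otfs_detect` and all test values are hypothetical.

```python
import numpy as np


def mp_otfs_detect(y, M, N, taps, gains, sigma2, alphabet, n_iter=10, gamma=0.6):
    """Gaussian-approximated MP detection on an M x N DD grid (sketch).

    y      : received DD samples, flattened with index l * N + k
    taps   : integer (delay, Doppler) shifts (l_i, k_i), one per path
    gains  : complex gain per path (assumed constant over the frame)
    gamma  : damping weight placed on the previous message
    """
    alphabet = np.asarray(alphabet, dtype=complex)
    g = np.asarray(gains, dtype=complex)
    l, k = np.divmod(np.arange(M * N), N)
    # nbr[f, i] = variable that reaches factor f through path i.
    nbr = np.stack([((l - li) % M) * N + (k - ki) % N for li, ki in taps], axis=1)
    F, P, Q = M * N, len(taps), len(alphabet)
    pmf = np.full((F, P, Q), 1.0 / Q)  # variable-to-factor beliefs, one per edge
    for _ in range(n_iter):
        # Moment-match every incoming belief to a mean and a variance.
        mean = pmf @ alphabet
        var = np.maximum(pmf @ np.abs(alphabet) ** 2 - np.abs(mean) ** 2, 0.0)
        # Factor update: the interference seen on edge i excludes edge i itself.
        m_all, v_all = g * mean, np.abs(g) ** 2 * var
        mu = m_all.sum(axis=1, keepdims=True) - m_all
        s2 = v_all.sum(axis=1, keepdims=True) - v_all + sigma2
        resid = y[:, None, None] - mu[:, :, None] - g[None, :, None] * alphabet
        ll = -np.abs(resid) ** 2 / s2[:, :, None]  # log-likelihood per edge, symbol
        # Variable update: sum the evidence per variable, then drop the own edge.
        total = np.zeros((F, Q))
        np.add.at(total, nbr, ll)
        ext = total[nbr] - ll
        new = np.exp(ext - ext.max(axis=-1, keepdims=True))
        new /= new.sum(axis=-1, keepdims=True)
        pmf = gamma * pmf + (1.0 - gamma) * new
    # Final decision uses the evidence from all incident factors.
    return alphabet[np.argmax(total, axis=1)]


# Small synthetic check with hypothetical channel values.
rng = np.random.default_rng(0)
M, N = 8, 8
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
taps, gains, sigma2 = [(0, 0), (1, 1), (3, 2)], [1.0, 0.3j, -0.2], 0.01
x = qpsk[rng.integers(0, 4, M * N)]
l, k = np.divmod(np.arange(M * N), N)
y = sum(gi * x[((l - li) % M) * N + (k - ki) % N] for gi, (li, ki) in zip(gains, taps))
y = y + np.sqrt(sigma2 / 2) * (rng.standard_normal(M * N) + 1j * rng.standard_normal(M * N))
x_hat = mp_otfs_detect(y, M, N, taps, gains, sigma2, qpsk)
symbol_error_rate = np.mean(x_hat != x)
```

Each iteration touches every edge a constant number of times per alphabet symbol, which is where the $O(P\,MN\,T_{\text{iter}})$ scaling comes from. Working with log-likelihoods and subtracting the maximum before exponentiating keeps the normalization numerically safe.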

Key Takeaway

MP detection achieves full DD diversity. Unlike MMSE (diversity 1), message-passing on the sparse DD factor graph recovers the full $P$ -fold path diversity promised by Chapter 9. The BER slope is $\mathrm{SNR}^{-P}$ , matching ML. This is the main practical reason OTFS outperforms OFDM in uncoded or lightly-coded operation: MP exploits paths that OFDM's per-subcarrier detector cannot even see.

MP vs MMSE BER Comparison

Plot uncoded BER vs SNR for MMSE (diversity 1) and MP (diversity $P$ ) detection in the same OTFS frame. The MP curve is steeper — at low BER, MP can be 5-10 dB ahead of MMSE. Vary $P$ to see the diversity-order effect. This illustrates the qualitative difference between linear and nonlinear detection in OTFS.

Parameters: $P = 4$ ; Min SNR 0 dB; Max SNR 30 dB; MP iterations $T_{\text{iter}} = 8$

Example: MP on a 2×2 Frame with 2 Paths

An OTFS frame with $M = N = 2$ and a QPSK alphabet ( $|\mathcal{X}| = 4$ ) passes through a channel with 2 paths. Sketch the factor graph and count the number of messages exchanged in one iteration.

Solution

Graph structure

$MN = 4$ variable nodes, $MN = 4$ factor nodes. Each factor has $P = 2$ incident variables. Each variable has $P = 2$ incident factors (by symmetry). Total edges: $2 \cdot MN = 8$ .

Per-iteration messages

Factor-to-variable: one message per edge = 8 messages. Variable-to-factor: one message per edge = 8 messages. Total per iteration: 16 messages.

Complexity

Each message is a Gaussian (2 parameters) or a discrete distribution over $|\mathcal{X}| = 4$ (3 parameters after normalization). Per message: $O(P)$ arithmetic. Per iteration: $O(P \cdot MN \cdot |\mathcal{X}|) = O(32)$ ops. For $T_{\text{iter}} = 5$ : 160 ops total. Fast.
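The counts in this example follow from three products, which a short helper makes easy to repeat for other frame sizes. The function name `mp_message_counts` is illustrative, and the operation count uses the same order-of-magnitude convention as the text (constants dropped).

```python
def mp_message_counts(M, N, P, alphabet_size, n_iter):
    """Edges, messages per iteration, ops per iteration, and total ops."""
    edges = P * M * N                      # one edge per (factor, path) pair
    messages_per_iter = 2 * edges          # one message in each direction per edge
    ops_per_iter = P * M * N * alphabet_size
    return edges, messages_per_iter, ops_per_iter, ops_per_iter * n_iter


print(mp_message_counts(2, 2, 2, 4, 5))  # (8, 16, 32, 160)
```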

Message Passing on a 4×4 DD Factor Graph

Animation of the MP-OTFS detector on a small 4×4 OTFS frame with $P = 3$ paths. Variable nodes (DD cells) and factor nodes (input-output constraints) are displayed, with messages flowing between them. Over 5 iterations, the posterior beliefs concentrate on the true transmitted symbols (color intensifies on correct cells). This illustrates how MP converges to the ML solution.

MP Variants: Sum-Product, Max-Product, and AMP

Several flavors of message-passing are used in the OTFS literature:

Sum-product (exact BP): messages are full posteriors; gives symbol-wise MAP decisions. High memory, $O(|\mathcal{X}|)$ per message.
Gaussian BP (MP-OTFS): approximates messages by a Gaussian $(\mu, \sigma^2)$ . Most efficient, close to ML at high SNR. This is the version used in the MP-OTFS Detector algorithm.
AMP (approximate message passing): for Gaussian inputs, uses matched-filter + soft-threshold. Asymptotically optimal for i.i.d. matrices; less studied for structured matrices like OTFS.
Max-product (min-sum): max rather than sum over each factor. Gives the most-likely sequence. Less accurate than sum-product but easier in noisy conditions.

For OTFS practice, Gaussian MP is the default. AMP has been explored (Chu, Yuan 2022) and shows ~1 dB gain over Gaussian MP in certain regimes, but at the cost of more sophisticated damping schemes to ensure stability.

⚠️Engineering Note

Practical MP Implementation

Key implementation decisions:

Damping factor $\gamma \in [0.5, 0.8]$ : soft-weight the message update as $\mu^{(t)} \leftarrow \gamma \mu^{(t-1)} + (1 - \gamma)\,\tilde{\mu}$ . Prevents oscillation in difficult channels. $\gamma = 0.6$ is a typical default.
Early termination: check convergence by computing $\max_x |p_{\ell, k}^{(t)}(x) - p_{\ell, k}^{(t-1)}(x)|$ . Stop when below a threshold (e.g., $10^{-3}$ ). Saves iterations for easy channels.
Parallelization: variable and factor updates can be fully parallelized within an iteration. GPU implementations achieve $10^7$ ops/μs on $MN = 10^4$ .
Numerical stability: work in log-domain to prevent underflow of message magnitudes at low SNR.
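Three of these decisions fit in a few lines each. The helpers below are a minimal sketch with illustrative names; the damping convention follows the update rule above (weight $\gamma$ on the previous message), and the tolerance default mirrors the $10^{-3}$ example.

```python
import numpy as np


def log_normalize(log_p):
    """Normalize log-probabilities along the last axis without underflow."""
    shifted = log_p - log_p.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def damped_update(old, new, gamma=0.6):
    """Blend the previous message with the freshly computed one."""
    return gamma * old + (1.0 - gamma) * new


def has_converged(p_new, p_old, tol=1e-3):
    """Early-termination check on the largest change in any posterior entry."""
    return float(np.max(np.abs(p_new - p_old))) < tol


# Log-likelihoods this negative would underflow if exponentiated directly.
posterior = np.exp(log_normalize(np.array([-1000.0, -1001.0])))
```

Subtracting the maximum before exponentiating is what keeps the normalization finite when the raw log-likelihoods are far below the floating-point range.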

With these optimizations, MP-OTFS readily runs in real time at 5G NR frame rates ( $\sim 100$ frames/s). For LEO satellite at larger $N$ ( $\sim 10^3$ ), MP remains feasible with modern GPU hardware.

Practical Constraints

• Damping factor 0.6–0.8 for stability
• Early termination saves $\sim 30\%$ iterations
• GPU parallelization: 2-3 orders of magnitude over sequential

Common Mistake: Short Cycles Break MP at Extreme Parameters

Mistake:

Deploying MP-OTFS at small frame sizes ( $MN < 100$ ) expecting it to converge. At small $MN$ , the factor graph has short cycles (the same data symbol appears as a neighbor of multiple factors), and BP can oscillate or converge to wrong fixed points.

Correction:

For small frames, use sphere decoding or exhaustive ML (the search space is small anyway). MP is designed for $MN \gtrsim 10^3$ , where the graph is locally tree-like. At typical 5G NR-aligned OTFS sizes ( $MN \geq 10^3$ ), cycle effects are negligible and BP converges cleanly.
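A quick way to see the effect is to count pairs of factor nodes that share two or more variables, since each such pair closes a cycle of length 4. The sketch below uses hypothetical tap sets and an illustrative function name; it checks only these shortest cycles, not longer ones.

```python
from itertools import combinations


def count_shared_pairs(M, N, taps):
    """Number of factor-node pairs that share at least two variable nodes."""
    nbrs = []
    for l in range(M):
        for k in range(N):
            nbrs.append({((l - li) % M, (k - ki) % N) for li, ki in taps})
    return sum(1 for a, b in combinations(nbrs, 2) if len(a & b) >= 2)


small = count_shared_pairs(2, 2, [(0, 0), (1, 1)])             # wrap-around collides
large = count_shared_pairs(16, 16, [(0, 0), (1, 1), (3, 2)])   # differences stay distinct
print(small, large)  # 2 0
```

In this construction two factors share a pair of variables exactly when two different tap pairs have the same shift difference modulo $(M, N)$. Wrap-around on a tiny grid makes such coincidences common, while on a larger grid they depend on the tap geometry itself, so it is worth running the check on the actual channel taps.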

Message-Passing Detection