Deep Learning for OTFS Receivers
Why ML for OTFS?
Classical OTFS receivers are built from principled algorithms: maximum-likelihood (ML) detection, message passing, MMSE — each with well-characterized assumptions and performance bounds. They work, but they are not always optimal in practice. Real channels have imperfections: phase noise, hardware nonlinearities, interference, fractional Doppler offsets. Classical algorithms handle these imperfectly. Deep learning offers a complementary approach: learn the detector from data, absorbing the real-world non-idealities that analytical models ignore. This section surveys ML-based OTFS receivers and quantifies when they win.
Definition: Machine Learning OTFS Receiver
A machine learning OTFS receiver replaces one or more of the receiver blocks with a learnable neural network:
- NN channel estimator: takes received DD samples plus pilot symbols as input and outputs a channel estimate.
- NN detector: takes received DD samples plus the channel estimate and outputs soft symbol decisions.
- Joint NN (end-to-end): single NN from received samples to hard decisions. No explicit intermediate steps.
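The three configurations above can be sketched as function interfaces. A minimal sketch in which simple placeholder computations (one-tap estimates and sign decisions) stand in for trained networks; all names and shapes are hypothetical:

```python
import numpy as np

# Hypothetical interfaces for the three receiver configurations.
# Bodies are illustrative placeholders, not trained networks.

def nn_channel_estimator(y_dd: np.ndarray, pilots: np.ndarray) -> np.ndarray:
    """Received DD samples + pilot symbols -> channel estimate."""
    return y_dd * np.conj(pilots) / (np.abs(pilots) ** 2 + 1e-9)

def nn_detector(y_dd: np.ndarray, h_est: np.ndarray) -> np.ndarray:
    """Received DD samples + channel estimate -> soft symbol decisions."""
    return y_dd * np.conj(h_est) / (np.abs(h_est) ** 2 + 1e-9)

def joint_nn(y_dd: np.ndarray) -> np.ndarray:
    """End-to-end: received samples -> hard QPSK decisions, no explicit steps."""
    return (np.sign(y_dd.real) + 1j * np.sign(y_dd.imag)) / np.sqrt(2)
```

In a real receiver each body would be replaced by a trained network; only the input/output contracts shown here are the point.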
Architectures:
- Feedforward NN (MLP): simple, fast, limited expressivity.
- Convolutional NN (CNN): exploits spatial structure in DD grid.
- Transformer: attention over DD cells. Most expressive. Compute-heavy.
- Graph NN: factor-graph-structured, combining physics with learning.
Theorem: ML Receiver Performance Bounds
An NN-based OTFS receiver with sufficient capacity can asymptotically achieve the same BER as the optimal (maximum-likelihood) detector. The convergence rate depends on the training data size and the NN capacity; empirically, the MSE convergence rate is governed by both.
Practical performance (typical 2026 results):
- NN detector vs. MP at 15 dB SNR: near-parity, with only a fraction-of-a-dB gap once training has converged.
- NN detector vs MP under fractional Doppler: NN improves by 1-2 dB (handles imperfections classical models don't capture).
- NN detector at very high SNR: marginal gain (matches theory).
Consequence: NN receivers match or slightly beat classical detectors at typical operating points. Gain is largest where analytical models break (imperfections, finite-precision).
Neural networks are universal approximators — given enough capacity and data, they can learn any function, including the optimal detector. The question is whether this is practically useful. For idealized channels: NNs match classical but don't beat them. For real channels with non-idealities: NNs learn these patterns and beat classical. The more non-ideal the channel, the more ML wins.
Universal approximation
Cybenko 1989 / Hornik 1991: NNs with enough hidden units approximate any continuous function arbitrarily well. The optimal detector is such a function.
Training convergence
With stochastic gradient descent and adequate data, the NN parameters converge to a minimizer of the training loss: the global minimum for a convex loss, a local minimum for the non-convex losses of deep NNs.
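A toy illustration of this convergence, using SGD on a convex squared loss with a single hypothetical parameter (all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy convex problem: recover w_true from noisy samples y = x*w_true + n
# by SGD on the squared loss (x*w - y)^2.
w_true, w, lr = 2.0, 0.0, 0.05
for _ in range(2000):
    x = rng.normal()
    y = x * w_true + 0.1 * rng.normal()
    grad = 2 * x * (x * w - y)   # gradient of (x*w - y)^2 w.r.t. w
    w -= lr * grad

# For this convex loss, SGD settles near the global minimizer w_true = 2.
```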
Generalization
Test performance tracks training performance, with a generalization gap that shrinks as the training set grows (standard statistical learning bound).
Comparison to classical
Classical maximum-likelihood detector: optimal under a known channel. NN detector: learns the implicit channel and detector jointly. Under ideal assumptions: same performance. Under non-ideal conditions: the NN wins.
Key Takeaway
NN OTFS receivers match classical detectors under idealized conditions and beat them under realistic conditions. The gain comes from learning non-idealities (fractional Doppler, hardware imperfections, non-Gaussian noise). At typical 6G operating points the improvement is on the order of 1-2 dB: marginal for ideal channels, substantial for practical ones.
Definition: CNN-Based OTFS Detector
A CNN detector for OTFS treats the DD grid as a 2D image:
- Input: received DD samples, split into real/imaginary parts as 2 channels.
- Architecture: convolutional layers (extract local DD features) → attention layers (cross-DD relationships) → dense layer (per-cell detection).
- Output: per-cell soft decisions.
Why CNN? The DD channel is locally sparse — each path contributes to neighboring DD cells only. CNN's local receptive field matches this structure.
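A small numerical sketch of this locality, assuming one path whose fractional Doppler leaks energy into the adjacent Doppler bins (grid sizes and leakage values are illustrative):

```python
import numpy as np

# Toy DD grid (Doppler x delay): one path at (k0, l0) whose fractional
# Doppler leaks into the neighbouring Doppler bins.
N, M = 8, 8
grid = np.zeros((N, M))
k0, l0 = 3, 4
grid[k0, l0] = 1.0
grid[k0 - 1, l0] = grid[k0 + 1, l0] = 0.3   # fractional-Doppler leakage

# A 3x3 receptive field centred on the path captures its whole footprint:
patch = grid[k0 - 1:k0 + 2, l0 - 1:l0 + 2]
coverage = patch.sum() / grid.sum()          # fraction of path energy seen
```

The grid is almost entirely zero, and a single 3×3 filter sees all of the path's energy, which is exactly the match between CNN receptive fields and DD sparsity described above.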
Typical architecture:
- 3-5 conv layers, 32-64 filters each.
- 2-3 attention layers, 4 heads.
- 1-2 dense layers, 128 units.
- Total parameters: typically a few hundred thousand; trainable on a modest GPU.
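A back-of-the-envelope parameter count for a network in this range, under illustrative assumptions (2 input channels for real/imaginary parts, width-32 embeddings, a 128-unit feedforward; the real total depends on the exact layer sizes):

```python
# Rough parameter count for the typical architecture above.

def conv2d_params(c_in, c_out, k=3):
    return c_in * c_out * k * k + c_out           # weights + biases

def attention_params(d, ffn=128):
    qkv_out = 4 * (d * d + d)                     # Q, K, V, output projections
    ffn_p = (d * ffn + ffn) + (ffn * d + d)       # two-layer feedforward
    return qkv_out + ffn_p

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

total = (conv2d_params(2, 32) + 3 * conv2d_params(32, 32)  # 4 conv layers
         + 2 * attention_params(32)                        # 2 attention layers
         + dense_params(32, 128) + dense_params(128, 128)  # 2 dense layers
         + dense_params(128, 4))                           # QPSK soft output
# total ≈ 75k with these sizes; wider layers reach the hundreds of thousands.
```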
Theorem: CNN vs MP Performance
Under an idealized OTFS channel (integer Doppler, Gaussian noise): the CNN detector achieves BER within 0.2 dB of the MP detector.
Under realistic conditions (fractional Doppler offsets, non-Gaussian noise, phase noise): the CNN beats MP by 1-2 dB.
Under extreme conditions (fractional Doppler + phase noise + hardware distortion): CNN beats MP by 3-4 dB.
Compute: CNN training takes roughly 12 hours on one GPU (a one-time cost). Per-frame CNN inference on a UE chip runs at millisecond scale but is several times slower than MP inference.
Trade-off: the CNN is slower but more robust, suited for high-value links (URLLC, safety-critical); classical detectors suit mass-scale, low-cost IoT.
The CNN wins exactly where classical MP assumes too much. Perfect Gaussian noise, integer Doppler, linear hardware — classical is matched. Real channels with fractional Doppler, nonlinear PAs, and complex noise — the CNN adapts. The trade-off is compute: 5× slower than MP, but still real-time-feasible.
Training
Train CNN on simulated OTFS data. Include realistic channel imperfections. CNN learns these as implicit features.
Idealized test
On integer Doppler + Gaussian noise: MP is Bayes-optimal, and the CNN converges to the same performance; the slight residual gap is training noise.
Realistic test
With fractional Doppler and other imperfections, MP mis-models the channel; the CNN adapts, learning the true likelihood, for a 1-2 dB advantage.
Extreme test
With many simultaneous imperfections, MP often fails to converge, while the CNN remains robust: a 3-4 dB advantage.
Compute
CNN inference costs about 5× MP's, but latency stays at the millisecond scale, which is acceptable for URLLC.
Example: CNN Receiver for 6G URLLC
Design a CNN-based OTFS receiver for 6G URLLC (V2X safety): a target BER at 20 dB SNR, a multipath channel with fractional Doppler, and a 1 ms latency budget.
Architecture
CNN: 4 conv layers (32 filters, 3×3 kernels) → 2 transformer layers (4 heads) → dense layer (256 units). Output: soft decisions over the 4 QPSK constellation points per DD cell.
Parameters
~500k parameters. Trainable in 6-12 hours on GPU.
Training
Data: 10⁶ OTFS frames with realistic imperfections. 50 epochs with Adam, training until the MSE plateaus.
Performance
Meets the target BER at 20 dB SNR and beats MP by 2 dB under fractional Doppler.
Latency
Inference: 8 ms on mmWave UE chip. Misses 1-ms target!
Mitigation
Use a lighter architecture (2 conv layers + dense). Latency: 3 ms. BER is 1 dB worse but still meets the target. Choose the depth-latency trade-off per URLLC sub-class.
ML vs Classical OTFS Detector BER
[Interactive figure: BER vs SNR for the MP detector, the CNN detector, and other NN architectures. Sliders: fractional Doppler, mobility, noise model.]
ML Receiver Deployment in 5G/6G
ML receiver deployment status (2026):
- 5G NR: limited ML in physical layer (vendor-proprietary for MIMO detection). No ML standardization.
- 5G Advanced (Rel. 18): AI/ML framework introduced. Channel feedback compression via NN. Experimental ML detectors.
- 6G Foundation (Rel. 21): AI/ML is native in the RAN. Standardized NN architectures for channel estimation, detection, resource allocation.
- 6G Deployment (Rel. 22+): ML receivers common. Combined with OTFS: NN handles DD-domain detection + pilot optimization.
Hardware: modern UE SoCs include AI/ML accelerators (Apple Neural Engine, Qualcomm Hexagon) delivering 10-100 TOPS, making millisecond-scale OTFS-CNN inference feasible.
Privacy concerns: ML trained on UE channel data raises privacy issues (location inference). Mitigation: federated learning across UEs, so training happens on-device without centralized data collection.
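A minimal federated-averaging sketch (a single scalar "model" and synthetic per-UE data; all sizes are hypothetical), showing that only model updates, not raw channel data, leave each UE:

```python
import numpy as np

rng = np.random.default_rng(1)

def local_update(w, x, y, lr=0.05):
    """A few local SGD steps on one UE's private data (squared loss)."""
    for xi, yi in zip(x, y):
        w -= lr * 2 * xi * (xi * w - yi)
    return w

w_global, w_true = 0.0, 2.0
for _round in range(20):                  # communication rounds
    local_ws = []
    for _ue in range(4):                  # 4 UEs with private channel data
        x = rng.normal(size=8)
        y = w_true * x + 0.05 * rng.normal(size=8)
        local_ws.append(local_update(w_global, x, y))
    w_global = float(np.mean(local_ws))   # server averages models, not data
```

The server only ever sees the per-UE weights `local_ws`, never the raw samples `(x, y)`, which is the privacy property federated learning provides.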
Adversarial robustness: NN receivers can be attacked with adversarial examples. Under active jamming, however, the CNN degrades more gracefully than MP (a smoothing effect).
- 5G: vendor-proprietary ML (not standardized)
- 6G Rel. 21: native AI/ML framework
- Hardware: UE AI accelerators (10-100 TOPS)
- Privacy: federated learning for training
Common Mistake: NN Overfits to Training Channel
Mistake:
Training an NN OTFS receiver on one channel profile (e.g., 3GPP Urban Micro) and deploying in a different one (e.g., Rural Macro). The NN overfits to the training statistics, and out-of-distribution channels cause severe performance drops.
Correction:
Train the NN on diverse channel profiles covering realistic deployment scenarios, including 3GPP Urban Micro, Urban Macro, Rural Macro, Highway, LEO, and custom scenarios. Use domain randomization: randomly perturb channel parameters during training. Test on held-out channel profiles. For deployment, apply adaptive fine-tuning in the current environment. Typical practice: an 80% training / 20% adaptation budget split.
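The domain-randomization step can be sketched as a channel-parameter sampler; the profile names and ranges below are purely illustrative, not 3GPP values:

```python
import random

# Domain randomization: sample channel parameters from broad ranges spanning
# several deployment profiles, so the NN never sees only one profile's
# statistics. All ranges are illustrative.
PROFILES = {
    "urban_micro": {"paths": (4, 8), "delay_spread_ns": (50, 300),  "speed_kmh": (0, 60)},
    "highway":     {"paths": (2, 6), "delay_spread_ns": (100, 500), "speed_kmh": (60, 200)},
    "leo":         {"paths": (1, 3), "delay_spread_ns": (10, 100),  "speed_kmh": (0, 30)},
}

def sample_channel(rng=random):
    """Draw one randomized channel realization for training."""
    profile = rng.choice(list(PROFILES))
    p = PROFILES[profile]
    return {
        "profile": profile,
        "paths": rng.randint(*p["paths"]),
        "delay_spread_ns": rng.uniform(*p["delay_spread_ns"]),
        "doppler_frac": rng.uniform(-0.5, 0.5),   # offset from the bin centre
        "speed_kmh": rng.uniform(*p["speed_kmh"]),
    }
```

Each training frame is generated from a fresh `sample_channel()` draw, and held-out profiles can be excluded from `PROFILES` to measure out-of-distribution robustness.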