Exercises
ex-otfs-ch21-01
Easy: Why would an NN OTFS receiver outperform classical MP detection? List three scenarios.
Non-idealities, data-driven learning.
Fractional Doppler
Classical MP assumes integer Doppler. NN learns non-integer structure.
Hardware imperfections
Phase noise, PA nonlinearity. NN absorbs these patterns.
Non-Gaussian noise
Impulsive noise, interference. NN learns non-Gaussian likelihoods.
Idealized
On a perfect channel the NN only matches classical MP; the gains come from real-world impairments.
ex-otfs-ch21-02
Easy: What is the structural difference between a pure NN detector and an unfolded MP detector?
Structure from classical algorithm.
Pure NN
No structural assumptions: it can learn an arbitrary function of the received signal, at the cost of a large parameter count.
Unfolded MP
Inherits the MP iteration structure; only the per-iteration parameters (e.g. damping, message weights) are learnable, so the parameter count stays small.
Trade-off
Pure NN: more expressive, worse out-of-distribution (OOD). Unfolded: less expressive, better OOD, interpretable. Unfolded is preferred for safety-critical links; pure NN for stable conditions.
ex-otfs-ch21-03
Easy: Explain the simulation-to-real gap in OTFS ML deployment. What mitigations exist?
Distribution shift from sim to real.
Gap
An NN trained on simulated channels typically performs 2-3 dB worse on real channels due to hardware imperfections, mismatched path statistics, etc.
Domain randomization
Train on wide range of simulated parameters. Reduces gap to ~1 dB.
Domain adaptation
Fine-tune on small real-world sample. Recovers another 0.5 dB.
Residual gap
0.5-1 dB. Acceptable for deployment.
ex-otfs-ch21-04
Medium: A CNN-based OTFS detector has 3 conv layers (32 filters each) + 2 dense layers (256 units). Estimate the parameter count and training requirements.
Count weights per layer.
Conv layers
3 conv layers, 32 filters, 3×3 kernels. Layer 1 (2-channel I/Q input): 32 × 2 × 9 = 576 weights + 32 biases = 608. Layers 2 and 3: 32 × 32 × 9 = 9216 + 32 = 9248 each. Total conv: 608 + 2 × 9248 ≈ 19k parameters.
Dense layers
Flatten from 32 × 16 × 16 = 8192 features (example MN=256). Dense 256: 8192 × 256 + 256 = 2.1M parameters. Dense output: 256 × 4 + 4 = 1028 (4 QAM bits).
Total
~2.1M parameters. Trainable on modern GPU in ~1 hour.
Training data
For ~2M parameters, the ~10× rule of thumb gives ~2 × 10⁷ training samples. Simulated OTFS data at this scale can be generated in 1-2 hours.
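The layer-by-layer arithmetic above can be checked with a short script (layer shapes as assumed in the exercise: 3×3 kernels, I/Q input, 8192 flattened features, 4-bit output):

```python
# Parameter count for the CNN detector sketched in this exercise:
# 3 conv layers (32 filters, 3x3 kernels) + 2 dense layers (256 units, 4 outputs).

def conv_params(in_ch, out_ch, k=3):
    """Weights + biases for one 2-D conv layer with k x k kernels."""
    return out_ch * in_ch * k * k + out_ch

def dense_params(in_f, out_f):
    """Weights + biases for one fully connected layer."""
    return in_f * out_f + out_f

conv = conv_params(2, 32) + 2 * conv_params(32, 32)       # 608 + 2 * 9248
dense = dense_params(32 * 16 * 16, 256) + dense_params(256, 4)
total = conv + dense
print(conv, dense, total)   # 19104 2098436 2117540  (~2.1M)
```

The dense layer on the flattened feature map dominates, which is why the total is ~2.1M despite the compact conv stack.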
ex-otfs-ch21-05
Medium: For a learned pilot with 5 active DD cells out of 1024, what sparsity does it have and what's its regularization contribution?
Count nonzero DD cells.
Sparsity
5 active cells out of 1024. Fraction: 5/1024 ≈ 0.5%.
Regularization
The $\ell_1$ term $\lambda\|\mathbf{x}_p\|_1$ sums only 5 nonzero magnitudes, so it is small compared to the other loss terms. A larger $\lambda$ encourages even more sparsity.
Effect
Sparse pilot: less interference with data; lower PAPR. Classical comparison: 1-3% pilot overhead. Learned: 0.5% (via the sparsity encouragement in the loss).
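The sparsity fraction and penalty can be computed directly (the pilot values and $\lambda = 0.01$ are illustrative assumptions, not from the exercise):

```python
import numpy as np

# Hypothetical learned pilot: 5 active cells in a 32x32 (= 1024-cell) DD grid.
rng = np.random.default_rng(0)
pilot = np.zeros(1024)
active = rng.choice(1024, size=5, replace=False)  # 5 active DD cells
pilot[active] = rng.normal(size=5)

sparsity = np.count_nonzero(pilot) / pilot.size   # fraction of active cells
l1_penalty = 0.01 * np.abs(pilot).sum()           # lambda * ||x_p||_1, lambda assumed 0.01
print(sparsity)                                   # 5/1024 ~ 0.0049
```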
ex-otfs-ch21-06
Medium: Describe how unfolded MP-OTFS inherits the robustness of classical MP while gaining NN flexibility.
Initialization at classical, training fine-tunes.
Initialization
The unfolded NN is initialized layer by layer to match the classical MP update rules, so its output is identical to MP at the start of training.
Training
Gradient descent on the per-iteration hyperparameters. The structure is preserved; performance is fine-tuned from the classical baseline.
Robustness
Core update rule from MP provides convergence guarantees. NN layers constrained to MP-like operations.
Flexibility
Per-iteration learnable parameters: damping, weighting, activation. NN expressivity within MP structure.
Trade-off
Less expressive than pure NN. More robust. 1-2 dB better than classical; safer than pure NN.
ex-otfs-ch21-07
Medium: For federated learning with 100 UEs, each training on its locally collected channel frames, compute the effective training size and the number of rounds to convergence.
FedAvg convergence.
Effective size
Total samples: 100 × (frames per UE). Pooled across 100 UEs, this is comparable to moderate centralized training.
Rounds
FedAvg needs more communication rounds than centralized epochs for equivalent convergence; the required round count grows with data heterogeneity and shrinks with the number of local epochs per round.
Bandwidth
Per round: 10 MB × 100 UEs = 1 GB, so total traffic scales at 1 GB per round across all rounds. Spread over days to weeks, this is manageable.
Performance
After convergence: ~95% of centralized performance. Accept small gap for privacy.
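The bandwidth arithmetic is easy to sanity-check; the round count below is an assumed example value, not taken from the exercise:

```python
# Back-of-envelope FedAvg bandwidth: 100 UEs, 10 MB model upload per round.
n_ues = 100
model_mb = 10
rounds = 500                                # assumed example round count

per_round_gb = n_ues * model_mb / 1000      # 1 GB uploaded per round
total_gb = rounds * per_round_gb            # scales linearly with rounds
print(per_round_gb, total_gb)               # 1.0 500.0
```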
ex-otfs-ch21-08
Medium: An unfolded MP detector with $T$ layers, each with 100 parameters, is compared to a pure NN (CNN) with a far larger parameter count. How do their training-data requirements compare?
Parameter count determines data need.
Unfolded parameters
$T \times 100$ parameters in total. Compact.
Pure NN
Orders of magnitude more parameters — in this comparison, 125× the unfolded detector's count.
Data for convergence
Rule of thumb: 10×-100× as many training samples as parameters for good generalization, so the pure NN needs roughly 125× more data than the unfolded detector.
Practical implications
Unfolded: trainable on simulation plus a small real-world fine-tuning set. Pure NN: requires a large training dataset; federated learning helps collect it.
ex-otfs-ch21-09
Hard: Prove that an unfolded MP detector with $T$ layers matches classical MP with $T$ iterations at initialization.
Layer-wise correspondence.
Layer structure
Layer $t$ of the unfolded network applies the update $\mu^{(t+1)} = f_{\theta_t}(\mu^{(t)}, \mathbf{y})$, where $f$ has the functional form of the MP message update. At initialization $\theta_t = \theta_{\mathrm{MP}}$, identical to the classical update.
Equivalence
At initialization each layer implements the classical update rule exactly, with the learnable damping set to the classical value, so layer $t$'s output equals the output of classical iteration $t$.
After $T$ layers
Cascading $T$ such layers, the unfolded network reproduces classical MP with $T$ iterations. Performance is identical.
Training deviation
Training perturbs $\{\theta_t\}$ away from the classical values to minimize the loss on data; since the classical point is the starting point, performance can only improve on the training distribution.
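The equivalence argument can be illustrated numerically with a toy damped fixed-point iteration standing in for the MP message update (all values illustrative; the unfolded version carries one learnable damping per layer, initialized to the classical value):

```python
import numpy as np

# Surrogate for the MP message update: a damped affine fixed-point iteration
# x <- (1 - d) * x + d * f(x).
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) * 0.2          # stand-in for message dependencies
b = rng.normal(size=4)
f = lambda x: A @ x + b

def classical_mp(x, T, d=0.5):
    for _ in range(T):
        x = (1 - d) * x + d * f(x)
    return x

def unfolded_mp(x, dampings):
    # One learnable damping per layer; at initialization all equal d = 0.5.
    for d in dampings:
        x = (1 - d) * x + d * f(x)
    return x

T = 6
out_classical = classical_mp(np.zeros(4), T)
out_unfolded = unfolded_mp(np.zeros(4), [0.5] * T)   # initialization point
print(np.allclose(out_classical, out_unfolded))      # True
```

At initialization the two trajectories are identical step by step; training then moves the per-layer dampings away from 0.5.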
ex-otfs-ch21-10
Hard: Design an end-to-end training pipeline for a learned OTFS pilot + NN detector on a simulated V2X environment.
Joint optimization, differentiable channel.
Data generation
Simulator produces (channel, data) pairs. Parameters: V2X channel profile, fractional Doppler, SNR distribution.
Forward model
$\mathbf{y} = \mathbf{H}(\mathbf{x}_d + \mathbf{x}_p) + \mathbf{n}$, where $\mathbf{x}_p$ is the learned pilot and $\mathbf{H}$ the DD-domain channel. The chain is differentiable end-to-end.
NN structure
Estimator $\hat{\mathbf{h}} = f_\phi(\mathbf{y})$; detector $\hat{\mathbf{x}} = g_\psi(\mathbf{y}, \hat{\mathbf{h}})$. The pilot values and both networks' parameters $(\mathbf{x}_p, \phi, \psi)$ are trained jointly.
Loss
Cross-entropy on the detected bits plus an $\ell_1$ sparsity penalty on the pilot: $\mathcal{L} = \mathrm{CE}(\mathbf{b}, \hat{\mathbf{b}}) + \lambda\|\mathbf{x}_p\|_1$.
Training
Adam optimizer, batch size 64, 10⁴ epochs. At convergence, BER is ~3 dB better than the classical baseline at 15 dB SNR.
Deployment
Export for on-chip use. Pilot stored as 5 DD cell values; detector as NN weights.
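A minimal sketch of the forward model and loss terms above, with a zero-forcing stand-in for the NN detector and illustrative sizes (a real pipeline would backpropagate through this whole chain to update pilot and detector jointly):

```python
import numpy as np

# Forward model y = H (x_d + x_p) + n in the DD domain, small grid for the sketch.
rng = np.random.default_rng(2)
MN = 64
H = np.eye(MN) + 0.1 * rng.normal(size=(MN, MN)) / np.sqrt(MN)  # surrogate DD channel
x_d = rng.choice([-1.0, 1.0], size=MN)                          # BPSK data for simplicity
x_p = np.zeros(MN)
x_p[rng.choice(MN, 5, replace=False)] = 1.0                     # 5-cell sparse pilot
noise = 0.05 * rng.normal(size=MN)

y = H @ (x_d + x_p) + noise                 # received DD signal

# Zero-forcing "detector" stands in for the trained NN g_psi here.
x_hat = np.sign(np.linalg.solve(H, y) - x_p)
ber = np.mean(x_hat != x_d)

lam = 0.01
loss_sparsity = lam * np.abs(x_p).sum()     # the l1 pilot penalty from the loss
print(ber, loss_sparsity)
```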
ex-otfs-ch21-11
Hard: For an NN OTFS receiver deployed in a mobile vehicle, design online adaptation: update NN parameters based on real-time channel measurements.
Fast online adaptation.
Adaptation window
Every 100 frames: collect channel measurements. Compare NN prediction vs classical reference.
Adaptation trigger
If |NN BER - classical BER| > threshold: trigger adaptation.
Adaptation step
Take a small gradient step on the NN parameters using recent data. Learning rate: 10⁻⁴ (small, to avoid catastrophic forgetting).
Stability
Limit adaptation to last-layer parameters. Freeze deep layers to maintain pretrained structure.
Computational
Per adaptation: 1000 frames × forward+backward = ~1M ops. At frame rate 100 Hz: 100 ms per adaptation. Feasible.
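A sketch of the last-layer-only adaptation step described above (frozen-backbone features, classical MP outputs as the reference target; all shapes and values illustrative):

```python
import numpy as np

# Online adaptation restricted to the output layer: backbone frozen, one small
# gradient step on the last-layer weights W toward the classical reference.
rng = np.random.default_rng(3)
feat = rng.normal(size=(100, 16))       # frozen-backbone features, 100 recent frames
target = rng.normal(size=(100, 4))      # reference soft outputs (e.g. from classical MP)
W = rng.normal(size=(16, 4)) * 0.1      # adaptable last-layer weights

lr = 1e-4                               # small rate avoids catastrophic forgetting
pred = feat @ W
grad = feat.T @ (pred - target) / len(feat)   # MSE gradient w.r.t. W
W_new = W - lr * grad                   # single adaptation step

loss_before = np.mean((feat @ W - target) ** 2)
loss_after = np.mean((feat @ W_new - target) ** 2)
print(loss_after < loss_before)         # True: the step reduces mismatch
```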
ex-otfs-ch21-12
Hard: Analyze the adversarial robustness gap between pure NN and unfolded MP OTFS detectors under a 0.5 dB perturbation attack.
Attack surface, Lipschitz constant.
Pure NN
The Lipschitz constant is large (expressive network), so a 0.5 dB perturbation can shift the output significantly. BER under attack: ~10× worse than clean.
Unfolded MP
The structure inherited from classical MP bounds the Lipschitz constant, so the output perturbation is bounded. BER under attack: ~2-3× worse than clean.
Gap
Unfolded: 3-5× more robust than pure NN. Important for safety-critical applications.
Adversarial training
Include perturbations in training. Both NN types improve. Unfolded still wins by ~1 dB.
ex-otfs-ch21-13
Hard: A federated learning system for an OTFS NN detector has 500 UEs with non-i.i.d. data (different channel profiles). Describe challenges and mitigations.
Client drift, weighted averaging.
Challenge: client drift
Different UEs have different channel distributions. Local gradients point in different directions. Naive averaging causes NN to zig-zag.
Mitigation: weighted averaging
Weight updates by data size or loss. Prioritize high-quality clients.
Mitigation: personalization
Shared global model + per-UE personalization head. UE-specific adaptation without losing global convergence.
Mitigation: regularization
Proximal term in the local objective (FedProx): $\frac{\mu}{2}\|\theta - \theta_{\mathrm{global}}\|^2$. Keeps each UE's update close to the global model.
Result
Convergence to a model that works well across most UEs. Each UE fine-tunes locally for its specific channel.
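The two quantitative mitigations can be sketched in a few lines (client counts, sample sizes, and $\mu$ are illustrative assumptions):

```python
import numpy as np

# Weighted FedAvg aggregation: average client models weighted by local dataset
# size, a standard mitigation for non-i.i.d. clients.
rng = np.random.default_rng(4)
n_clients, dim = 5, 8
client_models = rng.normal(size=(n_clients, dim))
n_samples = np.array([100, 400, 50, 250, 200])     # per-client data sizes

weights = n_samples / n_samples.sum()
global_model = weights @ client_models             # convex combination of models

# FedProx-style proximal penalty (mu/2) * ||theta - theta_global||^2 added to
# each local objective, pulling local solutions back toward the global model.
mu = 0.1
theta_local = client_models[0]
prox = 0.5 * mu * np.sum((theta_local - global_model) ** 2)
print(weights.sum(), prox > 0)
```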
ex-otfs-ch21-14
Hard: Compare the compute complexity of classical MP, unfolded MP, CNN, and transformer-based OTFS detectors at $MN = 10^4$ delay-Doppler symbols.
Per-frame ops for each.
Classical MP
With $n_{\mathrm{iter}}$ iterations over $MN = 10^4$ symbols and $P$ paths: $O(n_{\mathrm{iter}} \cdot MN \cdot P)$ ops/frame — on the order of $10^6$ for typical iteration and path counts.
Unfolded MP
Same structure plus per-iteration learned parameters: ~1.5× the classical MP ops/frame.
CNN
3 conv layers × 32 filters × 9-tap kernels × $MN = 10^4$ positions: 3 × 32 × 9 × 10⁴ ≈ 8.6 × 10⁶ ops/frame.
Transformer
Self-attention scales as $O((MN)^2)$: at $MN = 10^4$ the attention score matrix alone has $10^8$ entries, and with feature dimensions included the cost reaches roughly 100× the CNN's ops/frame.
Comparison
MP ≈ unfolded MP < CNN << transformer. Choice: performance vs compute budget. URLLC favors unfolded; best-performance (offline) favors transformer.
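The ordering above follows from a rough ops count at $MN = 10^4$; the path count and MP iteration count below are assumed typical values, not from the exercise:

```python
# Rough ops-per-frame comparison at MN = 1e4 DD symbols.
MN = 10**4
P, n_iter = 6, 10                          # assumed paths / MP iterations

mp_ops = n_iter * MN * P                   # classical MP: O(n_iter * MN * P)
unfolded_ops = int(1.5 * mp_ops)           # ~1.5x overhead for learned parameters
cnn_ops = 3 * 32 * 9 * MN                  # 3 conv layers, 32 filters, 3x3 kernels
attn_ops = MN**2                           # self-attention score matrix alone

print(mp_ops, unfolded_ops, cnn_ops, attn_ops)
# 600000 900000 8640000 100000000
```

Even counting only the attention score matrix, the transformer sits two orders of magnitude above the CNN, matching the MP ≈ unfolded MP < CNN << transformer ordering.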
ex-otfs-ch21-15
Hard: Design an NN OTFS channel estimator that exploits DD-domain sparsity.
Sparse prior, L1 regularization.
Sparse prior
The channel has only a handful of dominant paths, so the DD channel tensor is ~99% zeros.
NN architecture
Input: received DD samples (2 channels for I/Q). CNN: 3 layers, attention over DD. Output: sparse channel tensor.
Loss
$\mathcal{L} = \|\mathbf{h} - \hat{\mathbf{h}}\|_2^2 + \lambda\|\hat{\mathbf{h}}\|_1$ (the $\ell_1$ regularization encourages sparsity).
Performance
Compared to OMP: slightly better MSE, same compute. Compared to pure LS: 3 dB better. Benefits from joint training with detector for end-to-end optimization.
Deployment
Pre-trained on simulation, fine-tuned on deployment. Handles fractional Doppler structure automatically.
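The effect of the $\ell_1$ term can be illustrated with one proximal (ISTA) step, the classical analogue of the sparsity-encouraging loss the NN is trained with (identity observation model and all sizes are illustrative):

```python
import numpy as np

# One ISTA step on an l1-regularized LS objective: the proximal operator of
# lambda * ||h||_1 is elementwise soft-thresholding.
rng = np.random.default_rng(5)
n = 64
h_true = np.zeros(n)
h_true[rng.choice(n, 4, replace=False)] = rng.normal(size=4)  # 4-path DD channel
y = h_true + 0.01 * rng.normal(size=n)     # noisy observation (identity model)

lam = 0.05
soft = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
h_hat = soft(y, lam)                       # soft-thresholding kills noise-only cells

print(np.count_nonzero(h_hat), np.count_nonzero(h_true))
```

Cells carrying only noise fall below the threshold and are zeroed, recovering the sparse support that the NN's attention layers learn to exploit.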
ex-otfs-ch21-16
Hard: Assess the standardization prospects for learned pilots in 6G. What are the 3GPP considerations?
Interoperability, testing, IPR.
Interoperability
Learned pilots are UE-capability-specific. Need standardized pilot-template exchange via RRC. Rel. 21 expected.
Testing
3GPP test: demonstrate learned pilots outperform classical across diverse profiles. Requires large simulation campaigns.
IPR
Learned pilot algorithms are somewhat patent-protected (Ma-Wang-Caire + CommIT). FRAND licensing expected.
Backward compat
Legacy UEs: fall back to classical pilots. Rel. 21 UEs: optional learned.
Timeline
Rel. 20 (2026-2028): study item. Rel. 21 (2028-2030): spec. Rel. 22 (2030+): deployment.