Exercises

ex-mimo-ch25-01

Easy

Under what conditions on the channel model and the noise model is the linear MMSE channel estimator provably Bayes-optimal? Give the precise statement and identify the two assumptions that must hold.
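As a numerical companion (not part of the exercise statement), the following sketch assumes a toy jointly Gaussian model with an identity observation matrix and a known covariance $R$; in that setting the linear filter $W = R(R + \sigma^2 I)^{-1}$ is the LMMSE estimator, and its empirical MSE can be compared against using the noisy observation directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, sigma2 = 8, 2000, 0.1

# Assumed toy model: h ~ N(0, R), observed as y = h + noise (identity observation)
U = rng.standard_normal((n, n))
R = U @ U.T / n + 0.01 * np.eye(n)        # channel covariance, assumed known
L = np.linalg.cholesky(R)
h = L @ rng.standard_normal((n, m))       # m independent channel realizations
y = h + np.sqrt(sigma2) * rng.standard_normal((n, m))

# LMMSE filter; Bayes-optimal exactly when channel and noise are jointly Gaussian
W = R @ np.linalg.inv(R + sigma2 * np.eye(n))
h_hat = W @ y

mse_lmmse = np.mean((h_hat - h) ** 2)
mse_raw = np.mean((y - h) ** 2)           # baseline: use y itself as the estimate
```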

ex-mimo-ch25-02

Easy

Compute the Gaussian rate-distortion function $R(D)$ for a 64-dimensional channel with eigenvalues $\lambda_i = 2^{-(i-1)}$ for $i = 1, \ldots, 64$ (geometric decay) at distortion target $D = 0.1 \sum_i \lambda_i$. How many bits per channel instance are required?
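A numeric sanity check via reverse water-filling may help; this sketch assumes the standard parametric form $R = \sum_i \max(0, \tfrac{1}{2}\log_2(\lambda_i/\theta))$ with the water level $\theta$ solving $\sum_i \min(\theta, \lambda_i) = D$, found here by bisection.

```python
import numpy as np

# Eigenvalues lambda_i = 2^{-(i-1)}, i = 1..64, distortion D = 0.1 * sum(lambda)
lam = 2.0 ** -np.arange(64)
D = 0.1 * lam.sum()

# Bisection on theta: sum_i min(theta, lam_i) is increasing in theta
lo, hi = 0.0, lam.max()
for _ in range(100):
    theta = 0.5 * (lo + hi)
    if np.minimum(theta, lam).sum() > D:
        hi = theta
    else:
        lo = theta

# Rate in bits: only eigenvalues above the water level contribute
active = lam > theta
R = 0.5 * np.log2(lam[active] / theta).sum()
```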

ex-mimo-ch25-03

Medium

Explain why CsiNet preprocesses the channel by taking the IFFT to the delay domain before applying convolutional layers. Identify the physical prior being injected, and describe what happens to NMSE if the IFFT preprocessing is removed.
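A quick numeric illustration of the prior in question (using a hypothetical 3-tap channel, not a model from the text): a channel with short delay spread is dense across subcarriers but sparse in the delay domain, which is the structure the IFFT exposes to the convolutional layers.

```python
import numpy as np

rng = np.random.default_rng(2)
Nf = 64

# Hypothetical 3-tap channel: delay spread much shorter than the Nf-point grid
taps = np.zeros(Nf, dtype=complex)
taps[[0, 2, 5]] = rng.standard_normal(3) + 1j * rng.standard_normal(3)

H_freq = np.fft.fft(taps)        # frequency response: dense across subcarriers
h_delay = np.fft.ifft(H_freq)    # IFFT back to the delay domain: sparse again

energy = np.abs(h_delay) ** 2
frac = np.sort(energy)[-3:].sum() / energy.sum()   # energy in 3 strongest bins
```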

ex-mimo-ch25-04

Medium

A 64-beam mmWave BS uses an LSTM beam predictor trained on vehicular-speed traces. In a deployment where half the UEs are vehicular and half are pedestrian, the deployed network reaches 90 % top-5 accuracy on vehicular UEs and only 60 % on pedestrians. Diagnose the failure and propose a fix that stays within the data-driven paradigm.

ex-mimo-ch25-05

Medium

Describe the reward gaming failure mode in the "maximize sum-rate" formulation of RL power control and propose three different reward modifications that prevent it. Rank the modifications by how much they sacrifice peak efficiency versus fairness.

ex-mimo-ch25-06

Medium

Unroll 5 iterations of ISTA for sparse channel recovery. Write the explicit layer-by-layer forward pass of the deep-unfolded network and identify the trainable parameters at each layer. How many trainable parameters are there in total for a 32-subcarrier problem?
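One possible parameterization as a starting point (a sketch; the choice of a scalar per-layer step and threshold is an assumption, and richer LISTA variants also learn the matrices): five layers with trainable $(\alpha_k, \theta_k)$ gives 10 scalar parameters, independent of the subcarrier count.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the proximal operator of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unfolded_ista(y, A, steps, thresholds):
    """K-layer unfolded ISTA; layer k applies trainable (step_k, threshold_k)."""
    x = np.zeros(A.shape[1])
    for alpha, theta in zip(steps, thresholds):
        x = soft(x + alpha * A.T @ (y - A @ x), theta)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 32)) / np.sqrt(16)   # measurement matrix, 32 unknowns
x_true = np.zeros(32)
x_true[[3, 17]] = [1.0, -0.5]                     # 2-sparse channel (hypothetical)
y = A @ x_true

# Classical-ISTA initialization: alpha_k = 1/L and theta_k = lam/L at every layer,
# so the untrained network reproduces 5 plain ISTA iterations
L = np.linalg.norm(A, 2) ** 2
x_hat = unfolded_ista(y, A, steps=[1.0 / L] * 5, thresholds=[0.01 / L] * 5)
```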

ex-mimo-ch25-07

Hard

Prove that the PPO clipped surrogate objective reduces to the vanilla policy gradient when the clip parameter $\epsilon$ tends to infinity. Comment on why this limit is not used in practice.
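As a starting point, recall the clipped surrogate in its standard PPO form:

$$L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\big(r_t(\theta)\,\hat{A}_t,\; \mathrm{clip}(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon)\,\hat{A}_t\big)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}.$$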

ex-mimo-ch25-08

Hard

Show formally that a deep-unfolded $K$-layer network, initialized at the classical ISTA parameter values, reproduces $K$ iterations of classical ISTA exactly. Conclude that the trained network's performance is never worse than $K$ steps of ISTA, provided the training loss is non-increasing over the course of training.
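For reference, the classical ISTA update and a generic unfolded layer, side by side (notation assumed: $S_\tau$ is soft-thresholding, $L$ a Lipschitz constant of the least-squares gradient):

$$x^{(k+1)} = S_{\lambda/L}\!\left(x^{(k)} + \tfrac{1}{L}\, A^{\mathsf{H}}\big(y - A x^{(k)}\big)\right), \qquad x^{(k+1)} = S_{\theta_k}\!\left(x^{(k)} + \alpha_k\, A^{\mathsf{H}}\big(y - A x^{(k)}\big)\right),$$

which coincide at the initialization $\alpha_k = 1/L$, $\theta_k = \lambda/L$ for all $k$.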

ex-mimo-ch25-09

Hard

Assume the optimal CSI feedback codec operates exactly on the Gaussian rate-distortion curve $R(D)$ of Theorem 25.1. For a channel covariance with eigenvalues decaying as $\lambda_i \propto i^{-\alpha}$, derive how $R(D)$ scales with $D$ in the regime where only a vanishing fraction of eigenvalues lies above the water level.
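Presumably Theorem 25.1 takes the standard reverse water-filling form for a Gaussian source, which is the natural starting point for the derivation:

$$R(D) = \sum_i \max\!\left(0, \tfrac{1}{2}\log_2\frac{\lambda_i}{\theta}\right), \qquad \sum_i \min(\theta, \lambda_i) = D.$$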

ex-mimo-ch25-10

Medium

Explain the concept of "hybrid deployment with safe fallback" as advocated by the 6G@TU Berlin / Huawei workshop. Design a concrete fallback trigger for a deep-unfolded channel estimator: what measurable quantity would you threshold, and what is the threshold value?

ex-mimo-ch25-11

Easy

List three physical-layer tasks where pure data-driven deep learning is a poor choice and three where it is a reasonable default. Justify each briefly.

ex-mimo-ch25-12

Challenge

A CsiNet codec achieves NMSE $= -18$ dB at 128 bits on its training distribution (CDL-C). On an unseen CDL-D deployment it degrades to $-11$ dB at the same bit budget. A deep-unfolded OAMP-Net with 12 trainable parameters achieves $-15$ dB on CDL-C and $-14$ dB on CDL-D at 128 bits. Which architecture is preferable for a commercial deployment seeing a mix of CDL-A/B/C/D scenarios, and why? Quantify the expected worst-case NMSE under each choice.

ex-mimo-ch25-13

Medium

Derive the expected overhead saving of an LSTM beam predictor versus an exhaustive beam sweep, as a function of the codebook size $|\mathcal{B}|$ and the top-$k$ accuracy $p_k$. Assume that a fallback to exhaustive search is triggered whenever the true beam is not in the top-$k$.
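Under one possible fallback cost model (an assumption: on a top-$k$ miss the UE re-sweeps all $|\mathcal{B}|$ beams after the $k$ predicted ones), the expected measurement count reduces to a two-case average:

```python
def expected_measurements(B, k, p_k):
    """Expected beams measured: k on a hit (prob p_k),
    k + B on a miss (fallback to an exhaustive sweep of all B beams)."""
    return k * p_k + (k + B) * (1 - p_k)

# Hypothetical numbers matching the setting of ex-mimo-ch25-04
B, k, p_k = 64, 5, 0.9
saving = 1 - expected_measurements(B, k, p_k) / B   # fractional overhead saving
```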

ex-mimo-ch25-14

Medium

In the context of wireless RL, explain the simulation-to-reality gap and list three concrete reasons why a policy trained on a 3GPP system-level simulator may fail on a real gNB. Propose one mitigation for each reason.

ex-mimo-ch25-15

Challenge

Consider a Transformer-based CSI encoder with attention across all subcarriers. Compute the number of multiply-accumulate operations per CSI update for $N_f = 64$ subcarriers, $N_t = 64$ antennas, embedding dimension $d = 128$, and one attention layer. Compare with a CsiNet (convolutional) encoder on the same problem and identify which is deployable on a handset NPU with a 1 GMAC/s budget.
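A counting sketch for the attention part, under stated assumptions: sequence length $N = N_f = 64$ tokens of width $d = 128$, one layer with four $d \times d$ dense projections (Q, K, V, output); the input embedding of the $N_t$-dimensional antenna vector and any MLP block are not counted here.

```python
def attention_macs(N, d):
    """MACs for one self-attention layer on N tokens of width d:
    four d x d projections plus the two N x N attention matrix products."""
    projections = 4 * N * d * d   # Q, K, V, and output projections
    scores = N * N * d            # Q @ K^T
    values = N * N * d            # attn_weights @ V
    return projections + scores + values

macs = attention_macs(64, 128)    # one layer, 64 subcarrier tokens, d = 128
```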