Exercises
ex32-01-sim2real-sources
Easy: List 5 sources of the sim-to-real gap for a learned OFDM radar imaging system trained on point-scatterer simulation and deployed on a TI IWR6843 in an office. For each, estimate the approximate PSNR degradation (in dB).
Think about model mismatch, hardware, and environment.
Sources and estimates
(1) Model mismatch (Born vs multipath): dB. Multipath creates ghost targets not in the model. (2) Off-grid targets (continuous positions vs grid): dB. Basis mismatch in sparse recovery. (3) Phase noise (oscillator imperfections): dB. Broadens the PSF slightly. (4) Mutual coupling (unmodelled antenna interaction): dB. Distorts the array pattern. (5) Clutter (furniture, walls not modelled as targets): dB. Elevates the noise floor. Total: dB, consistent with the typical 10--15 dB sim-to-real gap.
ex32-02-dynamic-prior
Easy: A radar images a scene at 10 fps. Between consecutive frames, one person walks 0.15 m. Which temporal prior (smoothness, sparse innovation, or optical flow) is most appropriate? Justify.
Consider the nature of the change: smooth, sparse, or motion-based?
Analysis
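A walking person shifts by 0.15 m/frame. This is a coherent rigid-body motion best modelled by optical flow: $x_{t+1}(\mathbf{r}) \approx x_t(\mathbf{r} - \mathbf{v}\,\Delta t)$ with $|\mathbf{v}| = 0.15\ \text{m} \times 10\ \text{fps} = 1.5$ m/s.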
Smoothness: inappropriate because the change is spatially localised. Sparse innovation: partially appropriate (few pixels change) but ignores the motion structure. Optical flow: most appropriate because it directly models the translational displacement.
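The comparison of the three priors can be checked numerically on a toy frame pair. The sketch below is only an illustration: the 1D scene, the 5 cm grid, and the 0.5 m wide block standing in for the person are assumptions, not part of the exercise.

```python
import numpy as np

GRID_M = 0.05          # assumed pixel size (hypothetical, not from the exercise)
SHIFT_M = 0.15         # displacement per frame, from the exercise
FRAME_RATE_HZ = 10.0   # frame rate, from the exercise

speed_m_per_s = SHIFT_M * FRAME_RATE_HZ      # 1.5 m/s
shift_px = round(SHIFT_M / GRID_M)           # displacement in pixels

# Toy 1D scene: the person is a 0.5 m wide block of unit reflectivity.
frame0 = np.zeros(100)
frame0[40:50] = 1.0
frame1 = np.roll(frame0, shift_px)           # same block, translated

diff = frame1 - frame0
# Smoothness prior penalises the raw frame difference.
smoothness_cost = float(np.sum(diff ** 2))
# Sparse-innovation prior counts the pixels that changed.
innovation_nnz = int(np.count_nonzero(diff))
# Optical-flow prior penalises the difference after warping frame0 by the flow.
flow_residual = float(np.sum((frame1 - np.roll(frame0, shift_px)) ** 2))

print(speed_m_per_s, smoothness_cost, innovation_nnz, flow_residual)
```

In this toy case the flow-warped residual vanishes, while the smoothness and sparse-innovation costs are both nonzero because they treat the leading and trailing edges of the block as unrelated changes.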
ex32-03-primitive-count
Easy: A conference room contains 4 walls, a table (box), 8 chairs (simplified as boxes), a projector screen (plane), and a cylindrical pillar. Compute the total parameter count for a primitive representation and the compression ratio vs a m voxel grid at 5 cm resolution.
Each primitive: 7 parameters (3 position, 3 scale, 1 reflectivity).
Parameter count
Primitives: 4 walls + 1 table + 8 chairs + 1 screen + 1 pillar = 15 primitives. Parameters: $15 \times 7 = 105$.
Voxel count
Voxels: .
Compression ratio
. The primitive representation is nearly four orders of magnitude more compact.
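The bookkeeping in this exercise can be scripted as a quick check. In the sketch below the primitive counts, the 7 parameters per primitive, and the 5 cm resolution come from the exercise; the room dimensions `ROOM_CM` are a hypothetical placeholder chosen only to make the script runnable, so the printed voxel count and ratio are illustrative rather than the exercise's own figures.

```python
PARAMS_PER_PRIMITIVE = 7   # 3 position + 3 scale + 1 reflectivity (from the hint)
RESOLUTION_CM = 5          # voxel edge length, from the exercise

primitives = {"wall": 4, "table": 1, "chair": 8, "screen": 1, "pillar": 1}
n_primitives = sum(primitives.values())
n_params = n_primitives * PARAMS_PER_PRIMITIVE

def voxel_count(dims_cm, res_cm=RESOLUTION_CM):
    """Number of voxels in a box with integer side lengths given in centimetres."""
    count = 1
    for side in dims_cm:
        count *= side // res_cm
    return count

# Hypothetical room size (length, width, height) in cm. This is an assumption;
# substitute the dimensions given in the exercise statement.
ROOM_CM = (600, 500, 300)
n_voxels = voxel_count(ROOM_CM)
compression_ratio = n_voxels / n_params

print(n_primitives, n_params, n_voxels, round(compression_ratio))
```

Working in integer centimetres avoids floating-point rounding when dividing side lengths by the resolution.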
ex32-04-claim-analysis
Easy: A paper abstract claims: "We propose DeepRadar, a novel framework that achieves state-of-the-art performance, outperforming all existing methods by 8 dB PSNR on the RadarScenes dataset." Identify 4 questions you would ask before accepting this claim.
Consider baselines, dataset, and statistical rigour.
Questions
(1) What are the "existing methods" compared? Are they fairly tuned, or default parameters? (2) Is the 8 dB consistent across different scenes, or driven by easy cases? Are confidence intervals reported? (3) What is the train/test split? Is there scene overlap? (4) Is RadarScenes a standard benchmark, or curated by the authors? Is it publicly available?
ex32-05-domain-adaptation
Medium: You have a learned SAR imaging network trained on 50,000 simulated scenes and 10 real measured scenes. Design three domain adaptation strategies and predict which achieves the best real-data performance.
Consider fine-tuning, adversarial adaptation, and self-supervised.
Strategy 1: Fine-tuning
Pre-train on 50k simulations. Fine-tune on 10 real scenes with reduced learning rate () for 1,000 iterations. Risk: overfitting. Mitigate: early stopping, weight decay. Expected: 5--7 dB recovery.
Strategy 2: Adversarial
Domain discriminator distinguishing sim from real features. Train imaging network to fool it. Expected: 3--5 dB recovery (unstable with few real examples).
Strategy 3: Self-supervised
Measurement consistency: hold out 20% of measurements per scene, reconstruct from 80%, predict held-out. No labels needed. Expected: 7--10 dB recovery (best).
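The hold-out idea behind Strategy 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions: a random real-valued matrix `A` stands in for the sensing operator, and a ridge least-squares solve (`ridge_reconstruct`, a hypothetical helper) stands in for the imaging network. Only the 80/20 split and the held-out prediction loss mirror the strategy described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear measurement model y = A x + noise (sizes are arbitrary assumptions).
M, N = 100, 20
A = rng.standard_normal((M, N))
x_true = rng.standard_normal(N)
y = A @ x_true + 0.01 * rng.standard_normal(M)

# Hold out 20% of the measurements; reconstruct from the remaining 80%.
perm = rng.permutation(M)
n_held = M // 5
held, kept = perm[:n_held], perm[n_held:]

def ridge_reconstruct(A_sub, y_sub, reg=1e-3):
    """Stand-in for the imaging network: ridge-regularised least squares."""
    n = A_sub.shape[1]
    return np.linalg.solve(A_sub.T @ A_sub + reg * np.eye(n), A_sub.T @ y_sub)

x_hat = ridge_reconstruct(A[kept], y[kept])

# Self-supervised loss: relative error when predicting the held-out measurements.
# No ground-truth image is needed to evaluate it.
heldout_loss = np.linalg.norm(A[held] @ x_hat - y[held]) / np.linalg.norm(y[held])
print(f"held-out relative error: {heldout_loss:.4f}")
```

In a real adaptation loop this held-out loss would be back-propagated through the network; the closed-form solve here only shows which quantities play the role of input and target.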
Prediction
Strategy 3 (self-supervised) is likely best because it avoids both the label bottleneck and adversarial instability. A combined pipeline (pre-train on simulation, then self-supervised adaptation, then fine-tune on the labelled real scenes) may be optimal.
ex32-06-4d-nerf
Medium: Extend the RF-NeRF framework to 4D (space + time). Describe: (1) the modified MLP input/output; (2) how to handle time; (3) training data requirements; (4) main challenges vs 3D RF-NeRF.
The simplest approach: condition the MLP on a time code.
MLP modification
Input: where is positional encoding, with frequency bands for time. Alternatively, use a per-frame latent code (auto-decoder).
Time handling
Option A: single MLP conditioned on (smooth changes). Option B: deformation field warping a canonical frame (better for rigid motion).
Data requirements
3D RF-NeRF: viewpoints at one time. 4D: viewpoints at time steps measurement sets. Typical: , 500 CSI snapshots.
Challenges
(1) more data. (2) Temporal aliasing if motion exceeds frame rate. (3) training cost. (4) Ambiguity: new object vs moved object.
ex32-07-cross-modal
Medium: A cross-modal foundation model is pre-trained on 1 million paired (optical, RF channel) samples from simulation. Describe how to use it for RF imaging in a new building with no optical images.
The embedding captures scene structure independent of modality.
Approach
(1) Collect RF measurements in the new building. (2) Encode into shared embedding: . (3) Condition decoder on embedding: . The embedding encodes scene-level features learned cross-modally.
Why it works
The embedding captures scene types (open plan, partitioned, few/many scatterers) common across modalities. Even without optical images, the RF embedding provides a useful prior.
Limitation
Novel scene types absent from pre-training may produce misleading embeddings. The model should report uncertainty (distance from nearest training embedding).
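The distance-to-nearest-training-embedding check can be sketched as follows. The embedding dimension, the random stand-in embeddings, and the leave-one-out rule for choosing the threshold are all assumptions made for illustration; they are not prescribed by the exercise.

```python
import numpy as np

def novelty_score(query, train_embeddings):
    """Euclidean distance from a query embedding to its nearest training embedding."""
    dists = np.linalg.norm(train_embeddings - query, axis=1)
    return float(dists.min())

def loo_threshold(train_embeddings, quantile=0.95):
    """Threshold from leave-one-out nearest-neighbour distances within the training set."""
    diffs = train_embeddings[:, None, :] - train_embeddings[None, :, :]
    d = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(d, np.inf)          # ignore each point's distance to itself
    return float(np.quantile(d.min(axis=1), quantile))

rng = np.random.default_rng(0)
train = rng.standard_normal((200, 8))    # stand-in for pre-training embeddings
threshold = loo_threshold(train)

familiar = train[0]                      # a scene type seen during pre-training
novel = np.full(8, 10.0)                 # an embedding far from all training data

for name, z in [("familiar", familiar), ("novel", novel)]:
    score = novelty_score(z, train)
    print(name, round(score, 3), "flag" if score > threshold else "ok")
```

For a pre-training set of a million samples the brute-force pairwise distances would need to be replaced by an approximate nearest-neighbour index.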
ex32-08-pareto
Medium: An ISAC system allocates power between communication () and imaging (). Rate . Imaging PSNR dB. Plot the Pareto frontier for dB by varying .
Evaluate and at several values.
Evaluation
. Key points: : , dB. : , dB. : , dB. : , dB. : , .
Frontier shape
The frontier is convex. The "knee" at --: allocating 10--30% to communication costs only 0.5--3 dB imaging quality while providing 7--9 bits/s/Hz.
ex32-09-scalability
Medium: Estimate memory requirements for a 3DGS digital twin of a m campus. Propose a hierarchical scheme fitting within 32 GB GPU memory.
Divide into blocks; only render near the query point.
Naive estimate
Campus surface area m. At 1 Gaussian per 10 m surface: Gaussians. Memory: bytes MB. Actually feasible for storage, but rendering against all is expensive for many queries.
Hierarchical scheme
Divide into blocks of m. Each building block: Gaussians. Active zone (100 m radius): blocks Gaussians MB. Distant zone: path-loss model (no Gaussians). Stream blocks in/out as UEs move. Fits within 32 GB.
ex32-10-assumption-audit
Medium: Read the following signal model and list all assumptions (explicit and implicit): "We consider a MIMO radar with transmit and receive antennas observing point targets in the far field. The received signal is , where is the known sensing matrix, is the sparse scene, and ."
Count both stated and implied assumptions.
Explicit assumptions
(1) Point targets. (2) Far field. (3) Known . (4) Sparse scene ( targets). (5) AWGN noise. (6) Known .
Implicit assumptions
(7) : real-valued reflectivity (no phase). (8) On-grid targets. (9) Born approximation (linear model). (10) Stationary scene. (11) Narrowband (single ). (12) Perfect calibration.
ex32-11-baseline-fairness
Medium: A paper compares its deep unrolling method to: (1) matched filter, (2) LASSO with $\lambda = 0.1$, (3) ISTA with 50 iterations. Critique each baseline's fairness. Propose improvements.
Is each baseline given its best chance?
Matched filter
Fair as a lower bound but too weak for meaningful comparison. Keep but add stronger methods.
LASSO with $\lambda = 0.1$
Unfair: $\lambda = 0.1$ is arbitrary. Performance is highly sensitive to $\lambda$. Fix: select $\lambda$ by 5-fold cross-validation.
ISTA with 50 iterations
Potentially unfair: 50 may be insufficient. Fix: run to convergence () or max 500 iterations.
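Both fixes (a cross-validated $\lambda$ and ISTA run to convergence with a 500-iteration cap) can be sketched together. The example is real-valued for brevity, whereas radar data would be complex; the tolerance `tol=1e-6`, the $\lambda$ grid, and the toy problem sizes are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam, tol=1e-6, max_iter=500):
    """Minimise 0.5*||A x - y||^2 + lam*||x||_1, stopping on small relative change."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        x_new = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
        if np.linalg.norm(x_new - x) <= tol * max(np.linalg.norm(x), 1e-12):
            return x_new
        x = x_new
    return x

def cv_select_lambda(A, y, lam_grid, n_folds=5, seed=0):
    """Pick lam by K-fold cross-validation over measurement rows."""
    rng = np.random.default_rng(seed)
    rows = np.arange(A.shape[0])
    folds = np.array_split(rng.permutation(rows), n_folds)
    mean_err = []
    for lam in lam_grid:
        errs = []
        for held in folds:
            train = np.setdiff1d(rows, held)
            x_hat = ista(A[train], y[train], lam)
            errs.append(np.mean((A[held] @ x_hat - y[held]) ** 2))
        mean_err.append(float(np.mean(errs)))
    return lam_grid[int(np.argmin(mean_err))], mean_err

# Toy sparse recovery problem (sizes and amplitudes are arbitrary assumptions).
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 40))
x_true = np.zeros(40)
x_true[[3, 17, 29, 35]] = [1.5, -2.0, 1.0, 2.5]
y = A @ x_true + 0.05 * rng.standard_normal(60)

lam_grid = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
best_lam, cv_err = cv_select_lambda(A, y, lam_grid)
x_hat = ista(A, y, best_lam)
print("selected lambda:", best_lam)
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```

Reporting the selected $\lambda$ and the iteration budget alongside the baseline's score lets a reader verify that the comparison was not decided by an untuned hyperparameter.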
Missing baselines
Add: (4) ADMM (different optimiser), (5) LISTA (learned unrolling), (6) U-Net (learned non-physics). This spans classical to learned.
ex32-12-resolution-chart
Hard: A learned SAR imaging method claims super-resolution. Analyse: (1) is this physically possible? (2) When does super-resolution become hallucination? (3) How would you test whether it is genuine?
Super-resolution from priors depends on SNR and scene statistics.
Physical possibility
Yes, for sparse scenes. The Rayleigh limit assumes no prior; with point targets, the CRB on separation can be much smaller at sufficient SNR.
Hallucination boundary
Occurs when the prior dominates: (1) low SNR ( dB); (2) scene types not in training; (3) two targets closer than the limit may merge or split incorrectly.
Testing
(1) Resolution chart: vary target separation, plot . (2) Noise sensitivity: genuine super-resolution degrades as SNR falls; hallucination does not. (3) OOD test: test on extended targets (not in training). (4) Uncertainty map: high uncertainty near the resolution limit indicates a prior-dominated output.
ex32-13-fisher-information
Hard: For a linear array of elements at half-wavelength spacing imaging a 2D scene at 28 GHz with 200 MHz bandwidth, compute the Fisher information matrix and determine the range and cross-range CRB for a target at broadside, range 10 m.
The Fisher information is .
Use the resolution formulae for range and cross-range.
Range CRB
Range resolution: $\Delta r = c/(2B) = 0.75$ m. The CRB for range estimation of a single target: . At SNR $= 20$ dB (100): m mm.
Cross-range CRB
Array aperture: m. Cross-range resolution: m. CRB: m cm.
Interpretation
Range resolution is limited by bandwidth (0.75 m); cross-range by aperture (1.34 m). The CRBs show that at 20 dB SNR, we can localise a single target to cm accuracy -- much finer than the resolution cell. For multiple targets within one resolution cell, the CRB degrades and requires super-resolution.
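The two resolution figures quoted above can be reproduced with the standard formulae $\Delta r = c/(2B)$ and $\Delta x = \lambda R / D$. In the sketch below the carrier, bandwidth, and range come from the exercise; the aperture `APERTURE_M = 0.08` is an assumption, chosen because it reproduces the 1.34 m cross-range figure, and the function names are hypothetical.

```python
C = 299_792_458.0        # speed of light in m/s

CARRIER_HZ = 28e9        # from the exercise
BANDWIDTH_HZ = 200e6     # from the exercise
RANGE_M = 10.0           # from the exercise
APERTURE_M = 0.08        # assumed array aperture (not restated in this solution)

def range_resolution(bandwidth_hz):
    """Rayleigh range resolution c / (2B)."""
    return C / (2.0 * bandwidth_hz)

def cross_range_resolution(carrier_hz, range_m, aperture_m):
    """Cross-range resolution lambda * R / D for a real aperture of length D."""
    wavelength = C / carrier_hz
    return wavelength * range_m / aperture_m

delta_r = range_resolution(BANDWIDTH_HZ)
delta_x = cross_range_resolution(CARRIER_HZ, RANGE_M, APERTURE_M)
print(f"range resolution:       {delta_r:.2f} m")
print(f"cross-range resolution: {delta_x:.2f} m")
```

Both quantities are resolution limits, not estimation bounds; the CRB for a single target scales these down by a factor that grows with the square root of the SNR, which is why centimetre-level localisation is possible inside a cell of roughly a metre.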
ex32-14-primitive-optimisation
Hard: Formulate the gradient of the data-fidelity loss with respect to the position of a box primitive. Assume the box has half-extents and the forward model uses the Born approximation with far-field steering vectors.
The measurement response of a shifted box involves a phase term .
Differentiate the complex exponential.
Forward model
The measurement response of box at measurement point (wavenumber ): where the sinc arises from the Fourier transform of a rectangular function.
Loss gradient
Loss: where . Residual: . . . Therefore: .
ex32-15-imaging-capacity
Hard: Derive the imaging capacity for a MIMO radar with measurements imaging an -voxel scene. Show that when has rank , the imaging capacity is where are the singular values.
Use the mutual information formula for Gaussian channels.
Mutual information
Model: with and . .
Covariance
. Using SVD : .
Capacity
where . This is identical to the MIMO capacity formula, confirming the deep connection.
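The equivalence between the log-determinant form of the mutual information and the sum over singular values can be verified numerically. The sketch assumes a linear Gaussian model with i.i.d. circularly symmetric Gaussian scene and noise, and writes `snr` for the ratio of prior variance to noise variance; these names and the toy rank-3 sensing matrix are assumptions for illustration.

```python
import numpy as np

def capacity_from_singular_values(A, snr):
    """Sum of log2(1 + snr * sigma_i^2) over the nonzero singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    s = s[s > 1e-12 * s.max()]              # keep the numerically nonzero ones
    return float(np.sum(np.log2(1.0 + snr * s ** 2))), int(s.size)

def capacity_from_logdet(A, snr):
    """log2 det(I + snr * A A^H), the Gaussian-channel mutual information in bits."""
    m = A.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(m) + snr * (A @ A.conj().T))
    return float(logdet / np.log(2.0))

rng = np.random.default_rng(0)
# Rank-3 complex sensing matrix with 8 measurements and 32 voxels (toy sizes).
B = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
V = rng.standard_normal((3, 32)) + 1j * rng.standard_normal((3, 32))
A = B @ V

snr = 10.0
cap_svd, rank = capacity_from_singular_values(A, snr)
cap_det = capacity_from_logdet(A, snr)
print(rank, cap_svd, cap_det)
```

Only the nonzero singular values contribute, so adding voxels that the array cannot distinguish (which leaves the rank unchanged) adds no imaging capacity.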
ex32-16-uncertainty
Hard: Design an uncertainty quantification method for a learned RF imaging system deployed in an autonomous vehicle. Provide: (1) per-pixel confidence; (2) overall reliability score; (3) failure detection mechanism.
Consider MC dropout and conformal prediction.
Per-pixel confidence
MC Dropout: run the network times with random dropout at inference. Per-pixel variance estimates epistemic uncertainty. Overhead: inference ( ms).
Reliability score
. If , the reconstruction is unreliable. Threshold calibrated on validation set.
Failure detection
Conformal prediction: conformity score . If (calibrated quantile), flag as failure. Distribution-free guarantee: failure rate . Total overhead: ms, acceptable for 10 Hz radar.
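The calibration step of the failure detector can be sketched with split conformal prediction. The score used here (mean per-pixel variance across stochastic forward passes) is a hypothetical choice, and the random arrays merely stand in for MC-dropout outputs; only the quantile rule is the standard split-conformal construction.

```python
import math
import numpy as np

def per_pixel_variance(samples):
    """Per-pixel variance across T stochastic forward passes, shape (T, H, W)."""
    return np.var(samples, axis=0)

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest calibration score."""
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return float("inf")     # too few calibration points for this alpha
    return float(scores[k - 1])

rng = np.random.default_rng(0)
ALPHA = 0.05

# Stand-in calibration scores, one per validation scene (assumed exchangeable).
cal_scores = rng.gamma(shape=2.0, scale=1.0, size=999)
q_hat = conformal_threshold(cal_scores, ALPHA)

# Score a new scene from T = 10 stand-in stochastic reconstructions.
samples = rng.standard_normal((10, 32, 32))
variance_map = per_pixel_variance(samples)
score = float(variance_map.mean())
print("threshold:", q_hat, "score:", score, "failure:", score > q_hat)
```

If the deployment scores are exchangeable with the calibration scores, a new in-distribution scene exceeds the threshold with probability at most `ALPHA`, so frequent flags indicate a distribution shift.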
ex32-18-ablation-design
Challenge: A paper proposes "RF-ResNet" for through-wall radar imaging, combining: (A) physics-informed residual network, (B) wall-clutter removal, (C) complex-valued convolutions, (D) frequency-aware positional encoding. Design a minimal ablation study with 6 variants.
Include single-component removals and key replacements.
Ablation table
| # | Variant | A | B | C | D |
|---|---|---|---|---|---|
| 1 | Full RF-ResNet | Y | Y | Y | Y |
| 2 | w/o wall removal (B) | Y | N | Y | Y |
| 3 | w/o complex conv (C $\to$ real) | Y | Y | N | Y |
| 4 | w/o freq encoding (D) | Y | Y | Y | N |
| 5 | w/o physics (A $\to$ plain ResNet) | N | Y | Y | Y |
| 6 | Minimal (plain ResNet, no B/D) | N | N | Y | N |
Expected insights
Comparing 1 vs 2: if wall removal contributes dB, pre-processing dominates. 1 vs 3: complex convolutions capture phase (expected for coherent imaging). 1 vs 5: value of physics-informed architecture. Variant 6: lower bound.
ex32-19-literature-survey
Challenge: You are writing the related work section for a paper on learned indoor imaging from Wi-Fi signals. Identify the 4 research communities whose work you should cite, list 2 key papers from each, and explain how each community's perspective differs.
RF imaging draws from signal processing, ML, wireless, and computational imaging.
Communities
(1) Signal processing: resolution limits, sparse recovery. Papers: Candes et al. (2014, super-resolution); Potter et al. (2010, CS-SAR). Perspective: guarantees, worst-case. (2) Machine learning: architectures, generalisation. Papers: Monga et al. (2021, unrolling); Ongie et al. (2020, deep inverse). Perspective: data-driven, empirical. (3) Wireless communications: channel models, ISAC. Papers: Liu et al. (2022, ISAC survey); Caire (2026, illumination model). Perspective: system-level, standards. (4) Computational imaging: physics-based reconstruction. Papers: Mildenhall et al. (2021, NeRF); Tulsiani et al. (2017, primitives). Perspective: novel view synthesis.
Why all matter
Missing any community risks reinventing techniques, wrong baselines, or ignoring constraints. Reviewers at TSP/JSAC span all four communities.
ex32-20-research-proposal
Challenge: Write a 1-page research proposal for a 3-year PhD project on one open problem from this chapter. Include: (1) problem statement; (2) three research questions; (3) proposed approach; (4) expected contributions; (5) timeline with milestones.
Choose a focused problem with clear deliverables.
Example: Primitive-Based RF Scene Reconstruction
Problem: Current RF imaging methods use millions of parameters (voxels, neural fields) to represent scenes with inherently low-dimensional geometric structure, leading to high computational cost and poor physical interpretability.
Research questions: (Q1) Can indoor RF scenes be decomposed into geometric primitives with reconstruction quality matching voxel-based methods? (Q2) What is the optimal primitive dictionary (box, cylinder, superquadric) for indoor RF imaging at sub-6 GHz and mmWave? (Q3) Can BIM-initialised primitive representations improve reconstruction accuracy and reduce the data requirement?
Approach: Year 1: Differentiable primitive rendering for RF. Greedy decomposition algorithm. Validation on 2D simulated scenes. Year 2: Extension to 3D. Neural primitive predictor from matched-filter images. Real-data validation on 5 rooms. Year 3: BIM integration. Multi-frequency primitive sharing. Campus-scale demonstration.
Expected contributions: (1) First primitive-based reconstruction for RF imaging with compression. (2) Differentiable RF primitive rendering library (open source). (3) BIM-integrated digital twin pipeline. (4) Dataset: 20 rooms with ground truth + BIM.
Timeline: M1-M6: literature, 2D implementation. M7-M12: first paper (2D results). M13-M18: 3D extension. M19-M24: real data, second paper. M25-M30: BIM integration. M31-M36: thesis, third paper.