Exercises
ex-ch24-01
Easy. Compute the prior information $J_P = \mathbb{E}\big[(\tfrac{d}{d\theta}\ln p(\theta))^2\big]$ for each of the following priors on $\theta \in \mathbb{R}$: (a) Gaussian $\mathcal{N}(0,\sigma_P^2)$; (b) Laplace with scale $b$; (c) Cauchy with scale $\gamma$. For each, check the Van Trees boundary condition at $\theta \to \pm\infty$ and comment on any subtlety.
Use $J_P = \int_{-\infty}^{\infty} \frac{(p'(\theta))^2}{p(\theta)}\,d\theta$, where the prime means $d/d\theta$; the integrand is the score squared times the density.
For the Laplace prior $p(\theta) = \tfrac{1}{2b}e^{-|\theta|/b}$, the score is piecewise constant.
For the Cauchy prior, compute the score $-2\theta/(\theta^2+\gamma^2)$ and integrate.
Gaussian prior
$p(\theta) \propto e^{-\theta^2/2\sigma_P^2}$, so the score is $-\theta/\sigma_P^2$ and $J_P = 1/\sigma_P^2$. The prior vanishes smoothly at infinity, so Van Trees applies directly.
Laplace prior
$p(\theta) = \tfrac{1}{2b}e^{-|\theta|/b}$, so the score is $-\operatorname{sign}(\theta)/b$ and $J_P = 1/b^2$. The density has a kink at $\theta = 0$ but is continuous and vanishes at $\pm\infty$, so the integration-by-parts boundary condition holds.
Cauchy prior
$p(\theta) = \frac{\gamma}{\pi(\theta^2+\gamma^2)}$, so the score is $-2\theta/(\theta^2+\gamma^2)$ and $J_P = \frac{4\gamma}{\pi}\int_{-\infty}^{\infty}\frac{\theta^2}{(\theta^2+\gamma^2)^3}\,d\theta = \frac{1}{2\gamma^2}.$ The Cauchy density vanishes at $\pm\infty$ (polynomially), so Van Trees still applies. Note that $\operatorname{Var}(\theta) = \infty$ yet $J_P$ is finite: the prior information does not require finite prior variance.
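The three closed forms can be checked by direct quadrature of $J_P = \int (p')^2/p$. A minimal numpy sketch (the scale values are arbitrary illustrations, and `prior_information` is a helper name introduced here):

```python
import numpy as np

def trapezoid(y, x):
    # version-safe trapezoidal rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def prior_information(score, pdf, grid):
    # J_P = E[score(theta)^2] = integral of score^2 * p over the grid
    return trapezoid(score(grid) ** 2 * pdf(grid), grid)

sigma, b, gam = 1.5, 0.7, 2.0                     # illustrative scales
theta = np.linspace(-200.0, 200.0, 2_000_001)

J_gauss = prior_information(
    lambda t: -t / sigma**2,
    lambda t: np.exp(-t**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2),
    theta)
J_laplace = prior_information(
    lambda t: -np.sign(t) / b,
    lambda t: np.exp(-np.abs(t) / b) / (2 * b),
    theta)
J_cauchy = prior_information(
    lambda t: -2 * t / (t**2 + gam**2),
    lambda t: gam / (np.pi * (t**2 + gam**2)),
    theta)
```

The three results should land on $1/\sigma_P^2$, $1/b^2$ and $1/(2\gamma^2)$ respectively, up to quadrature error.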
ex-ch24-02
Easy. In the Gaussian location model $y_i = \theta + w_i$, $i = 1,\dots,n$, with i.i.d. observations $w_i \sim \mathcal{N}(0,\sigma^2)$ and prior $\theta \sim \mathcal{N}(0,\sigma_P^2)$, the Van Trees bound is $\mathrm{MSE} \ge \big(n/\sigma^2 + 1/\sigma_P^2\big)^{-1}$. Show that the "effective sample size" of the Bayesian experiment equals the classical sample size $n$ plus a constant term, and identify that constant.
Write the bound in the form $\sigma^2/n_{\mathrm{eff}}$ and solve for $n_{\mathrm{eff}}$.
The prior-precision term is $1/\sigma_P^2$, which has units of inverse variance; it adds to $n/\sigma^2$ in the denominator of the bound.
Rewrite the bound
$\mathrm{MSE} \ge \big(\tfrac{n}{\sigma^2} + \tfrac{1}{\sigma_P^2}\big)^{-1}$. Factor out $\sigma^2$: $\mathrm{MSE} \ge \frac{\sigma^2}{n + \sigma^2/\sigma_P^2} = \frac{\sigma^2}{n_{\mathrm{eff}}}, \qquad n_{\mathrm{eff}} = n + \frac{\sigma^2}{\sigma_P^2}.$
Interpretation
The additive constant is $\sigma^2/\sigma_P^2$: the ratio of the noise variance to the prior variance. When the prior is as sharp as a single noisy measurement ($\sigma_P^2 = \sigma^2$), it counts like one extra observation. When the prior is $k$ times sharper, it counts like $k$ extra observations. This is exactly the sense in which "a tight prior is worth $\sigma^2/\sigma_P^2$ samples."
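In this conjugate Gaussian model the posterior mean attains the bound with equality, so a short Monte Carlo can confirm both the $n_{\mathrm{eff}}$ rewriting and the bound itself. A sketch, with illustrative values of $\sigma^2$, $\sigma_P^2$ and $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, sigma_p2, n, trials = 0.5, 0.1, 20, 200_000

vt_bound = 1.0 / (n / sigma2 + 1.0 / sigma_p2)    # Van Trees bound
n_eff = n + sigma2 / sigma_p2                     # claimed effective sample size

# conjugate model: the posterior mean attains the bound exactly
theta = rng.normal(0.0, np.sqrt(sigma_p2), trials)
ybar = theta + rng.normal(0.0, np.sqrt(sigma2 / n), trials)  # sample-mean statistic
post_mean = (n / sigma2) * ybar / (n / sigma2 + 1.0 / sigma_p2)
mse = float(np.mean((post_mean - theta) ** 2))
```

`vt_bound` equals $\sigma^2/n_{\mathrm{eff}}$ algebraically, and the empirical MSE of the posterior mean matches it to Monte Carlo accuracy.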
ex-ch24-03
Easy. Consider the translation-invariant time-of-arrival problem with uniform prior on $[0, A]$ and binary-error probability $P_{\min}(h) = Q(\alpha h)$ for a constant $\alpha$ (the effective per-unit-lag SNR). Using the uniform-prior form of the Ziv-Zakai bound, $\mathrm{MSE} \ge \frac{1}{A}\int_0^A h\,(A-h)\,\mathcal{V}\{P_{\min}(h)\}\,dh$, argue that valley-filling has no effect here, and write the bound as a single integral.
Is $P_{\min}(h) = Q(\alpha h)$ monotonic in $h$?
If the integrand is already non-increasing, the valley-filling operator $\mathcal{V}$ is the identity.
Monotonicity of $P_{\min}$
The Q-function $Q(x)$ is strictly decreasing in $x$. Since $\alpha h$ is increasing in $h$, $P_{\min}(h) = Q(\alpha h)$ is strictly decreasing in $h$. Therefore $\mathcal{V}\{P_{\min}\} = P_{\min}$.
Resulting integral
$\mathrm{MSE} \ge \frac{1}{A}\int_0^A h\,(A-h)\,Q(\alpha h)\,dh.$ This matches the Bellini-Tartara form for a Gaussian-shift problem with no autocorrelation sidelobes, which is expected: a pure translation model has no ambiguity-function structure other than its main lobe.
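Both claims are easy to confirm numerically: the sampled $Q(\alpha h)$ is non-increasing (so $\mathcal{V}$ is the identity), and when $\alpha A \gg 1$ the integral sits just below its broad-prior limit $1/(4\alpha^2)$. A sketch with illustrative $A$ and $\alpha$:

```python
import numpy as np
from math import erfc

def Q(x):
    # Gaussian tail function
    return 0.5 * erfc(x / np.sqrt(2.0))

A, alpha = 2.0, 5.0                       # prior width and per-unit-lag SNR
h = np.linspace(0.0, A, 200_001)
Pmin = np.vectorize(Q)(alpha * h)

monotone = bool(np.all(np.diff(Pmin) <= 0))   # V{P_min} = P_min

integrand = h * (A - h) * Pmin
zzb = float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(h)) / 2.0) / A
limit = 1.0 / (4 * alpha**2)              # broad-prior (high-SNR) limit
```

The finite-$A$ bound is slightly below $1/(4\alpha^2)$ because of the $(A-h)/A$ taper.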
ex-ch24-04
Easy. Verify the I-MMSE identity for the Gaussian input $X \sim \mathcal{N}(0,1)$ on the channel $Y = \sqrt{\gamma}\,X + N$, $N \sim \mathcal{N}(0,1)$. That is, compute $I(\gamma) = I(X;Y)$, compute $\mathrm{mmse}(\gamma)$, and confirm $\frac{dI}{d\gamma} = \tfrac{1}{2}\,\mathrm{mmse}(\gamma)$.
Use $I(\gamma) = \tfrac{1}{2}\ln(1+\gamma)$ (nats) for Gaussian input.
The MMSE estimator of $X$ given $Y$ for jointly Gaussian variables is linear; its error variance is $1/(1+\gamma)$.
Mutual information
For Gaussian input on a Gaussian channel, $I(\gamma) = \tfrac{1}{2}\ln(1+\gamma)$ nats.
MMSE
The MMSE estimator is $\hat{X} = \frac{\sqrt{\gamma}}{1+\gamma}\,Y$ with error variance $\mathrm{mmse}(\gamma) = \frac{1}{1+\gamma}$.
Verification
$\frac{dI}{d\gamma} = \frac{1}{2(1+\gamma)} = \tfrac{1}{2}\,\mathrm{mmse}(\gamma)$. The identity holds exactly.
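The identity can also be confirmed with a central finite difference, a useful template when $I(\gamma)$ has no closed form. A minimal sketch:

```python
import numpy as np

def I_nats(g):
    # mutual information of Gaussian input on Y = sqrt(g) X + N, in nats
    return 0.5 * np.log1p(g)

def mmse(g):
    return 1.0 / (1.0 + g)

gammas = np.linspace(0.1, 10.0, 50)
d = 1e-6
dI = (I_nats(gammas + d) - I_nats(gammas - d)) / (2 * d)   # central difference
gap = float(np.max(np.abs(dI - 0.5 * mmse(gammas))))
```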
ex-ch24-05
Easy. For the narrowband ISAC angle-estimation model with a uniform linear array of $N$ elements, unit transmit covariance $\mathbf{Q} = \mathbf{I}$, $L$ snapshots, complex reflectivity $\alpha$ and per-antenna noise variance $\sigma^2$, the angle CRB (in radians squared) reads $\mathrm{CRB}(\theta) \propto \frac{\sigma^2}{L\,|\alpha|^2\,\|\partial\mathbf{a}/\partial\theta\|^2}$ up to a constant. Using the standard steering-vector derivative for a half-wavelength ULA, show that $\mathrm{CRB}(\theta) = \Theta(N^{-3})$ in the large-array limit.
The steering vector is $\mathbf{a}(\theta) = [1,\, e^{j\pi\sin\theta},\, \ldots,\, e^{j\pi(N-1)\sin\theta}]^T$.
Compute $\partial\mathbf{a}/\partial\theta$ and use $\sum_{n=0}^{N-1} n^2 \approx N^3/3$.
Steering-vector derivative
$\frac{\partial\mathbf{a}}{\partial\theta} = j\pi\cos\theta\,[0,\,1,\,\ldots,\,N-1]^T \odot \mathbf{a}(\theta)$.
Squared norm
$\big\|\partial\mathbf{a}/\partial\theta\big\|^2 = \pi^2\cos^2\theta\sum_{n=0}^{N-1} n^2 \approx \frac{\pi^2\cos^2\theta}{3}\,N^3$ for large $N$.
CRB scaling
With $\sum_{n=0}^{N-1} n^2 \approx N^3/3$, $\|\partial\mathbf{a}/\partial\theta\|^2 \approx \tfrac{\pi^2\cos^2\theta}{3}N^3$, and the CRB inversely proportional to this norm. Plugging in: $\mathrm{CRB}(\theta) \propto \frac{3\,\sigma^2}{\pi^2\cos^2\theta\,L\,|\alpha|^2\,N^3} = \Theta(N^{-3}).$ The cubic gain is the classical super-resolution scaling of ULAs.
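The $N^3$ law is visible numerically: doubling $N$ multiplies $\|\partial\mathbf{a}/\partial\theta\|^2$ by roughly $8$, and the exact sum tracks $\pi^2\cos^2\theta\,N^3/3$. A sketch (the angle $\theta = 0.3$ is arbitrary):

```python
import numpy as np

def deriv_norm_sq(N, theta):
    # squared norm of da/dtheta for a half-wavelength ULA
    n = np.arange(N)
    a = np.exp(1j * np.pi * n * np.sin(theta))
    da = 1j * np.pi * np.cos(theta) * n * a      # elementwise derivative
    return float(np.vdot(da, da).real)

theta = 0.3
Ns = np.array([64, 128, 256, 512])
vals = np.array([deriv_norm_sq(N, theta) for N in Ns])
ratios = vals[1:] / vals[:-1]                    # doubling N -> approx 2^3 = 8
pred = (np.pi * np.cos(theta)) ** 2 * Ns.astype(float) ** 3 / 3.0
```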
ex-ch24-06
Medium. A uniform prior on $[-A, A]$ violates the boundary condition of the Van Trees inequality. Consider the smoothed approximation $p_\varepsilon(\theta) = c_\varepsilon$ for $|\theta| \le A$ and $p_\varepsilon(\theta) = c_\varepsilon\,e^{-(|\theta|-A)^2/2\varepsilon^2}$ for $|\theta| > A$, where $\varepsilon > 0$ and $c_\varepsilon$ normalises. Compute $J_P(\varepsilon)$ for this smoothed prior and show it diverges as $\varepsilon \to 0$. Interpret what this means for the Van Trees bound on a compactly-supported prior.
Outside $[-A, A]$ the prior is Gaussian-like with variance $\varepsilon^2$.
The score is nonzero only for $|\theta| > A$ and grows as $(|\theta|-A)/\varepsilon^2$.
Compute $J_P(\varepsilon)$ and track its $\varepsilon$-scaling.
Score of the smoothed prior
For $|\theta| > A$, the score is $-\operatorname{sign}(\theta)\,(|\theta|-A)/\varepsilon^2$. For $|\theta| \le A$, the score is zero. So $J_P(\varepsilon) = 2\int_A^\infty \frac{(\theta-A)^2}{\varepsilon^4}\,p_\varepsilon(\theta)\,d\theta.$
Asymptotic evaluation
Substitute $u = \theta - A$ in the right tail. The tail probability mass is $\Theta(\varepsilon/A)$ (boundary-layer width $\varepsilon$ against support $2A$), and within that layer the squared score is $\Theta(1/\varepsilon^2)$, so $J_P(\varepsilon) = \Theta\big(1/(A\varepsilon)\big) \to \infty$ as $\varepsilon \to 0$.
Interpretation
The prior information diverges as the prior becomes sharper at the boundary, which drives the Van Trees bound $\big(\bar{I}_F + J_P(\varepsilon)\big)^{-1}$ to zero: the bound collapses to a trivial statement. This is the mathematical signature of an ill-posed boundary condition: the smooth Van Trees bound is not the right tool for hard-support priors, and one must instead use the Gill-Levit constrained version or the Ziv-Zakai bound, which handle bounded supports natively.
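The $\Theta(1/(A\varepsilon))$ divergence can be checked by quadrature: halving $\varepsilon$ should roughly double $J_P(\varepsilon)$. A sketch with $A = 1$ and an illustrative $\varepsilon$ ladder:

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def J_smoothed(A, eps):
    # score is zero on [-A, A]; on the tails p is proportional to exp(-(|t|-A)^2/2 eps^2)
    u = np.linspace(0.0, 12.0 * eps, 200_001)          # tail coordinate u = theta - A
    tail = np.exp(-u**2 / (2.0 * eps**2))
    Z = 2.0 * A + 2.0 * trapz(tail, u)                 # flat part + two tails
    Jt = trapz((u / eps**2) ** 2 * tail, u)            # tail score^2 integral
    return 2.0 * Jt / Z

A = 1.0
eps_list = [0.1, 0.05, 0.025]
Js = [J_smoothed(A, e) for e in eps_list]
ratios = [Js[i + 1] / Js[i] for i in range(len(Js) - 1)]
```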
ex-ch24-07
Medium. For a vector parameter $\boldsymbol\theta \in \mathbb{R}^d$ with prior $p(\boldsymbol\theta)$ and likelihood $p(\mathbf{y}|\boldsymbol\theta)$, derive the matrix form of the Van Trees inequality: $\mathbb{E}\big[(\hat{\boldsymbol\theta}-\boldsymbol\theta)(\hat{\boldsymbol\theta}-\boldsymbol\theta)^T\big] \succeq \big(\mathbb{E}[\mathbf{I}_F(\boldsymbol\theta)] + \mathbf{J}_P\big)^{-1}$, where $\mathbf{J}_P = \mathbb{E}\big[\nabla_{\boldsymbol\theta}\ln p(\boldsymbol\theta)\,\nabla_{\boldsymbol\theta}\ln p(\boldsymbol\theta)^T\big]$. Outline the Cauchy-Schwarz step that produces the PSD inequality.
Start with the joint score $\mathbf{s}(\mathbf{y},\boldsymbol\theta) = \nabla_{\boldsymbol\theta}\ln p(\mathbf{y},\boldsymbol\theta)$.
Show $\mathbb{E}\big[(\hat{\boldsymbol\theta}-\boldsymbol\theta)\,\mathbf{s}^T\big] = \mathbf{I}$ via vector integration by parts.
Apply the matrix Cauchy-Schwarz: for zero-mean random vectors $\mathbf{u}, \mathbf{v}$, $\mathbb{E}[\mathbf{u}\mathbf{u}^T] \succeq \mathbb{E}[\mathbf{u}\mathbf{v}^T]\,\mathbb{E}[\mathbf{v}\mathbf{v}^T]^{-1}\,\mathbb{E}[\mathbf{v}\mathbf{u}^T]$.
Joint score
Let $\mathbf{s}(\mathbf{y},\boldsymbol\theta) = \nabla_{\boldsymbol\theta}\ln p(\mathbf{y},\boldsymbol\theta)$. The joint score splits as $\mathbf{s} = \nabla_{\boldsymbol\theta}\ln p(\mathbf{y}|\boldsymbol\theta) + \nabla_{\boldsymbol\theta}\ln p(\boldsymbol\theta)$. The two terms are uncorrelated (the conditional score has zero mean given $\boldsymbol\theta$), so $\mathbb{E}[\mathbf{s}\mathbf{s}^T] = \mathbb{E}[\mathbf{I}_F(\boldsymbol\theta)] + \mathbf{J}_P$.
Identity via integration by parts
For each pair of coordinates $(i,k)$, $\mathbb{E}\big[(\hat\theta_i - \theta_i)\,s_k\big] = \delta_{ik}$, since $\hat{\boldsymbol\theta}(\mathbf{y})$ does not depend on $\boldsymbol\theta$. The boundary term from integration by parts vanishes by the assumed decay of $p(\boldsymbol\theta)$. Stacking: $\mathbb{E}\big[(\hat{\boldsymbol\theta}-\boldsymbol\theta)\,\mathbf{s}^T\big] = \mathbf{I}$.
Matrix Cauchy-Schwarz
With $\mathbf{u} = \hat{\boldsymbol\theta}-\boldsymbol\theta$ and $\mathbf{v} = \mathbf{s}$, the matrix Cauchy-Schwarz inequality gives $\mathbb{E}[\mathbf{u}\mathbf{u}^T] \succeq \mathbf{I}\,\big(\mathbb{E}[\mathbf{I}_F] + \mathbf{J}_P\big)^{-1}\mathbf{I} = \big(\mathbb{E}[\mathbf{I}_F] + \mathbf{J}_P\big)^{-1}.$ Equality holds iff $\mathbf{u}$ and $\mathbf{v}$ are (matrix-) proportional a.s.
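In the linear Gaussian conjugate case the matrix bound holds with equality, which makes a quick Monte Carlo sanity check possible. A sketch with arbitrary diagonal covariances:

```python
import numpy as np

rng = np.random.default_rng(1)
d, trials = 3, 400_000

S0 = np.diag([1.0, 0.5, 2.0])        # prior covariance  (J_P = S0^{-1})
Sn = np.diag([0.3, 0.4, 0.2])        # observation noise covariance (I_F = Sn^{-1})
bound = np.linalg.inv(np.linalg.inv(S0) + np.linalg.inv(Sn))  # (I_F + J_P)^{-1}

theta = rng.normal(size=(trials, d)) * np.sqrt(np.diag(S0))
y = theta + rng.normal(size=(trials, d)) * np.sqrt(np.diag(Sn))
est = y @ (bound @ np.linalg.inv(Sn)).T   # posterior mean (conjugate case)
err = est - theta
emp = err.T @ err / trials                # empirical error covariance matrix
```

The empirical error covariance should match the matrix bound, and in particular `emp - bound` should have no significantly negative eigenvalue.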
ex-ch24-08
Medium. Consider TOA estimation with a pulse of RMS bandwidth $\beta$ (rad/s) and energy $E$, noise PSD $N_0/2$, and uniform prior on $[0, A]$. The Q-function argument at small $h$ is $\beta h\sqrt{E/(2N_0)}$. Show that as $E/N_0 \to \infty$, the uniform-prior ZZB converges to the CRLB $\frac{N_0}{2E\beta^2}$.
At high SNR, $Q\big(\beta h\sqrt{E/(2N_0)}\big)$ is sharply peaked at small $h$, so the factor $(A-h)/A$ is $\approx 1$ and the integration range extends to $\infty$.
Change variables $u = \beta h\sqrt{E/(2N_0)}$ and use $\int_0^\infty u\,Q(u)\,du = 1/4$.
High-SNR simplification
At high SNR, $Q\big(\beta h\sqrt{E/(2N_0)}\big)$ is supported on $h = O\big(1/(\beta\sqrt{E/N_0})\big) \ll A$, so $(A-h)/A \approx 1$ and we can extend the upper limit to $\infty$: $\mathrm{ZZB} \approx \int_0^\infty h\,Q\big(\beta h\sqrt{E/(2N_0)}\big)\,dh.$
Change of variables
Let $u = \beta h\sqrt{E/(2N_0)}$, so $h = \frac{u}{\beta}\sqrt{2N_0/E}$ and $dh = \frac{du}{\beta}\sqrt{2N_0/E}$. Hence $\mathrm{ZZB} \approx \frac{2N_0}{E\beta^2}\int_0^\infty u\,Q(u)\,du.$
Evaluating the integral
Integration by parts gives $\int_0^\infty u\,Q(u)\,du = \tfrac{1}{2}\int_0^\infty u^2\phi(u)\,du = \tfrac{1}{4}$. Therefore $\mathrm{ZZB} \to \frac{2N_0}{E\beta^2}\cdot\frac{1}{4} = \frac{N_0}{2E\beta^2},$ which is exactly the CRLB for TOA estimation. The ZZB reduces to the CRLB in the high-SNR regime, as required.
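The limit is easy to reproduce numerically: at high $E/N_0$ the truncated integral agrees with $N_0/(2E\beta^2)$ to within quadrature error. A sketch with illustrative $\beta$ and $E/N_0$:

```python
import numpy as np
from math import erfc

def Q(x):
    return 0.5 * erfc(x / np.sqrt(2.0))

beta = 2 * np.pi * 1.0e6          # RMS bandwidth in rad/s (illustrative)
E_over_N0 = 100.0                 # high SNR
a = beta * np.sqrt(E_over_N0 / 2.0)   # Q-argument slope from the derivation

h = np.linspace(0.0, 12.0 / a, 400_001)
integ = h * np.vectorize(Q)(a * h)
zzb = float(np.sum((integ[1:] + integ[:-1]) * np.diff(h)) / 2.0)
crlb = 1.0 / (2.0 * E_over_N0 * beta**2)   # N0/(2 E beta^2), E/N0 grouped
```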
ex-ch24-09
Medium. Sketch the MMSE curve $\mathrm{mmse}(\gamma)$ for BPSK input ($X = \pm 1$ equiprobable) on the channel $Y = \sqrt{\gamma}X + N$, and derive its limits at $\gamma \to 0$ and $\gamma \to \infty$. Using I-MMSE, express the BPSK capacity as an integral of the MMSE.
The MMSE at $\gamma = 0$ equals $\operatorname{Var}(X) = 1$; at $\gamma \to \infty$ it equals $0$.
The posterior mean is $\mathbb{E}[X\mid Y=y] = \tanh(\sqrt{\gamma}\,y)$; find $\mathrm{mmse}(\gamma) = 1 - \mathbb{E}[\tanh^2(\sqrt{\gamma}\,Y)]$ at the limits.
Low-SNR limit
$\mathrm{mmse}(0) = \operatorname{Var}(X) = 1$. Near $\gamma = 0$, a Taylor expansion of $\tanh^2$ gives $\mathrm{mmse}(\gamma) = 1 - \gamma + O(\gamma^2)$.
High-SNR limit
As $\gamma \to \infty$, $\tanh(\sqrt{\gamma}\,Y) \to X$ almost surely, so $\mathrm{mmse}(\gamma) \to 0$ exponentially fast. Specifically $\mathrm{mmse}(\gamma) = e^{-\gamma/2 + o(\gamma)}$ as $\gamma \to \infty$, governed by the Gaussian tail probability of sign-flip events.
Capacity via I-MMSE
By the I-MMSE identity, $C_{\mathrm{BPSK}}(\gamma) = \tfrac{1}{2}\int_0^\gamma \mathrm{mmse}(s)\,ds$. Numerically integrating the sigmoidal MMSE curve produces the BPSK capacity, saturating at $\ln 2$ nats. The sharp sigmoid transition in $\mathrm{mmse}(\gamma)$ corresponds to the steep rise of $I(\gamma)$ from zero toward its saturation.
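The limits and the capacity integral can be evaluated with Gauss-Hermite quadrature (conditioning on $X = +1$ by symmetry). A sketch; the quadrature order and SNR grid are arbitrary choices:

```python
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(120)   # rule for exp(-x^2) weight

def mmse_bpsk(g):
    # mmse(g) = 1 - E[tanh^2(sqrt(g) Y)]; by +-1 symmetry condition on X = +1
    y = np.sqrt(g) + np.sqrt(2.0) * nodes               # Y = sqrt(g) + N at the nodes
    return 1.0 - float(np.sum(weights * np.tanh(np.sqrt(g) * y) ** 2)) / np.sqrt(np.pi)

low = mmse_bpsk(1e-4)              # near Var(X) = 1
high = mmse_bpsk(25.0)             # exponentially small

gs = np.linspace(1e-6, 25.0, 5001)
m = np.array([mmse_bpsk(g) for g in gs])
trap = float(np.sum((m[1:] + m[:-1]) * np.diff(gs)) / 2.0)
C = 0.5 * trap                     # capacity in nats, should approach ln 2
```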
ex-ch24-10
Medium. A sparse Bernoulli-Gaussian input has $X = B\,G$ with $B \sim \mathrm{Bern}(\rho)$ and $G \sim \mathcal{N}(0, 1/\rho)$ independent (so $\mathbb{E}[X^2] = 1$). Compute $\mathrm{mmse}(0)$. Argue heuristically why $\mathrm{mmse}(\gamma)$ has an L-shape: a long plateau at low SNR followed by a sharp drop.
At $\gamma = 0$ the MMSE equals the variance of $X$: use the $1/\rho$ scaling chosen.
The detector must first "detect" whether $B = 1$ before estimating $G$; this is a phase transition in $\gamma$.
MMSE at zero SNR
$\mathrm{mmse}(0) = \operatorname{Var}(X) = \mathbb{E}[B]\cdot\operatorname{Var}(G) = \rho\cdot\tfrac{1}{\rho} = 1$.
L-shape heuristic
When $\gamma$ is small, the posterior probability of $\{B=1\}$ is flat across noise realisations and the estimator essentially guesses $\hat{X} \approx 0$, giving MMSE near $1$. Only when the per-active-component SNR $\gamma/\rho$ exceeds roughly $\ln(1/\rho)$ does the likelihood ratio for $B = 1$ exceed the noise floor: at that point, $B$ becomes reliably detectable and the posterior concentrates on the correct support. The MMSE then drops rapidly as the conditional Gaussian estimate of $G$ takes over. The plateau width scales like $\rho\ln(1/\rho)$ and the drop steepens as $\rho \to 0$, giving the characteristic L-shape of sparse inputs.
Connection to compressed sensing
This L-shape is the single-letter version of the phase transition observed in compressed-sensing reconstruction: at measurement density below a critical threshold, the MSE is stuck at the prior variance; above threshold, it drops to a denoising-limited floor. The I-MMSE integral converts this shape into a rate-versus-SNR curve with a matching phase transition.
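The plateau and drop can be computed exactly from the scalar posterior mean, since $p(y)$ is a two-component Gaussian mixture. A sketch with $\rho = 0.05$ and two illustrative SNRs, one below and one above the heuristic threshold $\rho\ln(1/\rho)$:

```python
import numpy as np

def mmse_bg(gamma, rho):
    # X = B*G, B ~ Bern(rho), G ~ N(0, 1/rho); mmse = 1 - int E[X|y]^2 p(y) dy
    v = 1.0 / rho
    s2 = 1.0 + gamma * v                          # Var(Y | B = 1)
    ymax = 10.0 * np.sqrt(s2)
    y = np.linspace(-ymax, ymax, 400_001)
    p0 = np.exp(-y**2 / 2.0) / np.sqrt(2 * np.pi)              # p(y | B = 0)
    p1 = np.exp(-y**2 / (2.0 * s2)) / np.sqrt(2 * np.pi * s2)  # p(y | B = 1)
    p = (1 - rho) * p0 + rho * p1
    post1 = rho * p1 / p                           # P(B = 1 | y)
    cond_mean = post1 * np.sqrt(gamma) * v * y / s2   # E[X | y]
    sq = cond_mean**2 * p
    return 1.0 - float(np.sum((sq[1:] + sq[:-1]) * np.diff(y)) / 2.0)

rho = 0.05
plateau = mmse_bg(0.05, rho)   # below the heuristic threshold: near the prior variance
dropped = mmse_bg(5.0, rho)    # well above threshold: near the genie-aided floor
```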
ex-ch24-11
Medium. A monostatic ISAC transmitter with $N$ antennas transmits a waveform with sample covariance $\mathbf{Q} \succeq 0$, $\operatorname{tr}(\mathbf{Q}) \le P$. For a single target at angle $\theta$, the angular Fisher information scales like $\dot{\mathbf{a}}^H\mathbf{Q}\,\dot{\mathbf{a}}$ with $\dot{\mathbf{a}} = \partial\mathbf{a}/\partial\theta$. For a single-user communication channel $\mathbf{h}$, the rate is $\log_2\!\big(1 + \mathbf{h}^H\mathbf{Q}\,\mathbf{h}/\sigma^2\big)$. Pose the rate-CRB Pareto optimisation as a Lagrangian in $\mathbf{Q}$. Identify the rank-1 extremes.
Maximise $(1-\lambda)\,R(\mathbf{Q}) + \lambda\,\dot{\mathbf{a}}^H\mathbf{Q}\,\dot{\mathbf{a}}$ subject to the power constraint.
At $\lambda = 0$, this is pure waterfilling toward $\mathbf{h}$: rank 1 along $\mathbf{h}$.
At $\lambda = 1$, it aligns along $\dot{\mathbf{a}}$: rank 1 along the angle-gradient.
Lagrangian
Maximise $\mathcal{L}(\mathbf{Q}) = (1-\lambda)\log_2\!\big(1 + \mathbf{h}^H\mathbf{Q}\mathbf{h}/\sigma^2\big) + \lambda\,\dot{\mathbf{a}}^H\mathbf{Q}\,\dot{\mathbf{a}} - \mu\big(\operatorname{tr}(\mathbf{Q}) - P\big)$ over $\mathbf{Q} \succeq 0$.
Communication extreme
At $\lambda = 0$, the objective depends only on $\mathbf{h}^H\mathbf{Q}\mathbf{h}$. By the rank-1 optimality of power-constrained quadratic forms, $\mathbf{Q}^\star = P\,\mathbf{h}\mathbf{h}^H/\|\mathbf{h}\|^2$. The transmitter puts all its power in the user's direction.
Sensing extreme
At $\lambda = 1$, only $\dot{\mathbf{a}}^H\mathbf{Q}\,\dot{\mathbf{a}}$ matters. Again rank-1 is optimal: $\mathbf{Q}^\star = P\,\dot{\mathbf{a}}\dot{\mathbf{a}}^H/\|\dot{\mathbf{a}}\|^2$. Power is steered along the angle-gradient direction, which is orthogonal to the main-lobe direction and maximises the slope of the beampattern (optimal for angular discrimination).
Interior of the tradeoff
For $0 < \lambda < 1$, the optimal $\mathbf{Q}$ has rank up to $2$ (a mixture of the two rank-1 solutions). Sweeping $\lambda$ from $0$ to $1$ traces out the Pareto boundary, morphing smoothly from the communication beam toward the sensing beam.
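The Pareto path can be traced by sweeping the mixture weight between the two rank-1 extremes. A sketch with random draws standing in for $\mathbf{h}$ and $\dot{\mathbf{a}}$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, P, sigma2 = 8, 1.0, 0.1
h = rng.normal(size=N) + 1j * rng.normal(size=N)        # comm channel
adot = rng.normal(size=N) + 1j * rng.normal(size=N)     # angle-gradient direction

Qc = P * np.outer(h, h.conj()) / np.vdot(h, h).real                 # comm extreme
Qs = P * np.outer(adot, adot.conj()) / np.vdot(adot, adot).real     # sensing extreme

rates, fis = [], []
for lam in np.linspace(0.0, 1.0, 11):
    Q = (1 - lam) * Qc + lam * Qs                        # rank <= 2 mixture
    rates.append(np.log2(1 + (h.conj() @ Q @ h).real / sigma2))
    fis.append((adot.conj() @ Q @ adot).real)            # proportional to angular FI
rates, fis = np.array(rates), np.array(fis)
```

As expected, the rate is monotonically traded against the angular Fisher information along the sweep.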
ex-ch24-12
Medium. Show that for a scalar Gaussian channel $Y = \sqrt{\gamma}\,X + N$ with $X$ having variance $\sigma_X^2$ (arbitrary distribution), $\mathrm{mmse}(\gamma) \le \frac{\sigma_X^2}{1+\gamma\sigma_X^2}$: the MMSE of any input is at most the Gaussian MMSE at the same power. Conclude that $I(X;Y) \le \tfrac{1}{2}\ln(1+\gamma\sigma_X^2)$ and identify this as the Gaussian-input upper bound.
The LMMSE estimator of $X$ from $Y$ gives an upper bound on the conditional-mean MMSE, since the LMMSE is suboptimal in general.
Integrate the MMSE bound over $\gamma$ and use I-MMSE to get an upper bound on $I(X;Y)$.
LMMSE as upper bound
The linear MMSE estimator $\hat{X} = \frac{\sqrt{\gamma}\,\sigma_X^2}{1+\gamma\sigma_X^2}\,Y$ achieves error variance $\frac{\sigma_X^2}{1+\gamma\sigma_X^2}$ for any input with variance $\sigma_X^2$. The posterior-mean (MMSE) estimator is optimal, so its error variance is at most the LMMSE error variance: $\mathrm{mmse}(\gamma) \le \frac{\sigma_X^2}{1+\gamma\sigma_X^2}$.
Integrate via I-MMSE
$I(X;Y) = \frac{1}{2}\int_0^\gamma \mathrm{mmse}(s)\,ds \le \frac{1}{2}\int_0^\gamma \frac{\sigma_X^2}{1+s\sigma_X^2}\,ds = \frac{1}{2}\ln\big(1+\gamma\sigma_X^2\big).$
Gaussian-input is the maximiser
Equality holds iff $\mathrm{mmse}(s) = \frac{\sigma_X^2}{1+s\sigma_X^2}$ for all $s \le \gamma$, which by the uniqueness of the LMMSE-attaining posterior mean implies $X$ is jointly Gaussian with $Y$, hence $X$ itself is Gaussian. This is the I-MMSE proof of the statement that Gaussian inputs maximise mutual information under a power constraint, a slick alternative to the standard Shannon-Gelfand-Yaglom derivation.
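The pointwise dominance $\mathrm{mmse}(\gamma) \le \sigma_X^2/(1+\gamma\sigma_X^2)$ can be spot-checked for a concrete non-Gaussian input, here unit-variance BPSK via Gauss-Hermite quadrature. A sketch:

```python
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(100)

def mmse_bpsk(g):
    # unit-variance BPSK input on Y = sqrt(g) X + N, conditioning on X = +1
    y = np.sqrt(g) + np.sqrt(2.0) * nodes
    return 1.0 - float(np.sum(weights * np.tanh(np.sqrt(g) * y) ** 2)) / np.sqrt(np.pi)

gs = np.linspace(0.05, 8.0, 160)
gauss = 1.0 / (1.0 + gs)                     # Gaussian mmse at the same unit power
bpsk = np.array([mmse_bpsk(g) for g in gs])
gap = gauss - bpsk                           # should be nonnegative everywhere
```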
ex-ch24-13
Medium. In a single-target ISAC setting, the FIM for $(\tau, \theta, \nu)$ (delay, angle, Doppler) has a specific block structure: diagonal entries scale with the squared RMS bandwidth, squared array aperture, and squared coherent duration respectively, while cross-terms arise from waveform time-frequency-space coupling. Describe qualitatively (one sentence each) how each of the following waveform choices affects the three CRBs and their cross-terms: (a) narrowband single-tone (no bandwidth), (b) random OFDM symbol (spread bandwidth, random symbols), (c) up-chirp LFM (deterministic time-frequency coupling).
Delay precision needs bandwidth; Doppler needs duration; angle needs array aperture.
Cross-terms reflect how the waveform structure couples two or more parameters.
Narrowband single-tone
No bandwidth means zero delay information: the delay diagonal of the FIM is zero (infinite delay CRB) and delay-angle / delay-Doppler cross-terms vanish. Angle and Doppler precision remain fine. This is the classic "Doppler-only" radar mode.
Random OFDM symbol
Spread bandwidth gives good delay precision; random symbols decorrelate the waveform across subcarriers, killing cross-terms (the FIM becomes nearly diagonal). This is the ideal for joint estimation: each parameter is decoupled from the others and the CRBs on all three are small.
Up-chirp LFM
Good delay and Doppler precision (wideband, long-duration), but the linear time-frequency coupling produces a strong delay-Doppler cross-term: a shift in $\tau$ produces the same first-order change in the template as a shift in $\nu$, so the two are ambiguous. Marginal CRBs look fine; joint CRBs (on linear combinations of $\tau$ and $\nu$) are excellent only along one diagonal of the $(\tau,\nu)$ plane and poor along the other. This is the classic LFM range-Doppler coupling.
ex-ch24-14
Hard. Consider the phase estimation problem $y_k = \sin\theta + w_k$, $w_k \sim \mathcal{N}(0,\sigma^2)$, $k = 1,\dots,n$, with $\theta$ uniform on $[-\pi,\pi)$ (wrapped: treat the support as the circle). The CRLB at fixed $\theta$ is $\sigma^2/(n\cos^2\theta)$, which diverges at $\theta = \pm\pi/2$. Use the Ziv-Zakai bound (in its translation form on the circle) to produce a finite bound that does not blow up at these points.
The binary error depends on the signal separation $|\sin(\theta+h)-\sin\theta|$, but on the circle the relevant error distance is the wrapped offset $h$ itself.
Marginalise over the uniform circular prior; the overlap factor $(A-h)/A$ of the linear case is simply $1$ for the circle.
The valley-filled binary error is bounded by $1/2$, so the ZZB is capped on the order of the circular variance $\pi^2/3$.
Binary error on the circle
For a uniform $\theta$ and $n$ observations, the minimum binary error probability between $\theta$ and $\theta+h$ under the Bayes detector is $P_e(\theta,\theta+h) = Q\!\big(\tfrac{\sqrt{n}}{2\sigma}\,|\sin(\theta+h)-\sin\theta|\big)$. After averaging over uniform $\theta$, this is a deterministic function $\bar{P}_{\min}(h)$ of $h$.
ZZB integrand
With the uniform-prior form, $\mathrm{MSE} \ge \int_0^{\pi} h\,\mathcal{V}\{\bar{P}_{\min}(h)\}\,dh$ (using the half-interval because of the $\pm h$ symmetry of the circle). Valley-filling makes the integrand non-increasing; at offsets $h$ for which some $\theta$ leaves $\sin$ nearly unchanged (near $\theta = \pm\pi/2$) the raw $P_e$ is near $1/2$, and valley-filling propagates this plateau across the integrand.
Bound is finite
Because $\bar{P}_{\min}(h) \le 1/2$ for all $h$, the ZZB is bounded by $\tfrac{1}{2}\int_0^\pi h\,dh = \pi^2/4$, on the order of the circular prior variance $\pi^2/3$. The CRLB singularity at $\theta = \pm\pi/2$ reflects that a pointwise estimator is hopeless near these angles (a tiny jitter wraps around), but the averaged ZZB is bounded because the prior averages over the bad regions. This is precisely why the ZZB is the standard bound for circular / phase / angle estimation with singular CRLBs.
Engineering takeaway
Any estimator of a circular parameter that quotes only the CRLB is misleading at the singular points. Radar and positioning engineers dealing with angle-of-arrival near endfire ($\theta = \pm\pi/2$) always use the ZZB for an honest accuracy bound.
ex-ch24-15
Hard. Using I-MMSE, prove the MMSE monotonicity property: for the Gaussian channel $Y = \sqrt{\gamma}\,X + N$, $\mathrm{mmse}(\gamma)$ is a strictly decreasing, convex function of $\gamma$ for any non-degenerate input $X$.
Write $I(\gamma) = \tfrac{1}{2}\int_0^\gamma \mathrm{mmse}(s)\,ds$.
By the data-processing inequality, $I(\gamma)$ is non-decreasing in $\gamma$; this gives the sign of $\mathrm{mmse}$ indirectly. But to show strict decrease and convexity, examine $d\,\mathrm{mmse}/d\gamma$ directly using Stein-type identities.
Guo-Shamai-Verdu show $\frac{d\,\mathrm{mmse}}{d\gamma} = -\mathbb{E}\big[\operatorname{Var}(X\mid Y)^2\big] \le 0$, with strict inequality unless $X$ is degenerate.
Monotonicity
By I-MMSE, $\mathrm{mmse}(\gamma) = 2\,dI/d\gamma$. The mutual information is non-decreasing in $\gamma$ (more SNR cannot remove information), and strictly increasing whenever $X$ is non-degenerate. Hence $\mathrm{mmse}(\gamma) > 0$ for all finite $\gamma$. This is consistent with, but does not directly show, monotonicity of $\mathrm{mmse}$.
Derivative identity
Guo, Shamai and Verdu (2005, Theorem 2) prove the higher-order identity $\frac{d\,\mathrm{mmse}}{d\gamma} = -\mathbb{E}\big[\operatorname{Var}(X\mid Y)^2\big].$ The right-hand side is non-positive, and zero iff $\operatorname{Var}(X\mid Y) = 0$ almost surely, which requires $X$ to be a deterministic function of $Y$: impossible for non-degenerate $X$ and finite $\gamma$. Hence $d\,\mathrm{mmse}/d\gamma < 0$: $\mathrm{mmse}$ is strictly decreasing.
Convexity
A similar computation (Guo-Shamai-Verdu, Theorem 3) expresses $d^2\,\mathrm{mmse}/d\gamma^2$ as the expectation of a non-negative bracket of higher posterior moments. Therefore $\mathrm{mmse}(\gamma)$ is convex in $\gamma$.
Geometric interpretation
Up to the factor $2$, $\mathrm{mmse}(\gamma)$ is the slope of $I(\gamma)$. Strict positivity and strict decrease of $\mathrm{mmse}$ say that $I(\gamma)$ is increasing and strictly concave, consistent with the well-known concavity of mutual information in SNR for any fixed input distribution. I-MMSE gives a clean, estimation-theoretic proof of a long-standing information-theoretic property.
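Both properties can be spot-checked numerically: the derivative identity in closed form for the Gaussian input (where $\operatorname{Var}(X\mid Y)$ is deterministic), and decrease/convexity on the sampled BPSK curve. A sketch:

```python
import numpy as np

# (i) Gaussian input: dmmse/dg = -E[Var(X|Y)^2] with mmse = 1/(1+g)
g0, d = 1.7, 1e-5
lhs = (1 / (1 + g0 + d) - 1 / (1 + g0 - d)) / (2 * d)  # finite-difference dmmse/dg
rhs = -1.0 / (1 + g0) ** 2                             # Var(X|Y) is deterministic here

# (ii) BPSK input: strict decrease and convexity of the sampled mmse curve
nodes, weights = np.polynomial.hermite.hermgauss(120)

def mmse_bpsk(g):
    y = np.sqrt(g) + np.sqrt(2.0) * nodes
    return 1.0 - float(np.sum(weights * np.tanh(np.sqrt(g) * y) ** 2)) / np.sqrt(np.pi)

gs = np.linspace(0.05, 6.0, 120)
m = np.array([mmse_bpsk(g) for g in gs])
first = np.diff(m)        # expected strictly negative
second = np.diff(m, 2)    # expected nonnegative (convexity)
```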
ex-ch24-16
Hard. Consider joint TOA-AOA estimation with a wideband waveform over a ULA. The parameter is $\boldsymbol\eta = (\tau, \theta)$ with diagonal Fisher information $\mathbf{I}_F = \operatorname{diag}(I_\tau, I_\theta)$. Assuming a uniform prior on a box in $(\tau,\theta)$, derive the vector Van Trees bound and compare to the trace of the inverse Fisher. Discuss when the prior term dominates.
Compute $\mathbf{J}_P$ for a box-like smoothed prior (or treat it as the sum of two scalar prior informations).
For a Gaussian-smoothed box of half-width $\Delta$, $J_P \approx 3/\Delta^2$ (a Gaussian matched to the box variance $\Delta^2/3$) when the box is tight enough.
Compare $\operatorname{tr}\big((\bar{\mathbf{I}}_F + \mathbf{J}_P)^{-1}\big)$ with $\operatorname{tr}(\mathbf{I}_F^{-1})$.
Assembling the Bayesian information
With a separable prior on $(\tau,\theta)$ approximated by independent Gaussians of variances $(\sigma_\tau^2, \sigma_\theta^2)$, $\mathbf{J}_P = \operatorname{diag}(1/\sigma_\tau^2,\, 1/\sigma_\theta^2)$. The data-averaged Fisher information is $\bar{\mathbf{I}}_F = \operatorname{diag}(I_\tau, I_\theta)$ (assuming negligible cross-terms for a wideband uncorrelated waveform). Thus $\mathbb{E}\big[(\hat{\boldsymbol\eta}-\boldsymbol\eta)(\hat{\boldsymbol\eta}-\boldsymbol\eta)^T\big] \succeq \operatorname{diag}\!\Big(\big(I_\tau + \sigma_\tau^{-2}\big)^{-1},\; \big(I_\theta + \sigma_\theta^{-2}\big)^{-1}\Big).$
Trace of the inverse
The Van Trees trace $\big(I_\tau + \sigma_\tau^{-2}\big)^{-1} + \big(I_\theta + \sigma_\theta^{-2}\big)^{-1}$ is always below $\operatorname{tr}(\mathbf{I}_F^{-1}) = 1/I_\tau + 1/I_\theta$, with the gap shrinking as the priors widen.
Prior-dominated regime
The prior dominates a coordinate when its prior precision exceeds the data Fisher information, i.e., when $\sigma_\tau^{-2} \gg I_\tau$ (and similarly for $\theta$). For a tight delay prior of half-width $\Delta_\tau$ (say, a previous ranging fix accurate to one tenth of the pulse width), the delay Van Trees bound is dominated by the prior whenever $I_\tau \ll 3/\Delta_\tau^2$, i.e., at low SNR. This is the coasting regime in which the tracker is "believing the prior more than the data."
Engineering interpretation
For 5G NR positioning with large bandwidth and small array, the delay dimension enters the prior-dominated regime long before the angle dimension. This explains why network-based tracking (with history-informed priors) beats single-shot CRLB predictions by large factors at low SINR.
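The trace comparison and the prior-dominated regime are one-liners to verify numerically. A sketch with illustrative Fisher informations and prior variances:

```python
import numpy as np

I_tau, I_theta = 1.0e4, 4.0e2          # data Fisher informations (illustrative)

def traces(s_tau2, s_theta2):
    # Van Trees trace vs. trace of the inverse Fisher
    vt = 1.0 / (I_tau + 1.0 / s_tau2) + 1.0 / (I_theta + 1.0 / s_theta2)
    crlb = 1.0 / I_tau + 1.0 / I_theta
    return vt, crlb

vt_wide, crlb = traces(1.0e-2, 1.0)      # broad priors: VT close to the CRLB trace
vt_tight, _ = traces(1.0e-6, 1.0)        # tight delay prior: prior-dominated in tau
delay_term = 1.0 / (I_tau + 1.0e6)       # approx sigma_tau^2 when prior dominates
```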
ex-ch24-17
Hard. A deterministic waveform with sample covariance $\mathbf{Q} = P\,\mathbf{h}\mathbf{h}^H/\|\mathbf{h}\|^2$ (communication-optimal, rank-1) gives a rate $R = \log_2\!\big(1 + P\|\mathbf{h}\|^2/\sigma^2\big)$ but zero sensing information along $\dot{\mathbf{a}}$ whenever $\dot{\mathbf{a}} \perp \mathbf{h}$. Propose a rank-2 transmit covariance that achieves (i) the same rate and (ii) non-zero angular Fisher information. What is the price?
To preserve the rate, the projection of $\mathbf{Q}$ onto $\mathbf{h}$ must still carry full power along $\mathbf{h}$.
A rank-2 $\mathbf{Q}$ with a small fraction $\varepsilon$ of power along $\dot{\mathbf{a}}$ trades rate for sensing.
To achieve the same rate, one needs a rank-2 $\mathbf{Q}$ that preserves $\mathbf{h}^H\mathbf{Q}\mathbf{h} = P\|\mathbf{h}\|^2$, which is impossible in general if $\operatorname{tr}(\mathbf{Q}) \le P$.
The constraint analysis
Preserving the rate requires $\mathbf{h}^H\mathbf{Q}\mathbf{h} = P\|\mathbf{h}\|^2$. The power constraint requires $\operatorname{tr}(\mathbf{Q}) \le P$. Cauchy-Schwarz gives $\mathbf{h}^H\mathbf{Q}\mathbf{h} \le \|\mathbf{h}\|^2\operatorname{tr}(\mathbf{Q}) \le P\|\mathbf{h}\|^2$, with equality iff $\mathbf{Q}$ is rank-1 aligned with $\mathbf{h}$. So achieving (i) exactly forces rank-1 along $\mathbf{h}$: one cannot get extra sensing for free.
The rank-2 tradeoff
One must accept a rate loss. Let $\mathbf{Q} = (1-\varepsilon)P\,\frac{\mathbf{h}\mathbf{h}^H}{\|\mathbf{h}\|^2} + \varepsilon P\,\frac{\dot{\mathbf{a}}\dot{\mathbf{a}}^H}{\|\dot{\mathbf{a}}\|^2}$ with $0 < \varepsilon < 1$. Then $R(\varepsilon) = \log_2\!\big(1 + (1-\varepsilon)P\|\mathbf{h}\|^2/\sigma^2\big)$ (assuming orthogonality $\dot{\mathbf{a}} \perp \mathbf{h}$) and the angular Fisher information is proportional to $\varepsilon P\|\dot{\mathbf{a}}\|^2$.
The price
A fractional rate loss of $R(0) - R(\varepsilon)$ buys angular Fisher information proportional to $\varepsilon$. At high SNR the loss tends to $\log_2\frac{1}{1-\varepsilon}$ bits/symbol, so even a small $\varepsilon$ buys significant sensing while costing little rate: the standard "free lunch at the Pareto frontier" result.
Connection to ISAC design
Commercial ISAC systems typically set $\varepsilon \approx 0.1$-$0.3$ in high-SNR regimes: keep most power in the data direction, allocate 10-30% to the sensing direction, and accept a ~0.5 bit/symbol rate loss for non-degenerate target estimation.
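The rate-loss arithmetic is easy to tabulate: at high SNR the loss approaches $\log_2\frac{1}{1-\varepsilon}$, about half a bit at $\varepsilon = 0.3$. A sketch with an illustrative SNR:

```python
import numpy as np

snr = 100.0                                  # P ||h||^2 / sigma^2 (high SNR)
eps = np.linspace(0.0, 0.5, 51)              # fraction of power steered to a-dot
rate = np.log2(1.0 + (1.0 - eps) * snr)
loss = rate[0] - rate                        # rate paid for sensing
fisher = eps                                 # angular FI proportional to eps (normalised)
hi_snr = np.log2(1.0 / (1.0 - eps[1:]))      # high-SNR approximation of the loss
loss_30 = loss[30]                           # eps = 0.30
```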
ex-ch24-18
Hard. Consider a toy ISAC capacity-distortion problem in the style of Kobayashi-Caire-Kramer: a memoryless channel $p(y|x,s)$ with i.i.d. state $S$, and the transmitter observing causal feedback from which it estimates $S$. The achievable rate-distortion region is characterised by $C(D) = \max\{\,I(X;Y|S) : \mathbb{E}[d(S,\hat{S})] \le D\,\}$, the maximum over input distributions meeting the estimation-distortion constraint. Explain why the Pareto tradeoff between rate and distortion in this joint formulation can differ from a naive CRB-rate tradeoff, and give an intuition for when the two coincide.
In the joint formulation, the input distribution $p(x)$ is optimised for both communication and estimation; in the CRB-rate tradeoff, only the covariance of $X$ matters for sensing.
The joint bound is tight ($I(X;Y|S)$ exploits the receiver's knowledge of $S$); the CRB is a local, pointwise bound that ignores distribution-level structure.
They coincide when the distortion is squared error, the channel is Gaussian, and the prior on $S$ is matched to the CRB regime.
Information-theoretic vs. estimation-theoretic view
The Kobayashi-Caire-Kramer formulation maximises the joint information-distortion tradeoff: the input distribution controls both the rate $I(X;Y|S)$ and the achievable distortion $D$. Both depend on the full distribution of $X$, not just its second moment. In contrast, the Fisher-information-based rate-CRB tradeoff depends on the input only through its sample covariance $\mathbf{Q}$.
Where they differ
Any non-Gaussian input achieves the same CRB as a Gaussian input with the same covariance (the CRB is a second-order functional), but different non-Gaussian inputs achieve different rates $I(X;Y|S)$ and different distortions $D$. So the CRB-rate frontier is strictly below the full information-distortion frontier whenever non-Gaussian inputs can do better: for instance, when the channel benefits from discrete modulation or when the distortion function is non-MSE.
Where they coincide
They coincide asymptotically when: (i) the distortion is squared error, (ii) the channel is Gaussian with Gaussian state, (iii) the SNR is high enough that Gaussian input is near-optimal for communication, and (iv) the state prior is broad enough that the CRB is tight (no ambiguity regime). Under these conditions, both frameworks reduce to the same quadratic Pareto tradeoff.
Research takeaway
Kobayashi-Caire-Kramer (2018) and Xiong-Liu-Cui-Yuan-Han-Caire (2023) use the information-theoretic formulation to characterise the fundamental capacity-distortion region for ISAC. The CRB-rate region is a second-order surrogate that is correct in the high-SNR Gaussian regime and misleading outside it. Research aimed at discrete-modulation ISAC designs or non-MSE distortion must use the joint bound.