Exercises

ex-ch22-01

Easy

Compute the support $[(1-\sqrt\gamma)^2,(1+\sqrt\gamma)^2]$ of the Marchenko–Pastur distribution for $\gamma\in\{0.1,0.25,0.5,0.9\}$.
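A quick numerical sketch (not part of the exercise statement) that tabulates the endpoints:

```python
import math

def mp_support(gamma):
    """Support endpoints of the Marchenko-Pastur law for aspect ratio gamma."""
    return ((1 - math.sqrt(gamma)) ** 2, (1 + math.sqrt(gamma)) ** 2)

for g in (0.1, 0.25, 0.5, 0.9):
    lo, hi = mp_support(g)
    print(f"gamma={g}: [{lo:.4f}, {hi:.4f}]")
```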

ex-ch22-02

Easy

Derive the ridge estimator $\hat{\mathbf{x}}_{\text{ridge}}=(\mathbf{A}^T\mathbf{A}+\lambda\mathbf{I})^{-1}\mathbf{A}^T\mathbf{y}$ by differentiating the ridge objective and setting the gradient to zero.
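As a sanity check on the closed form (a sketch with randomly generated data, not a prescribed solution), the ridge solution can be compared against the equivalent augmented least-squares formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, lam = 20, 5, 0.5
A = rng.standard_normal((M, N))
y = rng.standard_normal(M)

# Closed-form ridge solution (A^T A + lam I)^{-1} A^T y
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(N), A.T @ y)

# Same minimiser via the augmented system [A; sqrt(lam) I] x ~ [y; 0]
A_aug = np.vstack([A, np.sqrt(lam) * np.eye(N)])
y_aug = np.concatenate([y, np.zeros(N)])
x_aug, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)

print(np.allclose(x_ridge, x_aug))  # the two forms agree
```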

ex-ch22-03

Easy

Compute the soft-thresholding operator $\eta_\theta(z)$ for $\theta=0.5$ and $z\in\{-1.2,-0.3,0,0.8,2.0\}$.
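The operator is a one-liner to implement, so the hand computation is easy to verify numerically (a sketch, assuming the standard definition $\eta_\theta(z)=\operatorname{sign}(z)\max(|z|-\theta,0)$):

```python
import math

def soft_threshold(z, theta):
    """Soft-thresholding: eta_theta(z) = sign(z) * max(|z| - theta, 0)."""
    return math.copysign(max(abs(z) - theta, 0.0), z)

for z in (-1.2, -0.3, 0.0, 0.8, 2.0):
    print(z, "->", soft_threshold(z, 0.5))
```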

ex-ch22-04

Easy

For $N=5$ and $\sigma^2=1$, compute the shrinkage factor $1-(N-2)\sigma^2/\|\mathbf{y}\|^2$ of the James–Stein estimator when $\|\mathbf{y}\|^2=10$. What happens when $\|\mathbf{y}\|^2=2$?
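The arithmetic can be checked in a few lines (a sketch; the second case is the one where the plain factor goes negative, motivating the positive-part variant):

```python
def js_factor(N, sigma2, norm2):
    """Shrinkage factor 1 - (N-2)*sigma^2 / ||y||^2 of the James-Stein estimator."""
    return 1.0 - (N - 2) * sigma2 / norm2

print(js_factor(5, 1.0, 10.0))  # 0.7
print(js_factor(5, 1.0, 2.0))   # -0.5: negative, so the estimate flips sign
```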

ex-ch22-05

Easy

State what "inadmissible" means in two sentences, without using formulas.

ex-ch22-06

Medium

Show that the LMMSE estimator for the model $\mathbf{y}=\mathbf{A}\mathbf{x}+\mathbf{w}$ with $\mathbf{x}\sim\mathcal{N}(\mathbf{0},\sigma_x^2\mathbf{I})$, $\mathbf{w}\sim\mathcal{N}(\mathbf{0},\sigma^2\mathbf{I})$ coincides with the ridge estimator at $\lambda=\sigma^2/\sigma_x^2$.
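A numerical sketch of the identity to be proved (random data, assumed parameter values), comparing the LMMSE formula in its Gram form with the ridge formula:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 30, 6
sigma_x2, sigma2 = 2.0, 0.5
A = rng.standard_normal((M, N))
y = rng.standard_normal(M)

# LMMSE: sigma_x^2 A^T (sigma_x^2 A A^T + sigma^2 I)^{-1} y
x_lmmse = sigma_x2 * A.T @ np.linalg.solve(sigma_x2 * A @ A.T + sigma2 * np.eye(M), y)

# Ridge with lambda = sigma^2 / sigma_x^2 (push-through identity)
lam = sigma2 / sigma_x2
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(N), A.T @ y)

print(np.allclose(x_lmmse, x_ridge))  # identical up to numerical error
```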

ex-ch22-07

Medium

Derive the KKT conditions for the LASSO and interpret them coordinate-by-coordinate.

ex-ch22-08

Medium

Compute the divergence $\nabla\cdot(\mathbf{y}/\|\mathbf{y}\|^2)$ and verify that it equals $(N-2)/\|\mathbf{y}\|^2$ for $\mathbf{y}\neq\mathbf{0}$.
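Before (or after) doing the calculus, the identity can be spot-checked by central finite differences at an arbitrary test point (a sketch; the vector `y` below is an assumed example):

```python
import numpy as np

def divergence_numeric(y, eps=1e-6):
    """Central finite-difference estimate of div(y / ||y||^2)."""
    total = 0.0
    for i in range(len(y)):
        yp, ym = y.copy(), y.copy()
        yp[i] += eps
        ym[i] -= eps
        # i-th component of the field: y_i / ||y||^2
        f = lambda v: v[i] / np.dot(v, v)
        total += (f(yp) - f(ym)) / (2 * eps)
    return total

y = np.array([1.0, -2.0, 0.5, 3.0])
N = len(y)
print(divergence_numeric(y), (N - 2) / np.dot(y, y))  # should match closely
```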

ex-ch22-09

Medium

Given $\mathbf{y}\sim\mathcal{N}(\boldsymbol{\theta},\sigma^2\mathbf{I}_N)$ and the positive-part JS estimator $\hat{\boldsymbol{\theta}}_{\text{JS}+}=\max(0,1-(N-2)\sigma^2/\|\mathbf{y}\|^2)\,\mathbf{y}$, argue (informally) why it should dominate the ordinary JS estimator.

ex-ch22-10

Medium

Prove that the Bayes estimator under squared-error loss is the posterior mean $\hat\theta_{\text{Bayes}}(y)=\mathbb{E}[\theta\mid Y=y]$.

ex-ch22-11

Medium

Verify the minimax = maximin duality for the scalar Gaussian mean problem $Y\sim\mathcal{N}(\theta,1)$, $\theta\in\mathbb{R}$. Specifically, compute both sides and check equality.

ex-ch22-12

Medium

Derive the optimal linear shrinkage for the sample covariance matrix: $\hat{\boldsymbol{\Sigma}}_{\alpha}=(1-\alpha)\hat{\boldsymbol{\Sigma}}+\alpha\mathbf{I}$. Find the $\alpha^*$ that minimises $\mathbb{E}\|\hat{\boldsymbol{\Sigma}}_\alpha-\boldsymbol{\Sigma}\|_F^2$.
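A Monte-Carlo sketch (with a hypothetical diagonal $\boldsymbol{\Sigma}$, chosen only for illustration) of the kind of check the derived $\alpha^*$ should pass: for an unbiased $\hat{\boldsymbol{\Sigma}}$, the oracle weight takes the form $\alpha^*=b^2/(b^2+\|\boldsymbol{\Sigma}-\mathbf{I}\|_F^2)$ with $b^2=\mathbb{E}\|\hat{\boldsymbol{\Sigma}}-\boldsymbol{\Sigma}\|_F^2$, and it should reduce the Frobenius risk:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, trials = 5, 20, 2000
Sigma = np.diag([3.0, 2.0, 1.5, 1.0, 0.5])  # hypothetical true covariance

def risk(alpha):
    """Monte-Carlo estimate of E|| (1-a) S + a I - Sigma ||_F^2."""
    total = 0.0
    for _ in range(trials):
        X = rng.multivariate_normal(np.zeros(N), Sigma, size=M)
        S = X.T @ X / M                      # sample covariance (known zero mean)
        D = (1 - alpha) * S + alpha * np.eye(N) - Sigma
        total += np.sum(D**2)
    return total / trials

# Oracle weight: alpha* = b^2 / (b^2 + ||Sigma - I||_F^2), b^2 = E||S - Sigma||_F^2
b2 = risk(0.0)
alpha_star = b2 / (b2 + np.sum((Sigma - np.eye(N))**2))
print(alpha_star, risk(alpha_star), b2)  # shrunk risk should fall below b2
```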

ex-ch22-13

Hard

Using the Marchenko–Pastur law, show that $\frac{1}{N}\mathbb{E}\,\mathrm{tr}\big(\tfrac1M\mathbf{A}^T\mathbf{A}+\lambda\mathbf{I}\big)^{-1}$ converges to the Stieltjes transform $m(-\lambda)=\int\tfrac{1}{\mu+\lambda}f_\gamma(\mu)\,d\mu$, and derive a closed-form expression for $m(-\lambda)$ as a function of $\gamma$ and $\lambda$.
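Once a closed form is derived, it can be tested against a single large random matrix (a sketch; the formula below is one form the derivation yields via the MP self-consistency equation $\gamma z m^2-(1-\gamma-z)m+1=0$ at $z=-\lambda$, so treat it as a candidate answer to check rather than the stated solution):

```python
import numpy as np

def mp_stieltjes(gamma, lam):
    """Candidate closed form:
    m(-lam) = [sqrt((1-gamma+lam)^2 + 4*gamma*lam) - (1-gamma+lam)] / (2*gamma*lam)."""
    u = 1.0 - gamma + lam
    return (np.sqrt(u * u + 4.0 * gamma * lam) - u) / (2.0 * gamma * lam)

# Empirical check with one large Gaussian matrix
rng = np.random.default_rng(3)
M, gamma, lam = 2000, 0.5, 1.0
N = int(gamma * M)
A = rng.standard_normal((M, N))
empirical = np.trace(np.linalg.inv(A.T @ A / M + lam * np.eye(N))) / N
print(empirical, mp_stieltjes(gamma, lam))  # should agree to ~1e-2
```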

ex-ch22-14

Hard

Prove the upper bound of the minimax-rate theorem for sparse estimation: best-subset selection achieves MSE $\lesssim\sigma^2 s\log(N/s)/M$.

ex-ch22-15

Hard

Derive the risk of the James–Stein estimator when shrinking toward an arbitrary fixed vector $\boldsymbol{\mu}_0$ rather than zero. Conclude that the JS phenomenon is independent of the anchor.
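A Monte-Carlo sketch of the anchored estimator (the true mean and anchor below are arbitrary assumed examples): shrinking $\mathbf{y}$ toward $\boldsymbol{\mu}_0$ should still beat the MLE in average squared error for $N\ge 3$.

```python
import numpy as np

rng = np.random.default_rng(4)
N, sigma2, trials = 10, 1.0, 5000
theta = rng.standard_normal(N)   # arbitrary true mean (assumed example)
mu0 = np.full(N, 2.0)            # fixed anchor, deliberately far from theta

mse_mle, mse_js = 0.0, 0.0
for _ in range(trials):
    y = theta + rng.standard_normal(N)
    d = y - mu0
    shrink = 1.0 - (N - 2) * sigma2 / np.dot(d, d)
    theta_js = mu0 + shrink * d  # JS shrinking toward mu0 instead of zero
    mse_mle += np.sum((y - theta) ** 2)
    mse_js += np.sum((theta_js - theta) ** 2)

print(mse_js / trials, mse_mle / trials)  # anchored JS risk below MLE risk
```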

ex-ch22-16

Hard

Show that LASSO with penalty $\lambda=2\sigma\sqrt{2\log N/M}$ satisfies the minimax rate for sparse recovery. (Upper bound only.)

ex-ch22-17

Challenge

Derive the fixed-point equation for the asymptotic MSE of LASSO in the proportional regime (Bayati–Montanari state evolution). Reproduce the statement: there exist $\tau^*$ and $\alpha^*$ such that the LASSO MSE equals $\tau^{*2}-\sigma^2$ at the fixed point.

ex-ch22-18

Challenge

For a massive-MIMO uplink with $N_t=128$ antennas, $K=8$ single-antenna users, and $M$ pilot symbols, design a shrinkage-based channel estimator that minimises the worst-case MSE over users with bounded channel norm $\|\mathbf{h}_k\|\leq B$. Compare with LMMSE assuming a mismatched prior.