Ferkans — Interactive Telecom Tutor

ex-ris-ch06-01

Easy

Show that the unit-modulus QCQP $\max_{|\tilde{\phi}_n| = 1} \boldsymbol{\tilde{\phi}}^H \mathbf{A} \boldsymbol{\tilde{\phi}}$ with rank-1 PSD $\mathbf{A} = \mathbf{c}\mathbf{c}^H$ has the closed-form solution $\tilde{\phi}_n^\star = e^{j\arg c_n}$ and objective value $\|\mathbf{c}\|_1^2$ .

Hint

Use the triangle inequality.

Solution

Bound

$\boldsymbol{\tilde{\phi}}^H \mathbf{A} \boldsymbol{\tilde{\phi}} = |\mathbf{c}^H \boldsymbol{\tilde{\phi}}|^2 = |\sum_n c_n^* \tilde{\phi}_n|^2 \leq (\sum_n |c_n|)^2 = \|\mathbf{c}\|_1^2$ .

Equality

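Holds iff all $c_n^* \tilde{\phi}_n$ have the same phase. Pick $\tilde{\phi}_n^\star = e^{j\arg c_n}$ : $c_n^* \tilde{\phi}_n^\star = |c_n| \geq 0$ , so every term is real and nonnegative and the bound is attained. Any common rotation $e^{j\theta}\boldsymbol{\tilde{\phi}}^\star$ is equally optimal. $\blacksquare$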
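A quick numerical sanity check of the closed form can be sketched as follows. The vector `c`, the dimension, and the random seed are arbitrary choices for illustration, not values from the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8  # illustrative size, not from the exercise
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)

def objective(phi, c):
    # phi^H (c c^H) phi = |c^H phi|^2; np.vdot conjugates its first argument
    return np.abs(np.vdot(c, phi)) ** 2

# Closed-form candidate: align every phase with the matching entry of c
phi_star = np.exp(1j * np.angle(c))
f_star = objective(phi_star, c)
l1_sq = np.sum(np.abs(c)) ** 2

# Random unit-modulus vectors should never exceed the bound ||c||_1^2
trials = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(2000, N)))
f_rand = np.abs(trials @ np.conj(c)) ** 2

print(f"closed form: {f_star:.6f}, ||c||_1^2: {l1_sq:.6f}, best random: {f_rand.max():.6f}")
```

The random search is only a spot check of the triangle-inequality bound; it does not replace the proof.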

ex-ris-ch06-02

Medium

Derive the SDR relaxation of the unit-modulus QCQP. Write both the primal SDP and its dual.

Hint

Lift $\boldsymbol{\tilde{\phi}}$ to $\mathbf{X} = \boldsymbol{\tilde{\phi}}\boldsymbol{\tilde{\phi}}^H$ .

Solution

Primal

$\max_{\mathbf{X} \succeq 0} \text{tr}(\mathbf{A}\mathbf{X})$ s.t. $[\mathbf{X}]_{nn} = 1$ for all $n$ .

Dual

$\min_{\boldsymbol{\lambda}} \sum_n \lambda_n$ s.t. $\text{diag}(\boldsymbol{\lambda}) - \mathbf{A} \succeq 0$ . (Lagrange multipliers $\lambda_n \in \mathbb{R}$ for the diagonal-equality constraints.)

Strong duality

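The primal is strictly feasible ( $\mathbf{X} = \mathbf{I} \succ 0$ has unit diagonal), and so is the dual (any $\lambda_n > \lambda_{\max}(\mathbf{A})$ gives $\text{diag}(\boldsymbol{\lambda}) - \mathbf{A} \succ 0$ ). Slater's condition therefore holds on both sides, so strong duality holds and the primal and dual optimal values coincide.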
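For readers who want the intermediate step, the dual can be obtained from the Lagrangian. The following is a short derivation sketch using the same multipliers $\lambda_n$ as above.

$L(\mathbf{X}, \boldsymbol{\lambda}) = \text{tr}(\mathbf{A}\mathbf{X}) + \sum_n \lambda_n (1 - [\mathbf{X}]_{nn}) = \sum_n \lambda_n + \text{tr}\big((\mathbf{A} - \text{diag}(\boldsymbol{\lambda}))\mathbf{X}\big)$ .

$\sup_{\mathbf{X} \succeq 0} L(\mathbf{X}, \boldsymbol{\lambda}) = \sum_n \lambda_n$ if $\text{diag}(\boldsymbol{\lambda}) - \mathbf{A} \succeq 0$ , and $+\infty$ otherwise.

Minimizing this over $\boldsymbol{\lambda}$ gives the dual SDP. Weak duality is visible directly: for any feasible pair, $\sum_n \lambda_n - \text{tr}(\mathbf{A}\mathbf{X}) = \text{tr}\big((\text{diag}(\boldsymbol{\lambda}) - \mathbf{A})\mathbf{X}\big) \geq 0$ , since the trace of a product of two PSD matrices is nonnegative.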

ex-ris-ch06-03

Medium

Compute the $(\pi/4)$ -approximation factor by computing $\mathbb{E}[|Z|^2 / \|Z\|^2]$ for $Z \sim \mathcal{CN}(\mathbf{0}, \mathbf{x}\mathbf{x}^H)$ where $\mathbf{x}$ is unit-modulus, $N = 1$ .

Hint

$Z = \sigma e^{j\theta}$ with $\sigma \sim \text{Rayleigh}$ and $\theta$ uniform.

Solution

Distribution of $Z$

For $\mathbf{X} = xx^* = 1$ (scalar case), $Z \sim \mathcal{CN}(0, 1)$ . $|Z|^2 \sim \text{Exp}(1)$ .

Approximation ratio

In the scalar case, the unit-modulus projection is $z / |z| = e^{j\arg z}$ , with objective $|x \cdot e^{j\arg z}|^2 = 1$ . The SDR bound is also 1. So the ratio is 1. The $\pi/4$ factor applies to higher-dimensional cases where projection loses correlation; the scalar case is tight.

Reference

See Zhang and Huang (2006) for the general derivation on the complex sphere. $\blacksquare$

ex-ris-ch06-04

Medium

Compute the SDR complexity for $N = 100$ , $N = 500$ , $N = 2000$ . Which are feasible on a modern desktop ( $10^{10}$ flops/s)?

Hint

Interior-point: $O(N^{6.5})$ .

Solution

Compute

$N = 100$ : $100^{6.5} = 10^{13}$ flops, i.e. $\sim 10^3$ s ≈ 17 min. Feasible offline. $N = 500$ : $500^{6.5} \approx 10^{17.54}$ flops, i.e. $\sim 10^{7.54}$ s ≈ 1.1 years. Infeasible. $N = 2000$ : $2000^{6.5} \approx 10^{21.46}$ flops, i.e. $\sim 10^{11.46}$ s ≈ 9000 years. Infeasible.

Interpretation

SDR's scaling is a hard wall. Above $N \sim 128$ , switch to manifold or element-wise. For research at $N \sim 100$ , SDR is usable but slow; for deployment at $N \geq 256$ , off-limits. $\blacksquare$
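The arithmetic can be reproduced with a few lines of Python. This sketch takes the hidden constant in $O(N^{6.5})$ to be 1, as the estimate above implicitly does; the function name and the year length are illustrative assumptions.

```python
import math

FLOPS_PER_SEC = 1e10                  # desktop throughput from the exercise
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def sdr_runtime_seconds(n, exponent=6.5, flops_per_sec=FLOPS_PER_SEC):
    # Assumes a unit constant in front of n**exponent
    return n ** exponent / flops_per_sec

for n in (100, 500, 2000):
    seconds = sdr_runtime_seconds(n)
    print(f"N={n:5d}: 10^{math.log10(seconds):.2f} s "
          f"= {seconds / 60:.3g} min = {seconds / SECONDS_PER_YEAR:.3g} years")
```

Real solver runtimes depend on the constant factor and on exploiting structure, so these figures are order-of-magnitude guides only.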

ex-ris-ch06-05

Medium

Compute the Riemannian gradient of $f(\boldsymbol{\phi}) = |\mathbf{c}^H \boldsymbol{\phi}|^2$ at $\boldsymbol{\phi} = e^{j\arg \mathbf{c}}$ (element-wise). Show it is zero.

Hint

At the optimum, the Euclidean gradient is parallel to $\boldsymbol{\phi}$ .

Solution

Euclidean gradient

$\nabla f = 2(\mathbf{c}^H \boldsymbol{\phi})\mathbf{c}$ . At $\boldsymbol{\phi} = e^{j\arg \mathbf{c}}$ : $\mathbf{c}^H \boldsymbol{\phi} = \sum_n c_n^* e^{j\arg c_n} = \sum_n |c_n| = \|\mathbf{c}\|_1$ . So $\nabla f = 2\|\mathbf{c}\|_1 \mathbf{c}$ .

Projection to tangent space

$\text{grad}_{\mathcal{M}} f = \nabla f - \Re(\boldsymbol{\phi}^* \odot \nabla f) \odot \boldsymbol{\phi}$ . Component-wise: $\phi_n^* \cdot [\nabla f]_n = e^{-j\arg c_n} \cdot 2\|\mathbf{c}\|_1 c_n = 2\|\mathbf{c}\|_1 |c_n|$ , real. So $\Re(\phi_n^* \cdot [\nabla f]_n) = 2\|\mathbf{c}\|_1 |c_n|$ , and $[\text{grad}_{\mathcal{M}} f]_n = 2\|\mathbf{c}\|_1 c_n - 2\|\mathbf{c}\|_1 |c_n| e^{j\arg c_n} = 0$ .

Conclusion

Riemannian gradient is zero → $\boldsymbol{\phi}^\star = e^{j\arg \mathbf{c}}$ is a stationary point, confirming the matched-filter optimum. $\blacksquare$
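The same computation can be checked numerically. In this sketch the vector `c` is random and the helper names are illustrative; the gradient convention matches the one used above, $\nabla f = 2(\mathbf{c}^H \boldsymbol{\phi})\mathbf{c}$ .

```python
import numpy as np

def euclidean_grad(phi, c):
    # Gradient of |c^H phi|^2 in the convention 2 (c^H phi) c
    return 2.0 * np.vdot(c, phi) * c

def riemannian_grad(phi, g):
    # Remove the component of g that is normal to the unit-modulus torus at phi
    return g - np.real(np.conj(phi) * g) * phi

rng = np.random.default_rng(1)
c = rng.standard_normal(6) + 1j * rng.standard_normal(6)

# At the matched-filter point the projected gradient should vanish
phi_opt = np.exp(1j * np.angle(c))
grad_opt = riemannian_grad(phi_opt, euclidean_grad(phi_opt, c))

# At a generic point it is nonzero but still tangent: Re(phi_n^* grad_n) = 0
phi_rand = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 6))
grad_rand = riemannian_grad(phi_rand, euclidean_grad(phi_rand, c))

print("norm at optimum:", np.linalg.norm(grad_opt))
print("norm at random point:", np.linalg.norm(grad_rand))
```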

ex-ris-ch06-06

Medium

Derive the element-wise update rule for the multi-user quadratic $f(\boldsymbol{\phi}) = \boldsymbol{\phi}^H \mathbf{A} \boldsymbol{\phi}$ with general PSD $\mathbf{A}$ .

Hint

Expand $f$ as a function of $\phi_n$ only.

Solution

Separate $\phi_n$

$f = \sum_{m,k} A_{mk} \phi_m^* \phi_k$ . Separate $\phi_n$ : $f = A_{nn} |\phi_n|^2 + 2\Re(\phi_n^* \sum_{m \neq n} A_{nm} \phi_m) + \text{const}$ , where the factor 2 uses $\mathbf{A} = \mathbf{A}^H$ (the terms with $k = n$ are the complex conjugates of the terms with $m = n$ ). Since $|\phi_n| = 1$ , $A_{nn}|\phi_n|^2 = A_{nn}$ (constant). The $\phi_n$ -dependent part is $2\Re(\phi_n^* \alpha_n)$ where $\alpha_n = \sum_{m \neq n} A_{nm} \phi_m$ .

Maximize

$\Re(\phi_n^* \alpha_n)$ is maximized at $\phi_n^\star = e^{j\arg(\alpha_n)}$ (same phase alignment argument as in Ex. 5.5).

Computational note

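Computing $\alpha_n$ is $O(N)$ per coordinate, so a full sweep is $O(N^2)$ . For low-rank $\mathbf{A} = \mathbf{C}\mathbf{C}^H$ with $\mathbf{C} \in \mathbb{C}^{N \times r}$ , caching the $r$ -vector $\mathbf{C}^H \boldsymbol{\phi}$ and updating it after each coordinate change reduces the cost to $O(r)$ per coordinate and $O(Nr)$ per sweep, where $r = \text{rank}(\mathbf{A})$ . $\blacksquare$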
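A minimal sketch of the resulting sweep, using the dense $O(N^2)$ form with $\alpha_n = \sum_{m \neq n} A_{nm} \phi_m$ . The test matrix, sizes, and function names are illustrative assumptions.

```python
import numpy as np

def objective(A, phi):
    # phi^H A phi is real for Hermitian A; np.vdot conjugates its first argument
    return float(np.real(np.vdot(phi, A @ phi)))

def bcd_sweep(A, phi):
    """One in-place sweep of phase-matching coordinate updates."""
    for n in range(len(phi)):
        # alpha_n = sum over m != n of A[n, m] * phi[m]
        alpha = A[n] @ phi - A[n, n] * phi[n]
        if np.abs(alpha) > 0:  # if alpha is zero, phi[n] does not affect f
            phi[n] = np.exp(1j * np.angle(alpha))
    return phi

rng = np.random.default_rng(2)
N, r = 16, 3  # illustrative sizes
C = rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r))
A = C @ C.conj().T  # random rank-r PSD matrix

phi = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N))
history = [objective(A, phi)]
for _ in range(20):
    phi = bcd_sweep(A, phi)
    history.append(objective(A, phi))

print("objective per sweep:", np.round(history[:6], 3))
```

Because each update maximizes $f$ over one coordinate with the others fixed, the recorded objective values never decrease from sweep to sweep.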

ex-ris-ch06-07

Hard

Construct an example where element-wise BCD converges to a strictly suboptimal local maximum (not the global maximum).

Hint

Check $N = 2$ first: there the objective depends only on the phase difference $\arg\phi_1 - \arg\phi_2$ , so multiple local maxima require more elements and a higher-rank $\mathbf{A}$ .

Solution

Construct $\mathbf{A}$

Start with $N = 2$ . $\mathbf{A} = \text{diag}(1, -1)$ has eigenvalues $\pm 1$ , so it is indefinite rather than PSD and does not qualify. A PSD candidate is $\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ , for which $f = 4 + 2\Re(\phi_1^* \phi_2)$ . Every fixed point of the coordinate update satisfies $\phi_1 = \phi_2$ , and the global maximum $f = 6$ (e.g. $\phi_1 = \phi_2 = 1$ ) is unique up to a common phase. Starting from $\boldsymbol{\phi}^{(0)} = [1, -1]$ , updating $\phi_2$ gives $\alpha_2 = A_{21}\phi_1 = 1$ , so $\phi_2 \leftarrow 1$ , which is already the global maximum. This case has no strict local optima.

A better example

Removing the coupling does not help either. With $A_{12} = A_{21} = 0$ , i.e. $\mathbf{A} = \text{diag}(a_1, a_2)$ , the coordinate updates do not interact: $f = a_1 + a_2$ for every unit-modulus $\boldsymbol{\phi}$ , so the update is trivial and any unit-modulus $\phi_n$ is optimal. With $N = 2$ the objective depends only on the phase difference, so a strictly suboptimal local maximum cannot occur.

Higher-rank non-convex case

Try $N = 4$ with $\mathbf{A}$ having off-diagonal terms whose phases conflict, so that aligning one pair of elements misaligns another. Random initializations can then land in different basins and converge to different local maxima. The full analysis uses the Hessian signature; see Yu et al. (2020) for examples with documented local-opt gaps. $\blacksquare$
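One way to look for such instances empirically is a multi-start experiment: run element-wise BCD from many random initial points and compare the final objective values. The sketch below does this for a randomly drawn rank-2 PSD matrix with $N = 4$ . The matrix, seed, and sweep count are illustrative assumptions, and a given random draw is not guaranteed to exhibit a gap.

```python
import numpy as np

def run_bcd(A, phi0, sweeps=200):
    """Element-wise phase-matching BCD from phi0; returns (phi, objective)."""
    phi = phi0.copy()
    for _ in range(sweeps):
        for n in range(len(phi)):
            alpha = A[n] @ phi - A[n, n] * phi[n]  # sum over m != n of A[n, m] phi[m]
            if np.abs(alpha) > 0:
                phi[n] = np.exp(1j * np.angle(alpha))
    return phi, float(np.real(np.vdot(phi, A @ phi)))

rng = np.random.default_rng(3)
N = 4
C = rng.standard_normal((N, 2)) + 1j * rng.standard_normal((N, 2))
A = C @ C.conj().T  # random rank-2 PSD test matrix (an assumption)

finals = []
for _ in range(50):
    phi0 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N))
    _, f_val = run_bcd(A, phi0)
    finals.append(f_val)
finals = np.array(finals)

spread = finals.max() - finals.min()
upper = N * np.linalg.eigvalsh(A)[-1]  # eigenvalue bound: f <= N * lambda_max(A)
print(f"best {finals.max():.6f}, worst {finals.min():.6f}, spread {spread:.2e}, bound {upper:.6f}")
```

A spread well above numerical tolerance indicates that different starts reached different limit points; a spread near zero means this particular instance is benign and another matrix should be tried.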

ex-ris-ch06-08

Medium

Why does Gaussian randomization with $L = 1000$ samples effectively achieve the $(\pi/4)$ -approximation guarantee?

Hint

Consider the CDF of the single-sample rounded objective.

Solution

Single-sample distribution

$f(\boldsymbol{\tilde{\phi}}_\ell)$ for a random $\mathbf{z}_\ell$ is a random variable with mean $\geq (\pi/4) f_{\text{SDR}}$ .

Max of $L$ samples

The best of $L$ i.i.d. samples is at least as good as any single sample, so the $(\pi/4)$ guarantee on the mean carries over to the expected maximum, and the maximum moves toward the upper end of the single-sample distribution as $L$ grows. For sub-Gaussian-like distributions, the expected maximum exceeds the mean by at most about $\sqrt{2 \ln L}$ standard deviations, which is roughly 3.7 at $L = 1000$ (about 3.2 for exactly Gaussian samples).

Practical factor

The empirical peak at $L = 1000$ is typically within $1$ - $5\%$ of $f_{\text{SDR}}$ , much better than the worst-case $\pi/4 \approx 78.5\%$ . The theoretical bound is a worst case; most problem instances deliver higher. $\blacksquare$
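The randomization step itself can be sketched as below. Since no SDP solver is involved here, the matrix `X` is a stand-in: a random PSD matrix with unit diagonal, not an actual SDR optimum, so the sketch shows the sampling and best-of- $L$ mechanics rather than the approximation ratio. All names are illustrative.

```python
import numpy as np

def gaussian_randomization(A, X, L, rng):
    """Draw L samples z ~ CN(0, X), round to unit modulus, return best and all values."""
    evals, evecs = np.linalg.eigh(X)
    root = evecs * np.sqrt(np.clip(evals, 0.0, None))  # root @ root^H equals X
    N = X.shape[0]
    w = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
    Z = root @ w                       # each column is one sample z_l
    Phi = np.exp(1j * np.angle(Z))     # entry-wise projection onto |phi_n| = 1
    values = np.real(np.sum(np.conj(Phi) * (A @ Phi), axis=0))  # phi_l^H A phi_l
    best = int(np.argmax(values))
    return Phi[:, best], values

rng = np.random.default_rng(4)
N = 8
C = rng.standard_normal((N, 3)) + 1j * rng.standard_normal((N, 3))
A = C @ C.conj().T

# Stand-in for the SDR solution: a random correlation-like matrix (assumption)
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
S = G @ G.conj().T
d = np.sqrt(np.real(np.diag(S)))
X = S / np.outer(d, d)

phi_best, values = gaussian_randomization(A, X, L=1000, rng=rng)
running_best = np.maximum.accumulate(values)
print("best after 1, 10, 100, 1000 samples:", running_best[[0, 9, 99, 999]])
```

With a genuine SDR solution in place of the stand-in, `running_best[-1]` would be the quantity compared against $f_{\text{SDR}}$ .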

ex-ris-ch06-09

Medium

Implement the element-wise BCD as pseudocode for a sum-rate multi-user problem where the objective is $f(\boldsymbol{\phi}) = \sum_k w_k \big( |h_{k,d}^H \mathbf{v}_{k} + \boldsymbol{\phi}^T \mathbf{b}_{k,k}|^2 - \sum_{j \neq k} |h_{k,d}^H \mathbf{v}_{j} + \boldsymbol{\phi}^T \mathbf{b}_{k,j}|^2 \big)$ with fixed weights $w_k$ .

Hint

The coordinate-wise function is still a quadratic in $\phi_n$ ; argmax is phase matching.

Solution

Coordinate function

For fixed $\boldsymbol{\phi}_{-n}$ , the dependence on $\phi_n$ is a sum of squared moduli; each $|h_d + \boldsymbol{\phi}^T \mathbf{b}|^2$ contributes a quadratic in $\phi_n$ .

Combine

The $\phi_n$ -dependent part collects to $A_n |\phi_n|^2 + 2\Re(\phi_n^* B_n) + \text{const}$ , where $A_n, B_n$ are computable from the current state. Under $|\phi_n| = 1$ , the $A_n$ term is constant, and $\phi_n^\star = e^{j\arg B_n}$ .

Write out $B_n$

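Cache the inner terms $g_{k,j} = h_{k,d}^H \mathbf{v}_j + \boldsymbol{\phi}^T \mathbf{b}_{k,j}$ and let $g_{k,j}^{(-n)} = g_{k,j} - \phi_n [\mathbf{b}_{k,j}]_n$ be the same term with coordinate $n$ removed. Then $B_n = \sum_k w_k \big( [\mathbf{b}_{k,k}]_n^* g_{k,k}^{(-n)} - \sum_{j \neq k} [\mathbf{b}_{k,j}]_n^* g_{k,j}^{(-n)} \big)$ : signal terms enter with a plus sign, interference terms with a minus sign. Each $(k, j)$ term is $O(1)$ per coordinate once the $g_{k,j}$ are cached, and there are $K^2$ such terms, so the cost is $O(K^2)$ per coordinate and $O(NK^2)$ per sweep. $\blacksquare$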
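Since the exercise asks for an implementation, here is one possible sketch in Python. It treats the objective generically as $\sum_t s_t |d_t + \boldsymbol{\phi}^T \mathbf{b}_t|^2$ with one term $t$ per $(k, j)$ pair, where $s_t = +w_k$ for signal terms and $s_t = -w_k$ for interference terms. The array layout, names, and random test data are assumptions made for illustration.

```python
import numpy as np

def objective(phi, d, Bmat, s):
    # f = sum_t s[t] * |d[t] + Bmat[t] . phi|^2  (phi^T b, no conjugation)
    return float(np.sum(s * np.abs(d + Bmat @ phi) ** 2))

def bcd_sweep(phi, d, Bmat, s):
    """One in-place sweep; g caches d[t] + phi^T b_t for every term t."""
    g = d + Bmat @ phi
    for n in range(len(phi)):
        g_minus = g - Bmat[:, n] * phi[n]        # drop coordinate n from every term
        B_n = np.sum(s * np.conj(Bmat[:, n]) * g_minus)
        if np.abs(B_n) > 0:
            phi[n] = np.exp(1j * np.angle(B_n))  # phase matching
        g = g_minus + Bmat[:, n] * phi[n]        # restore cache with the new phi[n]
    return phi

rng = np.random.default_rng(5)
K, N = 3, 12                                     # illustrative sizes
w = np.array([1.0, 0.5, 2.0])                    # illustrative weights
T = K * K                                        # one term per (k, j) pair
d = rng.standard_normal(T) + 1j * rng.standard_normal(T)
Bmat = rng.standard_normal((T, N)) + 1j * rng.standard_normal((T, N))
s = np.array([w[k] if j == k else -w[k] for k in range(K) for j in range(K)])

phi = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N))
history = [objective(phi, d, Bmat, s)]
for _ in range(15):
    phi = bcd_sweep(phi, d, Bmat, s)
    history.append(objective(phi, d, Bmat, s))

print("objective per sweep:", np.round(history[:5], 3))
```

The negative weights do not break the update: under $|\phi_n| = 1$ the quadratic term is constant regardless of its sign, so each coordinate step still maximizes $\Re(\phi_n^* B_n)$ .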

ex-ris-ch06-10

Medium

Show that the retraction $R_{\boldsymbol{\phi}}(\boldsymbol{\xi}) = (\boldsymbol{\phi} + \boldsymbol{\xi}) / |\boldsymbol{\phi} + \boldsymbol{\xi}|$ (element-wise) maps tangent vectors back to the complex unit torus.

Hint

Verify that the result has $|R_n| = 1$ for each $n$ .

Solution

Component-wise check

$[R_{\boldsymbol{\phi}}(\boldsymbol{\xi})]_n = (\phi_n + \xi_n) / |\phi_n + \xi_n|$ is a complex number divided by its own modulus, so $|[R_{\boldsymbol{\phi}}(\boldsymbol{\xi})]_n| = 1$ whenever $|\phi_n + \xi_n| > 0$ .

Existence check

$|\phi_n + \xi_n| = 0$ iff $\xi_n = -\phi_n$ . A tangent vector satisfies $\Re(\phi_n^* \xi_n) = 0$ , so $|\phi_n + \xi_n|^2 = 1 + |\xi_n|^2 \geq 1$ and this never happens. The retraction is therefore well-defined on the whole tangent bundle, with no bound on $\|\boldsymbol{\xi}\|$ required.

First-order accuracy

For small $\boldsymbol{\xi}$ , $R_{\boldsymbol{\phi}}(\boldsymbol{\xi}) \approx \boldsymbol{\phi} + P_{T_{\boldsymbol{\phi}}\mathcal{M}}\boldsymbol{\xi}$ — the tangent-space projection of $\boldsymbol{\xi}$ . This is the defining property of a first-order retraction. $\blacksquare$
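These properties are easy to confirm numerically. The sketch below uses a random point and a random tangent vector; the scale factor 50 and the test step `t` are arbitrary illustrative choices.

```python
import numpy as np

def project_tangent(phi, v):
    # Remove the radial component so that Re(phi_n^* xi_n) = 0 for every n
    return v - np.real(np.conj(phi) * v) * phi

def retract(phi, xi):
    y = phi + xi
    return y / np.abs(y)  # element-wise normalization

rng = np.random.default_rng(6)
N = 10
phi = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N))
v = rng.standard_normal(N) + 1j * rng.standard_normal(N)
xi_unit = project_tangent(phi, v)
xi_big = 50.0 * xi_unit                 # deliberately large tangent vector

moduli = np.abs(phi + xi_big)           # should equal sqrt(1 + |xi_n|^2) >= 1
retracted = retract(phi, xi_big)

# First-order accuracy: R(t xi) differs from phi + t xi only at order t^2
t = 1e-4
err = np.linalg.norm(retract(phi, t * xi_unit) - (phi + t * xi_unit))

print("min |phi + xi|:", moduli.min(), " first-order error:", err)
```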

ex-ris-ch06-11

Hard

Prove that Armijo backtracking on the manifold with step size halving guarantees monotone ascent: $f(R_{\boldsymbol{\phi}}(\alpha \boldsymbol{\eta})) > f(\boldsymbol{\phi})$ for some $\alpha > 0$ whenever $\boldsymbol{\eta} = \text{grad}_{\mathcal{M}} f \neq 0$ .

Hint

Use the first-order Taylor expansion of $f$ along the retraction curve.

Solution

Taylor expansion

$f(R_{\boldsymbol{\phi}}(\alpha \boldsymbol{\eta})) = f(\boldsymbol{\phi}) + \alpha \|\boldsymbol{\eta}\|^2 + O(\alpha^2)$ . (The Riemannian gradient defines the first-order change along the retraction curve, and the step is taken along $+\boldsymbol{\eta}$ because the goal is ascent.)

Sufficient increase

Pick $c \in (0, 1)$ . The Armijo condition asks $f(R_{\boldsymbol{\phi}}(\alpha \boldsymbol{\eta})) \geq f(\boldsymbol{\phi}) + c \alpha \|\boldsymbol{\eta}\|^2$ . By the Taylor expansion, the left side minus the right side is $(1 - c)\alpha \|\boldsymbol{\eta}\|^2 + O(\alpha^2)$ , which is positive for all small enough $\alpha$ because $(1 - c)\|\boldsymbol{\eta}\|^2 > 0$ .

Halving converges

If the Armijo condition fails at $\alpha$ , try $\alpha/2, \alpha/4, \ldots$ . Since the condition holds for all sufficiently small $\alpha > 0$ , halving terminates in finitely many steps, and the accepted step satisfies $f(R_{\boldsymbol{\phi}}(\alpha \boldsymbol{\eta})) \geq f(\boldsymbol{\phi}) + c \alpha \|\boldsymbol{\eta}\|^2 > f(\boldsymbol{\phi})$ . $\blacksquare$
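A minimal sketch of Armijo backtracking for ascent, stepping along $+\boldsymbol{\eta}$ and halving until the sufficient-increase test passes. The quadratic test objective $\boldsymbol{\phi}^H \mathbf{A} \boldsymbol{\phi}$ , the constant `c=1e-4`, the initial step, and the function names are assumptions for illustration.

```python
import numpy as np

def f(A, phi):
    return float(np.real(np.vdot(phi, A @ phi)))

def riemannian_grad(A, phi):
    g = 2.0 * (A @ phi)                       # Euclidean gradient of phi^H A phi
    return g - np.real(np.conj(phi) * g) * phi

def retract(phi, xi):
    y = phi + xi
    return y / np.abs(y)

def armijo_step(A, phi, alpha0=1.0, c=1e-4, max_halvings=60):
    """Return (new_phi, accepted_alpha); alpha is 0.0 if no step was accepted."""
    eta = riemannian_grad(A, phi)
    eta_sq = float(np.real(np.vdot(eta, eta)))
    f0, alpha = f(A, phi), alpha0
    for _ in range(max_halvings):
        cand = retract(phi, alpha * eta)
        if f(A, cand) >= f0 + c * alpha * eta_sq:  # sufficient increase
            return cand, alpha
        alpha *= 0.5
    return phi, 0.0

rng = np.random.default_rng(7)
N = 12
C = rng.standard_normal((N, 3)) + 1j * rng.standard_normal((N, 3))
A = C @ C.conj().T

phi = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, N))
values = [f(A, phi)]
for _ in range(50):
    phi, alpha = armijo_step(A, phi)
    values.append(f(A, phi))

print("start:", round(values[0], 4), "end:", round(values[-1], 4))
```

The `max_halvings` cap is a practical guard against floating-point stagnation near a stationary point; the proof above shows that in exact arithmetic the loop exits on its own whenever $\boldsymbol{\eta} \neq 0$ .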

ex-ris-ch06-12

Medium

Compare the number of iterations needed by SDR, manifold, and element-wise to converge to $\epsilon = 10^{-3}$ objective precision on a typical problem.

Hint

SDR: one solve. Manifold: ~100 iterations. Element-wise: ~5-10 sweeps.

Solution

SDR

One SDP solve returns the relaxed optimum to solver precision. Total: 1 "iteration" (but an expensive one).

Manifold

Linear convergence. If each iteration halved the gap, going from 100% gap to 0.1% gap would take $\log_2(1000) \approx 10$ iterations. In practice the per-iteration contraction factor is closer to 1, and 50-200 iterations are typical.

Element-wise

The first sweep usually captures most of the gain toward the local optimum, and subsequent sweeps converge geometrically fast. Typically 3-5 sweeps suffice. $\blacksquare$

ex-ris-ch06-13

Hard

Compare SDR and manifold quality when rank $(\mathbf{A}) = 2$ (e.g., two-user sum rate). Quantify the typical gap.

Hint

Empirical observation: manifold within 1% of SDR for rank 2.

Solution

Quality argument

Manifold converges to a local stationary point; SDR provides an upper bound on the global optimum. For rank 2, the SDR optimum is typically a rank-2 matrix $\mathbf{X}$ that is not directly feasible for the original problem, but Gaussian randomization with $L = 1000$ gets within $1\%$ of the bound.

Manifold quality

Manifold with random multi-start (5 trials) typically matches SDR-randomization. With warm-starting from the previous coherence block, even better: $< 0.1\%$ gap.

Conclusion

At rank 2, manifold is effectively as good as SDR at a fraction of the cost. The gap widens at rank $> 4$ but rarely exceeds $3\%$ in practice. $\blacksquare$

ex-ris-ch06-14

Challenge

Open-ended: Design a warm-starting protocol for manifold optimization across RIS coherence blocks that accounts for block-to-block channel correlation.

Hint

Use the previous $\boldsymbol{\phi}^\star$ as initial point; reduce step size; reduce iteration count.

Solution

Warm-start initialization

At coherence block $t$ , initialize $\boldsymbol{\phi}_t^{(0)} = \boldsymbol{\phi}_{t-1}^\star$ .

Adaptive parameters

Given the block-to-block relative channel change $\|\Delta \mathbf{h}_t\| / \|\mathbf{h}_t\|$ (an indicator of block-to-block correlation):

If small (slow motion): use 5 manifold iterations, small step size.
If large (fast motion): use 30 iterations, bigger step size, re-initialize from random.

Implementation

Monitor the objective improvement per iteration; if improvement slows below a threshold, exit. Typical speedup: $3$ - $5\times$ vs. cold-start.

Safeguards

If the channel changes suddenly (blockage event), the warm-started $\boldsymbol{\phi}$ may be far from the new optimum. Detect via sudden drop in $f$ ; fall back to cold-start random initialization. $\blacksquare$
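One possible shape for this protocol is sketched below. Everything specific is an assumption made for illustration: the single-user objective $|\mathbf{c}^H \boldsymbol{\phi}|^2$ , the thresholds `slow_threshold` and `drop_ratio`, the step scales, and the function names. Only the 5 and 30 iteration budgets and the cold-start fallback come from the description above.

```python
import numpy as np

def objective(c, phi):
    # Assumed single-user objective |c^H phi|^2
    return float(np.abs(np.vdot(c, phi)) ** 2)

def riemannian_grad(c, phi):
    g = 2.0 * np.vdot(c, phi) * c
    return g - np.real(np.conj(phi) * g) * phi

def plan_block(c_prev, c_curr, phi_prev, f_prev, rng, slow_threshold=0.1, drop_ratio=0.5):
    """Choose the initial point, iteration budget and step scale for the new block."""
    rel_change = np.linalg.norm(c_curr - c_prev) / np.linalg.norm(c_curr)
    blocked = objective(c_curr, phi_prev) < drop_ratio * f_prev  # sudden drop in f
    if rel_change < slow_threshold and not blocked:
        return phi_prev.copy(), 5, 0.5          # slow motion: warm start, small steps
    phi0 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, len(phi_prev)))
    return phi0, 30, 1.0                        # fast motion or blockage: cold start

def refine(c, phi, max_iters, step_scale, tol=1e-6):
    """Gradient ascent with retraction; exits early when the improvement stalls."""
    alpha = step_scale / (2.0 * np.sum(np.abs(c)) ** 2)
    f_curr = objective(c, phi)
    for _ in range(max_iters):
        cand = phi + alpha * riemannian_grad(c, phi)
        cand = cand / np.abs(cand)              # retraction to unit modulus
        f_new = objective(c, cand)
        if f_new <= f_curr:                     # reject the step and shrink it
            alpha *= 0.5
            continue
        gain, phi, f_curr = f_new - f_curr, cand, f_new
        if gain < tol * f_curr:
            break
    return phi, f_curr

rng = np.random.default_rng(8)
N = 16
c_prev = rng.standard_normal(N) + 1j * rng.standard_normal(N)
phi_prev = np.exp(1j * np.angle(c_prev))
f_prev = objective(c_prev, phi_prev)

# Slowly varying block: small perturbation of the previous channel
c_slow = c_prev + 0.01 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
phi0_slow, iters_slow, step_slow = plan_block(c_prev, c_slow, phi_prev, f_prev, rng)
phi_slow, f_slow = refine(c_slow, phi0_slow, iters_slow, step_slow)

# Abrupt change: an independent channel draw
c_fast = rng.standard_normal(N) + 1j * rng.standard_normal(N)
phi0_fast, iters_fast, step_fast = plan_block(c_prev, c_fast, phi_prev, f_prev, rng)
phi_fast, f_fast = refine(c_fast, phi0_fast, iters_fast, step_fast)

print("slow block iterations:", iters_slow, " fast block iterations:", iters_fast)
```

In a real system the thresholds would be tuned against measured coherence statistics, and the relative-change indicator would come from channel estimates rather than the true channel.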

ex-ris-ch06-15

Medium

For $N = 32, K = 4$ , $N_t = 8, \text{SNR} = 20\text{ dB}$ , estimate the achieved rate improvement from SDR vs. element-wise, in bits/s/Hz.

Hint

Typical gap at rank 4: $\sim 2$ - $3$ dB in SNR, $\sim 0.5$ - $1$ bit/s/Hz in rate.

Solution

Gap in SNR

Rank 4, moderate $N$ : expect $\sim 2$ dB SDR advantage over element-wise local optimum.

Gap in rate

$R = \log_2(1 + \text{SNR})$ at high SNR ≈ $\log_2 \text{SNR}$ . 2 dB of SNR is a factor of $10^{0.2} \approx 1.58$ . Rate difference: $\log_2(1.58) \approx 0.66$ bits/s/Hz per user.

Total

Per-user $\sim 0.66$ bit/s/Hz; $K = 4$ users → sum-rate advantage $\sim 2.6$ bits/s/Hz for SDR over element-wise. That is significant enough to justify SDR for research, but too small to justify the $1000\times$ compute cost in deployment, where element-wise with warm-start is the practical choice. $\blacksquare$
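The rate arithmetic can be reproduced directly. This sketch assumes the element-wise solution operates at the stated 20 dB per-user SNR and that SDR sits 2 dB above it; both the interpretation of the 20 dB figure and the variable names are assumptions.

```python
import math

def rate(snr_linear):
    # Shannon rate in bits/s/Hz
    return math.log2(1.0 + snr_linear)

snr_db, gap_db, K = 20.0, 2.0, 4
snr_elementwise = 10 ** (snr_db / 10)                  # 20 dB -> 100
snr_sdr = snr_elementwise * 10 ** (gap_db / 10)        # 2 dB higher

per_user_gap = rate(snr_sdr) - rate(snr_elementwise)   # exact, without the high-SNR approximation
sum_rate_gap = K * per_user_gap

print(f"per-user gap: {per_user_gap:.3f} bit/s/Hz, sum-rate gap: {sum_rate_gap:.2f} bit/s/Hz")
```

At 20 dB the exact difference is very close to the high-SNR estimate $\log_2(10^{0.2})$ , so the approximation used above is adequate here.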
