Exercises
ex26-01-gaussian-psd
Easy. Show that the covariance parameterisation $\Sigma = R S S^\top R^\top$ always produces a positive semi-definite matrix, regardless of the rotation $R$ and the diagonal scale matrix $S$.
Hint: Consider $v^\top \Sigma v$ for arbitrary $v \in \mathbb{R}^3$.
Expand the quadratic form
$$v^\top \Sigma v = v^\top R S S^\top R^\top v = (S^\top R^\top v)^\top (S^\top R^\top v) = \|S^\top R^\top v\|^2.$$
Conclude
Since $\|S^\top R^\top v\|^2 \ge 0$ for any vector $v$, the matrix $\Sigma = R S S^\top R^\top$ is PSD for any rotation $R$ and any diagonal scale $S$. It is strictly positive definite when all diagonal entries of $S$ are nonzero. $\blacksquare$
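The conclusion is easy to verify numerically. A minimal sketch (NumPy; the random rotation and scale values are illustrative, not part of the exercise):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
R = Q if np.linalg.det(Q) > 0 else -Q         # make it a proper rotation
S = np.diag(rng.uniform(0.1, 2.0, size=3))    # positive diagonal scales
Sigma = R @ S @ S.T @ R.T
eigs = np.linalg.eigvalsh(Sigma)
# every eigenvalue is nonnegative: Sigma is PSD (here strictly PD,
# since all diagonal entries of S are nonzero)
```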
ex26-02-alpha-compositing
Easy. Consider three Gaussians with opacities $\alpha_1, \alpha_2, \alpha_3$ and features $c_1, c_2, c_3$, all evaluated at a pixel where $G_i(x) = 1$ for all $i$ (the pixel is at the centre of each Gaussian). Compute the rendered value $\hat{C}$ using alpha compositing.
Hint: Apply the front-to-back compositing formula $\hat{C} = \sum_i T_i \alpha_i c_i$.
Hint: Compute the effective opacity $\alpha_i G_i(x)$ for each $i$; since $G_i(x) = 1$, it is simply $\alpha_i$.
Compute transmittances
- $T_1 = 1$ (no preceding Gaussians)
- $T_2 = 1 - \alpha_1$
- $T_3 = (1 - \alpha_1)(1 - \alpha_2)$
Compute the rendered value
$$\hat{C} = \sum_{i=1}^{3} T_i \alpha_i c_i = \alpha_1 c_1 + (1 - \alpha_1)\alpha_2 c_2 + (1 - \alpha_1)(1 - \alpha_2)\alpha_3 c_3. \quad\blacksquare$$
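A concrete instance can be stepped through in code. The opacity and feature values below are assumed for illustration (the exercise leaves them symbolic):

```python
import numpy as np

# assumed example values: G_i(x) = 1 at the pixel centre
alphas = np.array([0.5, 0.5, 0.5])
feats  = np.array([1.0, 2.0, 3.0])

T, C, Ts = 1.0, 0.0, []
for a, c in zip(alphas, feats):
    Ts.append(T)
    C += T * a * c      # front-to-back: C += T_i * alpha_i * c_i
    T *= (1.0 - a)      # T_{i+1} = T_i * (1 - alpha_i)

# transmittances: T_1 = 1, T_2 = 0.5, T_3 = 0.25
# C = 0.5*1 + 0.5*0.5*2 + 0.25*0.5*3 = 1.375
```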
ex26-03-quaternion
Easy. A 3D Gaussian has its rotation stored as the unit quaternion $q = (w, x, y, z) = (1, 0, 0, 0)$. What rotation matrix does this correspond to? What happens to the Gaussian's shape?
Hint: The identity quaternion corresponds to zero rotation.
Convert quaternion to rotation matrix
The quaternion $q = (1, 0, 0, 0)$ gives $R = I$ (the identity matrix), since:
$$R = \begin{pmatrix} 1 - 2(y^2 + z^2) & 2(xy - wz) & 2(xz + wy) \\ 2(xy + wz) & 1 - 2(x^2 + z^2) & 2(yz - wx) \\ 2(xz - wy) & 2(yz + wx) & 1 - 2(x^2 + y^2) \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
with $w = 1$ and $x = y = z = 0$.
Interpret
With $R = I$, the covariance becomes $\Sigma = S S^\top = \mathrm{diag}(s_1^2, s_2^2, s_3^2)$. The Gaussian is axis-aligned with semi-axes $s_1, s_2, s_3$ along the coordinate axes. No rotation is applied.
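The standard quaternion-to-matrix conversion can be sketched directly; evaluating it at the identity quaternion confirms $R = I$:

```python
import numpy as np

def quat_to_rot(w, x, y, z):
    # unit quaternion (w, x, y, z) -> 3x3 rotation matrix
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

R = quat_to_rot(1.0, 0.0, 0.0, 0.0)  # identity quaternion
# R is exactly the identity, so Sigma = R S S^T R^T = S S^T stays diagonal
```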
ex26-04-db-loss
Easy. A measurement location has true received power $P$ dBm. Two models predict $P + \Delta$ dBm and $P - \Delta$ dBm. Compute the MSE loss in both dB scale and linear scale for each prediction, and explain why dB-scale loss is preferred.
Hint: Convert dBm to mW: $p\,[\text{mW}] = 10^{P_{\text{dBm}}/10}$.
dB-scale loss
- Model 1: $\big((P + \Delta) - P\big)^2 = \Delta^2$ dB$^2$
- Model 2: $\big((P - \Delta) - P\big)^2 = \Delta^2$ dB$^2$
Both predictions have equal error in dB scale.
Linear-scale loss
Converting: $p = 10^{P/10}$ mW, $p_1 = p \cdot 10^{\Delta/10}$ mW, $p_2 = p \cdot 10^{-\Delta/10}$ mW.
- Model 1: $(p_1 - p)^2 = p^2\,(10^{\Delta/10} - 1)^2$
- Model 2: $(p_2 - p)^2 = p^2\,(1 - 10^{-\Delta/10})^2$
Since $10^{\Delta/10} - 1 > 1 - 10^{-\Delta/10}$ for $\Delta > 0$, Model 2 has the smaller loss in linear scale, despite having the same absolute error in dB. Linear-scale loss is dominated by high-power regions and effectively ignores errors at low power, whereas dB-scale loss weights relative errors uniformly across the dynamic range.
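The asymmetry is easy to see numerically. The values below (true power $-60$ dBm, predictions $\pm 10$ dB off) are assumed for illustration:

```python
# assumed illustrative numbers: true -60 dBm, models predict -50 and -70 dBm
P_true, P1, P2 = -60.0, -50.0, -70.0

db_loss_1 = (P1 - P_true) ** 2   # 100 dB^2
db_loss_2 = (P2 - P_true) ** 2   # 100 dB^2

to_mw = lambda p_dbm: 10 ** (p_dbm / 10)
lin_loss_1 = (to_mw(P1) - to_mw(P_true)) ** 2
lin_loss_2 = (to_mw(P2) - to_mw(P_true)) ** 2
# equal dB losses, but the overpredicting model's linear loss is 100x larger
```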
ex26-05-2d-projection
Medium. Derive the 2D projected covariance $\Sigma'$ of a 3D Gaussian with mean $\mu$ and covariance $\Sigma$ under a perspective camera with projection map $\pi$. Show that the projection of an anisotropic 3D Gaussian is an anisotropic 2D Gaussian (ellipse) on the image plane.
Hint: Use the Jacobian of the projection at the Gaussian centre.
Hint: The projected distribution is obtained by marginalising over depth.
Linearise the projection
Let $\pi : \mathbb{R}^3 \to \mathbb{R}^2$, $\pi(x, y, z) = (f_x x / z,\ f_y y / z)$, be the perspective projection. The Jacobian at the camera-space centre $t = (t_x, t_y, t_z)$ is
$$J = \begin{pmatrix} f_x / t_z & 0 & -f_x t_x / t_z^2 \\ 0 & f_y / t_z & -f_y t_y / t_z^2 \end{pmatrix}.$$
Transform the covariance
Under the linear approximation (EWA splatting), the projected covariance is:
$$\Sigma' = J W \Sigma W^\top J^\top,$$
where $W$ is the world-to-camera rotation.
Result is a 2D Gaussian
Since $\Sigma$ is PSD and $J W$ is a $2 \times 3$ matrix with rank 2 (for non-degenerate cameras), $\Sigma' = (JW)\,\Sigma\,(JW)^\top$ is a $2 \times 2$ PSD matrix. The projected density is therefore a 2D Gaussian with elliptical level sets, centred at $\pi(\mu)$.
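The EWA transform can be checked numerically: project an arbitrary PSD 3D covariance and confirm the result is a symmetric PSD $2 \times 2$ matrix. The focal lengths, centre, and covariance below are illustrative assumptions:

```python
import numpy as np

def perspective_jacobian(t, fx=500.0, fy=500.0):
    # Jacobian of pi(x,y,z) = (fx*x/z, fy*y/z) at camera-space point t
    tx, ty, tz = t
    return np.array([
        [fx / tz, 0.0,     -fx * tx / tz**2],
        [0.0,     fy / tz, -fy * ty / tz**2],
    ])

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
Sigma = A @ A.T                                  # arbitrary PSD 3D covariance
W, _ = np.linalg.qr(rng.normal(size=(3, 3)))     # orthogonal world-to-camera matrix
J = perspective_jacobian(t=(0.3, -0.2, 4.0))
Sigma2d = J @ W @ Sigma @ W.T @ J.T              # EWA projected covariance
# Sigma2d is symmetric PSD -> elliptical level sets on the image plane
```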
ex26-06-sh-isotropic
Medium. Show that a spherical harmonic expansion truncated at order $L = 0$ yields an isotropic (omnidirectional) scatterer, and compute the minimum order needed to represent a specular reflector with beamwidth $\theta_b$.
Hint: $Y_0^0$ is constant over the sphere.
Hint: The angular resolution of degree-$L$ SH is $\sim 180^\circ / L$.
L=0 case
For $L = 0$: $f(\theta, \phi) = c_{00}\, Y_0^0 = c_{00} / \sqrt{4\pi}$, which is constant over all directions. This is an omnidirectional (isotropic) scatterer.
Minimum L for specular pattern
The angular resolution of spherical harmonics of degree $L$ is approximately $180^\circ / L$. To represent a specular reflection with beamwidth $\theta_b$, we need:
$$\frac{180^\circ}{L} \lesssim \theta_b \quad\Longrightarrow\quad L \gtrsim \frac{180^\circ}{\theta_b}.$$
For example, a specular reflector with beamwidth $\theta_b = 10^\circ$ requires $L \gtrsim 18$. In practice, a much lower order suffices for typical RF scatterers because RF scattering patterns are broader than optical ones.
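The resolution rule translates into a one-line helper (a sketch of the $L \gtrsim 180^\circ/\theta_b$ bound derived above):

```python
import math

def min_sh_order(beamwidth_deg):
    # minimum SH degree from the resolution argument: L >= 180 / theta_b
    return math.ceil(180.0 / beamwidth_deg)
```

For instance, `min_sh_order(10)` gives 18, while a broad $90^\circ$ lobe needs only degree 2.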
ex26-07-densification
Medium. In the 3DGS adaptive density control, a Gaussian is cloned when the positional gradient exceeds a threshold: $\|\nabla_{\mu} \mathcal{L}\| > \tau_{\text{pos}}$. Explain geometrically what a large positional gradient indicates about the reconstruction quality at that location, and why cloning (adding a new Gaussian nearby) is the appropriate response.
Hint: The positional gradient points in the direction that would reduce the loss most.
Hint: A large gradient means the current Gaussian position is suboptimal.
Interpret the gradient
The positional gradient indicates the direction in which moving the Gaussian centre would most reduce the rendering loss. A large magnitude means the Gaussian is being "pulled" strongly --- it needs to cover more area than its current position and shape allow.
Why cloning helps
If a single Gaussian cannot adequately represent the local scene structure (e.g., a corner, an edge, or a transition between materials), duplicating it allows two Gaussians to share the responsibility. One stays near the current position; the clone shifts toward the gradient direction. After further optimisation, the two Gaussians settle into positions that jointly cover the under-reconstructed region.
Splitting vs cloning
Cloning is used for small Gaussians in under-represented regions; splitting is used for large Gaussians covering too much area. The distinction is controlled by a scale threshold on the Gaussian's extent.
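The clone/split rule amounts to a small decision function. The threshold values below are illustrative assumptions, not values from the text:

```python
def densify_action(grad_norm, scale, tau_pos=2e-4, tau_scale=0.01):
    # Illustrative sketch of adaptive density control (thresholds assumed).
    # Large positional gradient -> the Gaussian is under-reconstructing:
    #   small Gaussian -> clone (duplicate, shift along the gradient)
    #   large Gaussian -> split (replace by two smaller ones)
    if grad_norm <= tau_pos:
        return "keep"
    return "split" if scale > tau_scale else "clone"
```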
ex26-08-measurement-density
Medium. For an RF-3DGS reconstruction at $f = 7$ GHz, compute: (a) the wavelength $\lambda$, (b) the recommended measurement spacing ($\lambda/2$), (c) the number of measurements needed for a $10 \times 10$ m room. Discuss practical feasibility.
Hint: $\lambda = c / f$, where $c = 3 \times 10^8$ m/s.
Compute wavelength
$$\lambda = \frac{c}{f} = \frac{3 \times 10^8}{7 \times 10^9} \approx 4.3\ \text{cm}.$$
Measurement spacing
Recommended spacing: $\lambda / 2 \approx 2.1$ cm.
Number of measurements
For a $10 \times 10$ m area at 2.1 cm spacing: $(10 / 0.021)^2 \approx 477^2 \approx 2.3 \times 10^5$ measurements.
Feasibility
Over 200,000 measurements is impractical for manual or even robotic data collection. This highlights why visual priors (RFCanvas) or coarser sampling with interpolation are essential; the problem only worsens at mmWave frequencies, where the required spacing shrinks further. Alternatively, phased-array beam scanning can acquire directional measurements more efficiently.
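The arithmetic generalises to any carrier and room size. A small helper (assuming a carrier near 7 GHz, consistent with the 2.1 cm spacing quoted above):

```python
def measurement_count(freq_hz, room_m, c=3e8):
    # lambda/2 spacing; grid points per side = floor(room / spacing) + 1
    lam = c / freq_hz
    spacing = lam / 2
    per_side = int(room_m / spacing) + 1
    return spacing, per_side ** 2

spacing, n = measurement_count(7e9, 10.0)
# spacing ~ 2.1 cm; n exceeds 200,000 for a 10 m x 10 m room
```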
ex26-09-visual-prior
Medium. In the RFCanvas framework, the geometric parameters are frozen during RF fine-tuning. Compute the ratio of frozen to total parameters for a scene with $N = 1000$ Gaussians and SH order $L = 2$.
Hint: Geometric parameters per Gaussian: 3 (position) + 4 (quaternion) + 3 (scale) = 10.
Hint: RF parameters per Gaussian: 1 (opacity) + $(L+1)^2$ (SH coefficients) = ?
Count parameters
- Geometric per Gaussian: $3 + 4 + 3 = 10$
- RF per Gaussian: $1 + (2+1)^2 = 10$
- Total per Gaussian: $20$
- Total scene: $1000 \times 20 = 20{,}000$
- Frozen (geometric): $1000 \times 10 = 10{,}000$
Ratio
Frozen/total $= 10{,}000 / 20{,}000 = 0.5$.
Half the parameters are frozen, meaning the RF optimisation operates in a 10,000-dimensional subspace. With 20 measurements, this is still severely under-determined ($20 \ll 10{,}000$), which is why additional regularisation (smoothness, material-based priors) is important.
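The count can be packaged as a function (a sketch assuming the parameter layout above: 10 geometric parameters plus opacity and scalar SH features per Gaussian):

```python
def rfcanvas_param_split(n_gaussians, sh_order):
    geom = 3 + 4 + 3                 # position + quaternion + scale
    rf = 1 + (sh_order + 1) ** 2     # opacity + SH coefficients
    total = n_gaussians * (geom + rf)
    frozen = n_gaussians * geom
    return frozen, total

frozen, total = rfcanvas_param_split(1000, sh_order=2)
# frozen = 10,000 of 20,000 parameters -> ratio 0.5
```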
ex26-10-gradient-splatting
Hard. Derive the gradient of the rendering loss $\mathcal{L} = (\hat{C} - C_{\text{target}})^2$ with respect to the opacity $\alpha_k$ of the $k$-th Gaussian, where $\hat{C} = \sum_i T_i \alpha_i c_i$ is the alpha-compositing output. Show that the gradient depends on all Gaussians closer to the camera through the transmittance $T_k$.
Hint: Write $T_i = \prod_{j < i}(1 - \alpha_j)$ and note that $\partial T_i / \partial \alpha_k = -T_i / (1 - \alpha_k)$ for $i > k$.
Hint: The gradient has a direct term (from the $k$-th summand) and indirect terms (from $T_i$ for $i > k$).
Direct contribution
The $k$-th Gaussian contributes $T_k \alpha_k c_k$ to $\hat{C}$. The direct gradient is:
$$\frac{\partial (T_k \alpha_k c_k)}{\partial \alpha_k} = T_k c_k.$$
Indirect contributions
For $i > k$, the transmittance depends on $\alpha_k$: $T_i = (1 - \alpha_k) \prod_{j < i,\, j \ne k}(1 - \alpha_j)$, so $\partial T_i / \partial \alpha_k = -T_i / (1 - \alpha_k)$. This gives:
$$\sum_{i > k} \frac{\partial T_i}{\partial \alpha_k}\, \alpha_i c_i = -\frac{1}{1 - \alpha_k} \sum_{i > k} T_i \alpha_i c_i.$$
Total gradient
$$\frac{\partial \hat{C}}{\partial \alpha_k} = T_k c_k - \frac{1}{1 - \alpha_k} \sum_{i > k} T_i \alpha_i c_i, \qquad \frac{\partial \mathcal{L}}{\partial \alpha_k} = 2\,(\hat{C} - C_{\text{target}})\, \frac{\partial \hat{C}}{\partial \alpha_k}.$$
The direct term carries the factor $T_k = \prod_{j < k}(1 - \alpha_j)$, so the gradient depends on the opacities of all Gaussians closer to the camera. $\blacksquare$
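The derivation can be validated against a finite-difference gradient. A self-contained sketch (the opacity and feature values are arbitrary test data):

```python
import numpy as np

def render(alphas, colors):
    # front-to-back compositing: C = sum_i T_i * alpha_i * c_i
    T, C = 1.0, 0.0
    for a, c in zip(alphas, colors):
        C += T * a * c
        T *= 1.0 - a
    return C

def dC_dalpha(alphas, colors, k):
    # analytic gradient: T_k c_k - (1/(1-a_k)) * sum_{i>k} T_i a_i c_i
    T = np.concatenate(([1.0], np.cumprod(1.0 - np.asarray(alphas)[:-1])))
    tail = sum(T[i] * alphas[i] * colors[i] for i in range(k + 1, len(alphas)))
    return T[k] * colors[k] - tail / (1.0 - alphas[k])

alphas = [0.3, 0.5, 0.2, 0.4]
colors = [1.0, 0.5, 2.0, -1.0]
k, h = 1, 1e-6
ap = list(alphas); ap[k] += h
am = list(alphas); am[k] -= h
numeric = (render(ap, colors) - render(am, colors)) / (2 * h)
# numeric central difference matches the analytic expression
```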
ex26-11-coherent-radar
Hard. Show that for two Gaussians at ranges $r_1$ and $r_2$ with $|r_1 - r_2| \ll c/(2B)$ (within the same range cell), the coherent radar return is:
$$P_{\text{coh}} = \left| a_1 e^{-j 4\pi r_1 / \lambda} + a_2 e^{-j 4\pi r_2 / \lambda} \right|^2 = a_1^2 + a_2^2 + 2 a_1 a_2 \cos\!\left(\frac{4\pi (r_1 - r_2)}{\lambda}\right),$$
and that the incoherent approximation $a_1^2 + a_2^2$ can differ by up to a factor of 4 (6 dB).
Hint: Write $s_i = a_i e^{-j 4\pi r_i / \lambda}$ and expand $|s_1 + s_2|^2$.
Coherent sum
$$|s_1 + s_2|^2 = (s_1 + s_2)(s_1 + s_2)^* = a_1^2 + a_2^2 + 2 a_1 a_2 \cos\!\left(\frac{4\pi (r_1 - r_2)}{\lambda}\right).$$
Extreme cases
- Constructive ($\cos = +1$): $P = (a_1 + a_2)^2$.
- Destructive ($\cos = -1$): $P = (a_1 - a_2)^2$.
- Incoherent: $P = a_1^2 + a_2^2$.
For $a_1 = a_2 = a$: constructive gives $4a^2$, incoherent gives $2a^2$, a ratio of 2 (3 dB); relative to a single scatterer's return $a^2$, the coherent peak $4a^2$ is a factor of 4 (6 dB) higher. Over the full range of phase differences, the constructive/destructive ratio is unbounded: destructive interference cancels the return completely.
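The phase-dependent swing is easy to demonstrate numerically (a sketch with arbitrary unit amplitudes and a 10 cm wavelength):

```python
import numpy as np

def coherent_power(a1, a2, r1, r2, lam):
    # two-scatterer return with the two-way phase 4*pi*r/lambda
    s = a1 * np.exp(-1j * 4 * np.pi * r1 / lam) \
      + a2 * np.exp(-1j * 4 * np.pi * r2 / lam)
    return abs(s) ** 2

a, lam = 1.0, 0.1
p_con = coherent_power(a, a, 0.0, 0.0, lam)      # in phase -> 4 a^2
p_des = coherent_power(a, a, 0.0, lam / 4, lam)  # pi out of phase -> 0
p_inc = a**2 + a**2                              # incoherent -> 2 a^2
```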
ex26-12-kronecker-gaussian
Hard. Show that when the sensing operator has Kronecker structure $A = A_f \otimes A_s$, the Gaussian splatting forward pass can be decomposed into separate spatial and frequency splatting operations, reducing the computational cost from $O(N_G K M)$ to $O(N_G (K + M))$, where $N_G$ is the number of Gaussians, $K$ the number of subcarriers, and $M$ the number of spatial samples.
Hint: The Kronecker structure means the measurement at subcarrier $k$ and spatial sample $m$ factorises.
Hint: Each Gaussian contributes independently along the spatial and frequency dimensions.
Kronecker forward model
The measurement at subcarrier $k$ and spatial position $m$ is:
$$y_{k,m} = \sum_{i,j} [A_f]_{k,i}\, [A_s]_{m,j}\, X_{i,j},$$
where $X$ is the scene arranged on a frequency $\times$ space grid.
Gaussian representation
Representing $X$ via Gaussians with separable frequency and spatial profiles, $X_{i,j} = \sum_g c_g\, \phi_g^{(f)}(i)\, \phi_g^{(s)}(j)$, the measurement becomes:
$$y_{k,m} = \sum_g c_g \left( \sum_i [A_f]_{k,i}\, \phi_g^{(f)}(i) \right) \left( \sum_j [A_s]_{m,j}\, \phi_g^{(s)}(j) \right),$$
i.e., each Gaussian is splatted once onto the frequency axis and once onto the spatial axis.
Cost reduction
The frequency splatting requires $O(N_G K)$ operations and the spatial splatting $O(N_G M)$, compared to $O(N_G K M)$ for the joint operation. The total cost is $O(N_G (K + M))$ rather than $O(N_G K M)$.
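The factorisation can be verified on a small random instance: splatting each Gaussian once per axis reproduces the joint Kronecker computation exactly. All dimensions and profiles below are arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(0)
G, K, M, I, J = 4, 5, 6, 7, 8   # Gaussians, subcarriers, spatial samples, grid sizes
Af = rng.normal(size=(K, I))    # frequency sensing operator
As = rng.normal(size=(M, J))    # spatial sensing operator
c  = rng.normal(size=G)         # Gaussian amplitudes
pf = rng.normal(size=(G, I))    # per-Gaussian frequency profiles
ps = rng.normal(size=(G, J))    # per-Gaussian spatial profiles

# joint: form X_{i,j} = sum_g c_g pf_g(i) ps_g(j), then apply both operators
X = np.einsum('g,gi,gj->ij', c, pf, ps)
Y_joint = Af @ X @ As.T

# separable: splat each Gaussian once per axis, then combine
F = pf @ Af.T                   # (G, K) frequency splats
S = ps @ As.T                   # (G, M) spatial splats
Y_sep = np.einsum('g,gk,gm->km', c, F, S)
```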
ex26-13-pruning
Hard. Consider a 3DGS scene with $N$ Gaussians after training. Suppose we prune all Gaussians with $\alpha_i < \epsilon$. Derive an upper bound on the rendering error introduced by pruning, in terms of $\epsilon$, the number of pruned Gaussians $N_p$, and the maximum feature magnitude $c_{\max}$.
Hint: Each pruned Gaussian contributes at most $\epsilon\, c_{\max}$ to any pixel.
Hint: At most $N$ Gaussians can be pruned.
Bound per-Gaussian contribution
A pruned Gaussian with $\alpha_i < \epsilon$ contributes at most $T_i\, \alpha_i\, G_i\, |c_i| \le \epsilon\, c_{\max}$ to any pixel (since $T_i \le 1$ and $G_i \le 1$).
Bound total error
If $N_p$ Gaussians are pruned, the worst-case pixel error is:
$$|\Delta \hat{C}| \le N_p\, \epsilon\, c_{\max}.$$
(Pruning also slightly raises the transmittance of the surviving Gaussians, which adds a further $O(\epsilon)$ term per pruned Gaussian.) With $N_p$ in the hundreds of thousands and $\epsilon$ of order $10^{-2}$, this is a loose bound. In practice, the transmittance ensures that deeply occluded pruned Gaussians contribute negligibly, making the actual error much smaller.
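A small simulation illustrates how loose the bound is in practice. The scene size, opacity distribution, and $\epsilon$ below are illustrative assumptions:

```python
import numpy as np

def render(alphas, colors):
    # front-to-back compositing along one ray
    T, C = 1.0, 0.0
    for a, c in zip(alphas, colors):
        C += T * a * c
        T *= 1.0 - a
    return C

rng = np.random.default_rng(2)
eps, c_max, n = 0.05, 1.0, 200
alphas = rng.uniform(0.0, 0.6, size=n)
colors = rng.uniform(-c_max, c_max, size=n)

keep = alphas >= eps
err = abs(render(alphas, colors) - render(alphas[keep], colors[keep]))
n_pruned = int((~keep).sum())
bound = n_pruned * eps * c_max
# err is typically far below the bound: occluded pruned Gaussians have
# tiny transmittance. The check below allows factor-2 slack because the
# simple bound ignores transmittance renormalisation of the survivors.
```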
ex26-14-convergence
Challenge. The 3DGS optimisation minimises $\mathcal{L}$ over the Gaussian parameters $\{\mu_i, q_i, s_i, \alpha_i, c_i\}$. This is a non-convex optimisation due to the sorting, alpha-compositing, and projection operations. Identify three specific sources of non-convexity and discuss under what conditions gradient descent is likely to find a good (if not global) minimum.
Hint: Consider the depth ordering (sorting), the product in $T_i$, and the quaternion parameterisation.
Hint: Think about local vs global minima and symmetries of the representation.
Source 1: Depth sorting
The alpha-compositing depends on the depth order of Gaussians. Swapping the order of two Gaussians with similar depth creates a discontinuity in the rendering function. This makes the loss landscape non-smooth (not just non-convex). In practice, the sorting is treated as fixed during each gradient step and updated periodically.
Source 2: Transmittance product
The transmittance $T_i = \prod_{j < i}(1 - \alpha_j G_j)$ is a product of terms, each depending on different parameters. This creates coupled non-linear interactions between Gaussians --- the gradient of one Gaussian's opacity depends on all preceding Gaussians' opacities.
Source 3: Rotation parameterisation
The quaternion-to-rotation conversion is non-linear, and the rotation group SO(3) is non-convex. Additionally, antipodal quaternions $q$ and $-q$ represent the same rotation, creating symmetries in the loss landscape.
Why gradient descent works
Despite non-convexity, 3DGS optimisation works well because: (1) The initialisation from SfM or a grid provides a warm start near a good basin. (2) Adaptive density control adds/removes parameters, exploring the loss landscape more broadly than fixed-parameter optimisation. (3) The per-Gaussian parameters are largely decoupled in non-overlapping regions.
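Source 1 is easy to exhibit concretely: an infinitesimal depth perturbation that flips the sort order produces a finite jump in the rendered value. A minimal two-Gaussian sketch (values assumed for illustration):

```python
def composite(alphas, colors):
    # front-to-back alpha compositing in a fixed order
    T, C = 1.0, 0.0
    for a, c in zip(alphas, colors):
        C += T * a * c
        T *= 1.0 - a
    return C

def render_sorted(depths, alphas, colors):
    # sort by depth before compositing, as in 3DGS rasterisation
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    return composite([alphas[i] for i in order], [colors[i] for i in order])

# nudging one depth across the other flips the order: the output jumps
# from 0.5 to 0.25 for an arbitrarily small depth change
before = render_sorted([0.999999, 1.0], [0.5, 0.5], [1.0, 0.0])
after  = render_sorted([1.000001, 1.0], [0.5, 0.5], [1.0, 0.0])
```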
ex26-15-fundamental-limits
Challenge. Consider the problem of recovering $N$ Gaussian parameters from $M$ RF power measurements. Using information-theoretic arguments, derive a necessary condition on $M$ for identifiability of the Gaussian scene, and compare with the achievability result from compressed sensing (RIP-based recovery).
Hint: Each Gaussian has $d$ parameters ($d \approx 10$--$20$, depending on SH order). The total scene has $N d$ parameters.
Hint: Each measurement provides at most 1 real-valued constraint.
Hint: For identifiability, $M \ge N d$ is necessary. Is it sufficient?
Necessary condition
The Gaussian scene has $N d$ parameters. Each measurement provides one scalar equation. For the system to be determined, we need $M \ge N d$ (parameter counting).
Structured observation
However, the rendering equation is non-linear in the parameters $\theta$, so the equations are not independent linear constraints. The effective number of independent constraints depends on the Jacobian rank $\mathrm{rank}(\partial \hat{y} / \partial \theta)$. For well-separated Gaussians, the Jacobian has full rank and $M \ge N d$ suffices locally. For overlapping Gaussians, degeneracies reduce the rank.
Comparison with compressed sensing
In compressed sensing with a linear model $y = A x$, recovery of an $s$-sparse signal in $\mathbb{R}^n$ requires $M = O(s \log(n/s))$ measurements under RIP. For Gaussian splatting with $K$ scatterers (each modelled as one Gaussian with $d$ parameters), the analogous requirement is $M = O(K d \log(n/K))$ --- the per-scatterer cost increases by the factor $d$ due to the additional shape parameters.
Practical implication
For $K$ scatterers in an $n$-voxel space, compressed sensing needs $O(K \log(n/K))$ measurements, while Gaussian splatting needs $O(K d \log(n/K))$ in the worst case. The visual prior (RFCanvas) dramatically reduces this by fixing the geometric parameters.