3D Gaussian Splatting Recap

Why Gaussian Splatting for RF?

In Chapters 24 and 25 we studied implicit neural scene representations (NeRFs, SDFs) and differentiable rendering. These methods achieve remarkable reconstruction quality but suffer from a fundamental limitation: rendering requires hundreds of MLP evaluations per ray, making real-time inference impractical. The question that motivates this chapter is: can we represent RF scenes with an explicit, GPU-friendly primitive that enables both fast training and real-time rendering?

The answer comes from 3D Gaussian Splatting (3DGS), introduced by Kerbl et al. at SIGGRAPH 2023. Instead of encoding the scene in network weights, 3DGS uses a collection of anisotropic 3D Gaussians as explicit primitives. Each Gaussian carries its own position, shape, opacity, and appearance — and rendering reduces to projecting ("splatting") these Gaussians onto the image plane via standard GPU rasterisation pipelines.

Definition:

3D Gaussian Primitive

A 3D Gaussian primitive is defined by the tuple $(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \alpha, \mathbf{f})$, where:

  • $\boldsymbol{\mu} \in \mathbb{R}^3$ is the centre (mean position),
  • $\boldsymbol{\Sigma} \in \mathbb{R}^{3 \times 3}$ is a positive-definite covariance matrix encoding shape and orientation,
  • $\alpha \in [0, 1]$ is the opacity,
  • $\mathbf{f}$ is a feature vector encoding appearance (colour, or in the RF setting, scattering attributes).

The spatial influence of the primitive is given by the 3D Gaussian density:

$$G(\mathbf{p}) = \exp\!\left(-\tfrac{1}{2}(\mathbf{p} - \boldsymbol{\mu})^\mathsf{T} \boldsymbol{\Sigma}^{-1} (\mathbf{p} - \boldsymbol{\mu})\right).$$

The covariance is parameterised as $\boldsymbol{\Sigma} = \mathbf{R}\mathbf{S}\mathbf{S}^\mathsf{T}\mathbf{R}^\mathsf{T}$ with rotation matrix $\mathbf{R} \in \mathrm{SO}(3)$ (stored as a unit quaternion) and diagonal scale matrix $\mathbf{S} = \operatorname{diag}(s_x, s_y, s_z)$.
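This parameterisation is easy to check numerically. The sketch below (NumPy, assuming a $(w, x, y, z)$ quaternion convention) builds $\boldsymbol{\Sigma}$ from a quaternion and a scale vector, then evaluates the unnormalised density $G(\mathbf{p})$:

```python
import numpy as np

def quat_to_rotmat(q):
    """Unit quaternion (w, x, y, z) -> 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)  # renormalise: optimisers drift off the unit sphere
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def covariance(q, s):
    """Sigma = R S S^T R^T -- positive semi-definite by construction."""
    R, S = quat_to_rotmat(q), np.diag(s)
    return R @ S @ S.T @ R.T

def gaussian_density(p, mu, Sigma):
    """G(p) = exp(-0.5 (p - mu)^T Sigma^{-1} (p - mu))."""
    d = p - mu
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d))

# Identity rotation with anisotropic scales gives a diagonal covariance
Sigma = covariance(np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.1, 0.2, 0.3]))
mu = np.zeros(3)
```

Because $\boldsymbol{\Sigma}$ is assembled from a rotation and squared scales, any gradient step on $\mathbf{q}$ and $\mathbf{s}$ yields a valid covariance, which is exactly why 3DGS optimises these factors rather than the six free entries of $\boldsymbol{\Sigma}$ directly.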

Definition:

3D Gaussian Splatting Scene Representation

A 3DGS scene consists of $N$ Gaussian primitives:

$$\mathcal{G} = \{(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k, \alpha_k, \mathbf{f}_k)\}_{k=1}^N.$$

The scene is fully characterised by the parameter set $\Theta = \{\boldsymbol{\mu}_k, \mathbf{q}_k, \mathbf{s}_k, \alpha_k, \mathbf{f}_k\}_{k=1}^N$, where $\mathbf{q}_k \in \mathbb{R}^4$ is the unit quaternion encoding the rotation and $\mathbf{s}_k = (s_x, s_y, s_z)_k$ encodes the scale. Typical scenes use $N \sim 10^5$ to $10^6$ Gaussians for high-quality reconstruction.

Definition:

Differentiable Rasterisation (Splatting)

Differentiable rasterisation renders an image by projecting each 3D Gaussian onto the image plane and compositing in depth order. Given a camera with world-to-camera transform $\mathbf{W}$ and projection Jacobian $\mathbf{J}$, the 3D covariance projects to a 2D covariance:

$$\boldsymbol{\Sigma}' = \mathbf{J}\mathbf{W}\boldsymbol{\Sigma}\mathbf{W}^\mathsf{T}\mathbf{J}^\mathsf{T}.$$

The rendered value at pixel $\mathbf{u}$ is:

$$\hat{C}(\mathbf{u}) = \sum_{k \in \mathcal{N}(\mathbf{u})} \mathbf{f}_k \, \alpha_k \, G_k(\mathbf{u}) \prod_{j < k} \bigl(1 - \alpha_j \, G_j(\mathbf{u})\bigr),$$

where $G_k(\mathbf{u})$ is the 2D Gaussian evaluated at $\mathbf{u}$, the product runs over Gaussians closer to the camera (front-to-back ordering), and $\mathcal{N}(\mathbf{u})$ is the set of Gaussians whose projected footprint overlaps $\mathbf{u}$.

The entire pipeline is differentiable: gradients of a photometric loss $\mathcal{L}$ flow back through the compositing and projection to update every Gaussian parameter in $\Theta$.
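Both steps of this definition are short enough to sketch directly. In the snippet below (NumPy), the pinhole Jacobian and the early-termination threshold are illustrative choices rather than details taken from the reference rasteriser:

```python
import numpy as np

def perspective_jacobian(t, fx, fy):
    """First-order (affine) approximation of perspective projection
    at camera-space point t = (tx, ty, tz)."""
    tx, ty, tz = t
    return np.array([
        [fx / tz, 0.0,     -fx * tx / tz**2],
        [0.0,     fy / tz, -fy * ty / tz**2],
    ])

def project_covariance(Sigma, W, t, fx, fy):
    """2D image-plane covariance: Sigma' = J W Sigma W^T J^T."""
    J = perspective_jacobian(t, fx, fy)
    return J @ W @ Sigma @ W.T @ J.T

def composite_pixel(features, alphas, gauss_vals):
    """Front-to-back alpha compositing at one pixel; inputs sorted near-to-far."""
    C, T = np.zeros_like(features[0], dtype=float), 1.0
    for f, a, g in zip(features, alphas, gauss_vals):
        w = a * g            # effective opacity alpha_k * G_k(u)
        C += T * w * f       # contribution weighted by accumulated transmittance
        T *= 1.0 - w         # attenuate transmittance for Gaussians behind
        if T < 1e-4:         # early termination, as in tile-based rasterisers
            break
    return C

# Isotropic Gaussian 2 m in front of an identity camera (fx = fy = 1)
Sigma2d = project_covariance(np.eye(3), np.eye(3), np.array([0.0, 0.0, 2.0]), 1.0, 1.0)
# Two Gaussians at one pixel: a half-opaque red one in front of an opaque green one
pixel = composite_pixel([np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])],
                        [0.5, 1.0], [1.0, 1.0])
```

The running transmittance `T` is precisely the product term in $\hat{C}(\mathbf{u})$; once it falls near zero, Gaussians further back cannot contribute, which is what makes early termination safe.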

Theorem: Alpha Compositing as Discretised Volume Rendering

The front-to-back alpha compositing of 3DGS:

$$\hat{C}(\mathbf{u}) = \sum_{k=1}^K \mathbf{f}_k \, \alpha_k \, G_k(\mathbf{u}) \prod_{j=1}^{k-1} \bigl(1 - \alpha_j \, G_j(\mathbf{u})\bigr)$$

is the Riemann-sum discretisation of the NeRF volume rendering integral (Chapter 24):

$$C(\mathbf{r}) = \int_0^\infty T(t) \, \sigma(\mathbf{r}(t)) \, \mathbf{c}(\mathbf{r}(t), \mathbf{d}) \, dt, \quad T(t) = \exp\!\left(-\int_0^t \sigma(\mathbf{r}(s))\,ds\right),$$

where each Gaussian's contribution $\alpha_k G_k(\mathbf{u})$ corresponds to the opacity $1 - \exp(-\sigma_k \delta_k)$ at the $k$-th sample along the ray.

Both NeRF and 3DGS solve the same rendering problem. NeRF samples the density field along each ray and integrates; 3DGS projects analytic Gaussians onto the image plane and sums. The connection is that splatting a Gaussian is equivalent to evaluating the volume rendering integral analytically for a Gaussian density profile.
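The correspondence is an exact algebraic identity, not an approximation at the level of the quadrature: since $\exp(-\sum_j \sigma_j \delta_j) = \prod_j (1 - \alpha_j)$ when $\alpha_j = 1 - \exp(-\sigma_j \delta_j)$, NeRF's quadrature and alpha compositing give the same pixel value. A quick numerical check (NumPy, scalar colours for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8
sigma = rng.uniform(0.1, 2.0, K)   # densities sigma_k at the K ray samples
delta = rng.uniform(0.05, 0.2, K)  # sample spacings delta_k
c = rng.uniform(0.0, 1.0, K)       # per-sample colours (scalar for brevity)

# NeRF quadrature: T_k = exp(-sum_{j<k} sigma_j delta_j)
T = np.exp(-np.concatenate(([0.0], np.cumsum(sigma * delta)[:-1])))
C_nerf = np.sum(T * (1.0 - np.exp(-sigma * delta)) * c)

# Alpha compositing with alpha_k = 1 - exp(-sigma_k delta_k)
alpha = 1.0 - np.exp(-sigma * delta)
T_prod = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
C_splat = np.sum(T_prod * alpha * c)
```

Both accumulators agree to machine precision for any choice of densities and spacings; what differs between NeRF and 3DGS is how those per-sample opacities are obtained, not the compositing itself.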


Definition:

Adaptive Density Control

3DGS uses adaptive density control to add, remove, and split Gaussians during training:

  1. Densification by cloning: Gaussians in under-reconstructed regions (high positional gradient, $\|\partial \mathcal{L}/\partial \boldsymbol{\mu}_k\| > \tau_\mu$) are cloned, i.e. duplicated with a small offset.
  2. Densification by splitting: Large Gaussians covering too much area ($\|\mathbf{s}_k\| > \tau_s$) are split into two smaller Gaussians.
  3. Pruning: Gaussians with opacity below a threshold ($\alpha_k < \epsilon_\alpha$) are removed.
  4. Opacity reset: Periodically, all opacities are reduced to encourage removing unnecessary Gaussians.

This adaptive scheme is critical: it allows the representation to allocate resolution where the scene is complex and remain sparse elsewhere.
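One clone/split/prune pass can be sketched as below (NumPy). The split factor of 1.6 follows the published default, but the thresholds, offsets, and bookkeeping here are simplified for illustration and do not mirror the reference implementation:

```python
import numpy as np

def density_control(mu, s, alpha, grad_mu,
                    tau_mu=2e-4, tau_s=0.05, eps_alpha=0.005, rng=None):
    """One clone/split/prune pass over N Gaussians (illustrative thresholds).
    mu: (N,3) centres, s: (N,3) scales, alpha: (N,) opacities,
    grad_mu: (N,3) accumulated positional gradients."""
    rng = rng or np.random.default_rng(0)
    g = np.linalg.norm(grad_mu, axis=1)
    big = s.max(axis=1) > tau_s

    clone = (g > tau_mu) & ~big          # under-reconstructed and small -> clone
    split = (g > tau_mu) & big           # under-reconstructed and large -> split

    # Clone: duplicate with a small random offset (the original copy is kept below)
    mu_c = mu[clone] + 0.01 * rng.standard_normal((clone.sum(), 3))
    s_c, a_c = s[clone], alpha[clone]

    # Split: two children sampled inside the parent, scales shrunk by 1.6
    mu_s = (np.repeat(mu[split], 2, axis=0)
            + rng.standard_normal((2 * split.sum(), 3)) * np.repeat(s[split], 2, axis=0))
    s_s = np.repeat(s[split] / 1.6, 2, axis=0)
    a_s = np.repeat(alpha[split], 2)

    # Prune near-transparent Gaussians; split parents are replaced by their children
    keep = ~split & (alpha > eps_alpha)
    return (np.concatenate([mu[keep], mu_c, mu_s]),
            np.concatenate([s[keep], s_c, s_s]),
            np.concatenate([alpha[keep], a_c, a_s]))

# Tiny example: G0 transparent (pruned), G1 large with high gradient (split), G2 kept
mu0 = np.zeros((3, 3))
s0 = np.array([[0.01] * 3, [0.10] * 3, [0.01] * 3])
a0 = np.array([0.001, 0.9, 0.9])
g0 = np.array([[0.0] * 3, [1e-3, 0.0, 0.0], [0.0] * 3])
mu1, s1, a1 = density_control(mu0, s0, a0, g0)  # -> G2 plus the two children of G1
```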

3DGS Training Pipeline

Complexity: $O(N \cdot P)$ per iteration, where $N$ is the number of Gaussians and $P$ the number of pixels. Tile-based rasterisation reduces this to $O(N + T \cdot K)$, where $T$ is the number of tiles and $K$ the average number of Gaussians per tile.
Input: multi-view images $\{I_i, \Pi_i\}_{i=1}^M$ with camera poses $\Pi_i$; initial point cloud from SfM (e.g., COLMAP)
Output: optimised Gaussian set $\mathcal{G}^*$
1. Initialise $\mathcal{G}^{(0)}$ from the SfM points: $\boldsymbol{\mu}_k$ = point position, $\mathbf{s}_k$ from nearest-neighbour distances, $\alpha_k = 0.1$, $\mathbf{f}_k$ from the point colour
2. for epoch $= 1, \ldots, E$ do
3.     Sample a training view $i \sim \mathrm{Uniform}(\{1, \ldots, M\})$
4.     Render $\hat{I}_i = \mathrm{Rasterise}(\mathcal{G}, \Pi_i)$ (differentiable splatting)
5.     Compute the loss $\mathcal{L} = (1 - \lambda_{\mathrm{SSIM}})\,\|\hat{I}_i - I_i\|_1 + \lambda_{\mathrm{SSIM}}\,\mathcal{L}_{\mathrm{SSIM}}(\hat{I}_i, I_i)$
6.     Backpropagate $\nabla_\Theta \mathcal{L}$ through the differentiable rasteriser
7.     Update $\Theta \leftarrow \Theta - \eta\,\nabla_\Theta \mathcal{L}$ (Adam optimiser)
8.     if epoch $\bmod D = 0$ then
9.         Apply adaptive density control (clone, split, prune)
10.     end if
11. end for

Training is an order of magnitude faster than for NeRF because rendering uses rasterisation (a forward pass through sorted tiles) rather than ray marching (hundreds of MLP queries per ray).
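The loss on line 5 can be sketched as follows. The SSIM here uses global image statistics for brevity rather than the 11x11 Gaussian-windowed SSIM of the reference implementation, and $\lambda_{\mathrm{SSIM}} = 0.2$ is assumed as the default weight:

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    """Single-window SSIM over whole-image statistics (simplified: the
    reference implementation uses an 11x11 Gaussian window per pixel)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def gs_loss(pred, target, lam=0.2):
    """L = (1 - lambda) * L1 + lambda * (1 - SSIM)."""
    l1 = np.abs(pred - target).mean()
    return (1 - lam) * l1 + lam * (1.0 - ssim_global(pred, target))

img = np.random.default_rng(1).uniform(0.0, 1.0, (64, 64, 3))
```

A perfect render gives zero loss (SSIM of an image with itself is exactly 1), and the SSIM term penalises structural errors that a pure L1 loss would average away.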

Example: 3DGS vs NeRF --- Rendering Speed Comparison

Compare the rendering speed and training time of 3DGS with the original NeRF and with Instant-NGP on a standard benchmark (e.g., the Mip-NeRF 360 dataset at 1080p resolution).

Historical Note: From EWA Splatting to 3DGS

2000--2023

Point-based rendering and splatting have a long history in computer graphics. Zwicker et al. (2001) introduced EWA (elliptical weighted average) splatting, which projects 3D ellipsoids onto the image plane as 2D ellipses --- essentially the same geometric operation that 3DGS uses. Pfister et al. (2000) developed surfels (surface elements) as oriented disc primitives. What Kerbl et al. (2023) contributed was the combination of (1) differentiable rasterisation enabling gradient-based optimisation, (2) adaptive density control for automatic resolution allocation, and (3) spherical harmonics for view-dependent appearance --- turning a rendering primitive into a learnable scene representation.

The speed advantage of splatting over ray marching was well known in the graphics community. The insight of 3DGS was that this speed advantage could be combined with analysis-through-synthesis optimisation to create an explicit scene representation that rivals the quality of implicit neural representations.


2D Gaussian Splatting Demonstration

Visualise how a collection of 2D Gaussians renders an image through alpha compositing. Adjust the number of Gaussians, their scale, and opacity to see how the representation quality changes.


Quick Check

In 3D Gaussian Splatting, the covariance matrix $\boldsymbol{\Sigma}_k$ is parameterised as $\boldsymbol{\Sigma}_k = \mathbf{R}_k \mathbf{S}_k \mathbf{S}_k^\mathsf{T} \mathbf{R}_k^\mathsf{T}$. Why is this parameterisation preferred over directly optimising $\boldsymbol{\Sigma}_k$?

  • It reduces the number of free parameters from 6 to 3
  • It guarantees that $\boldsymbol{\Sigma}_k$ remains positive semi-definite during optimisation
  • It allows faster matrix inversion
  • It enables the use of spherical harmonics for view dependence

Common Mistake: 3DGS Is Not True Volumetric Rendering

Mistake:

Assuming that 3DGS performs exact volumetric integration along each ray, equivalent to NeRF.

Correction:

3DGS uses a rasterisation-based approximation: each Gaussian is projected to 2D and composited in depth order. This is a Riemann-sum approximation to the volume rendering integral, not an exact integration. The approximation can produce artifacts when Gaussians overlap significantly in depth or when the scene has complex occlusion patterns. For RF applications, where the "scene" is a collection of scatterers rather than a continuous density field, this approximation is often acceptable.

Splatting

A rendering technique where 3D primitives (typically ellipsoids or Gaussians) are projected ("splatted") onto a 2D image plane. Each primitive contributes a weighted footprint to the image, and overlapping contributions are composited. Splatting is the dual of ray casting: instead of shooting rays through pixels and querying the scene, the scene projects itself onto the image.

Related: Differentiable Rendering

Differentiable Rendering

A rendering pipeline designed so that the gradient of a loss function (comparing rendered and observed images) can be computed with respect to all scene parameters via backpropagation. This enables gradient-based optimisation of scene geometry, appearance, and camera parameters from image observations alone.

Related: Splatting, Analysis Through Synthesis

Analysis Through Synthesis

An inverse-problem strategy where scene parameters are estimated by synthesising (rendering) observations from a parameterised model and comparing them to actual measurements. The parameters are then updated to minimise the discrepancy. In the 3DGS context, the Gaussian parameters are the "scene model" and differentiable rasterisation is the "synthesis" step.

Related: Differentiable Rendering

Key Takeaway

3D Gaussian Splatting represents scenes as explicit collections of anisotropic 3D Gaussians, each carrying position, shape, opacity, and appearance attributes. Differentiable rasterisation enables gradient-based optimisation from multi-view images, achieving training times of minutes and rendering at $> 100$ FPS --- orders of magnitude faster than NeRF. The alpha compositing formula is a discretisation of the same volume rendering integral used by NeRF, connecting the two representations theoretically.

3D Gaussian Splatting: Alpha Compositing Pipeline

Three 3D Gaussians are projected onto the image plane and composited front-to-back via alpha blending: $C = \sum_i c_i \alpha_i \prod_{j<i}(1-\alpha_j)$. Each Gaussian contributes opacity and colour to the final pixel. This differentiable rasterisation pipeline --- orders of magnitude faster than volumetric ray marching --- is the foundation of RF-3DGS, RFCanvas, and RadarSplat.