3D Gaussian Splatting Recap
Why Gaussian Splatting for RF?
In Chapters 24 and 25 we studied implicit neural scene representations (NeRFs, SDFs) and differentiable rendering. These methods achieve remarkable reconstruction quality but suffer from a fundamental limitation: rendering requires hundreds of MLP evaluations per ray, making real-time inference impractical. The question that motivates this chapter is: can we represent RF scenes with an explicit, GPU-friendly primitive that enables both fast training and real-time rendering?
The answer comes from 3D Gaussian Splatting (3DGS), introduced by Kerbl et al. at SIGGRAPH 2023. Instead of encoding the scene in network weights, 3DGS uses a collection of anisotropic 3D Gaussians as explicit primitives. Each Gaussian carries its own position, shape, opacity, and appearance — and rendering reduces to projecting ("splatting") these Gaussians onto the image plane via standard GPU rasterisation pipelines.
Definition: 3D Gaussian Primitive
A 3D Gaussian primitive is defined by the tuple $(\boldsymbol{\mu}, \boldsymbol{\Sigma}, \alpha, \mathbf{f})$ where:
- $\boldsymbol{\mu} \in \mathbb{R}^3$ is the centre (mean position),
- $\boldsymbol{\Sigma} \in \mathbb{R}^{3 \times 3}$ is a positive-definite covariance matrix encoding shape and orientation,
- $\alpha \in [0, 1]$ is the opacity,
- $\mathbf{f}$ is a feature vector encoding appearance (colour, or in the RF setting, scattering attributes).
The spatial influence of the primitive is given by the 3D Gaussian density:
$$G(\mathbf{x}) = \exp\left(-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$
The covariance is parameterised as $\boldsymbol{\Sigma} = R S S^\top R^\top$ with rotation matrix $R \in SO(3)$ (stored as a unit quaternion $q$) and diagonal scale matrix $S = \operatorname{diag}(s_1, s_2, s_3)$.
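This parameterisation is easy to implement directly. Below is a minimal NumPy sketch (function names are illustrative, not from any 3DGS codebase) that builds $\boldsymbol{\Sigma} = R S S^\top R^\top$ from a quaternion and per-axis scales:

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)  # normalise so R is orthogonal
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def covariance(q, s):
    """Sigma = R S S^T R^T from quaternion q and per-axis scales s."""
    R = quat_to_rotmat(np.asarray(q, dtype=float))
    M = R @ np.diag(s)
    return M @ M.T  # symmetric positive semi-definite by construction

Sigma = covariance([1.0, 0.0, 0.0, 0.0], [0.5, 0.2, 0.1])
# Identity rotation: Sigma = diag(s_i^2) = diag(0.25, 0.04, 0.01)
```

Normalising the quaternion inside the conversion is what keeps $R$ orthogonal even as the raw quaternion parameters drift during optimisation.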
Definition: 3D Gaussian Splatting Scene Representation
A 3DGS scene consists of $N$ Gaussian primitives:
$$\mathcal{G} = \{(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i, \alpha_i, \mathbf{f}_i)\}_{i=1}^{N}$$
The scene is fully characterised by the parameter set $\Theta = \{\boldsymbol{\mu}_i, q_i, \mathbf{s}_i, \alpha_i, \mathbf{f}_i\}_{i=1}^{N}$, where $q_i$ is the unit quaternion encoding the rotation and $\mathbf{s}_i \in \mathbb{R}^3$ encodes the scale. Typical scenes use several hundred thousand to a few million Gaussians for high-quality reconstruction.
Definition: Differentiable Rasterisation (Splatting)
Differentiable rasterisation renders an image by projecting each 3D Gaussian onto the image plane and compositing in depth order. Given a camera with world-to-camera transform $W$ and projection Jacobian $J$, the 3D covariance projects to a 2D covariance:
$$\boldsymbol{\Sigma}' = J W \boldsymbol{\Sigma} W^\top J^\top$$
The rendered value at pixel $\mathbf{p}$ is:
$$C(\mathbf{p}) = \sum_{i \in \mathcal{N}(\mathbf{p})} \mathbf{f}_i \, \alpha_i \, G_i^{2D}(\mathbf{p}) \prod_{j < i} \left(1 - \alpha_j \, G_j^{2D}(\mathbf{p})\right)$$
where $G_i^{2D}(\mathbf{p})$ is the 2D Gaussian with covariance $\boldsymbol{\Sigma}'_i$ evaluated at $\mathbf{p}$, the product runs over Gaussians closer to the camera (front-to-back ordering), and $\mathcal{N}(\mathbf{p})$ is the set of Gaussians whose projected footprint overlaps $\mathbf{p}$.
The entire pipeline is differentiable: gradients of a photometric loss flow back through the compositing and projection to update every Gaussian parameter $(\boldsymbol{\mu}_i, q_i, \mathbf{s}_i, \alpha_i, \mathbf{f}_i)$.
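The two core operations of this definition can be sketched in a few lines of NumPy. This is a simplified per-pixel scalar loop with no tiling or GPU parallelism, and the function names are illustrative:

```python
import numpy as np

def project_covariance(Sigma, W, J):
    """EWA projection of a 3D covariance to the image plane:
    Sigma' = J W Sigma W^T J^T, with W the 3x3 rotation part of the
    world-to-camera transform and J the 2x3 projection Jacobian."""
    return J @ W @ Sigma @ W.T @ J.T

def composite_pixel(gaussians, p):
    """Front-to-back alpha compositing at pixel p.
    gaussians: list of (depth, mu2d, Sigma2d, alpha, colour) tuples."""
    C = np.zeros(3)
    T = 1.0                                   # accumulated transmittance
    for depth, mu2d, Sig2d, alpha, colour in sorted(gaussians, key=lambda g: g[0]):
        d = p - mu2d
        g = np.exp(-0.5 * d @ np.linalg.inv(Sig2d) @ d)  # 2D Gaussian at p
        a = alpha * g
        C += T * a * colour
        T *= 1.0 - a
        if T < 1e-4:                          # early termination
            break
    return C

# A single Gaussian centred on the pixel contributes exactly alpha * colour
C = composite_pixel([(1.0, np.zeros(2), np.eye(2), 0.5, np.ones(3))], np.zeros(2))
```

The early-termination test mirrors what a real tile-based rasteriser does: once transmittance is negligible, Gaussians further back cannot affect the pixel.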
Theorem: Alpha Compositing as Discretised Volume Rendering
The front-to-back alpha compositing of 3DGS:
$$C(\mathbf{p}) = \sum_{i=1}^{N} \mathbf{f}_i \, \alpha_i \, G_i^{2D}(\mathbf{p}) \prod_{j=1}^{i-1} \left(1 - \alpha_j \, G_j^{2D}(\mathbf{p})\right)$$
is the Riemann-sum discretisation of the NeRF volume rendering integral (Chapter 24):
$$C = \int_{0}^{\infty} T(t) \, \sigma(t) \, \mathbf{c}(t) \, dt, \qquad T(t) = \exp\left(-\int_{0}^{t} \sigma(s) \, ds\right)$$
where each Gaussian's contribution $\alpha_i G_i^{2D}(\mathbf{p})$ corresponds to the opacity $1 - e^{-\sigma_i \delta_i}$ at the $i$-th sample along the ray.
Both NeRF and 3DGS solve the same rendering problem. NeRF samples the density field along each ray and integrates; 3DGS projects analytic Gaussians onto the image plane and sums. The connection is that splatting a Gaussian is equivalent to evaluating the volume rendering integral analytically for a Gaussian density profile.
Volume rendering discretisation
Partition the ray into intervals $[t_i, t_{i+1}]$ centred at the Gaussian means. In each interval, approximate the density as constant: $\sigma(t) \approx \sigma_i$ for $t \in [t_i, t_{i+1}]$ with $\delta_i = t_{i+1} - t_i$.
Transmittance factorisation
The discrete transmittance becomes $T_i = \prod_{j=1}^{i-1} e^{-\sigma_j \delta_j} = \prod_{j=1}^{i-1} (1 - \alpha_j)$, where $\alpha_j = 1 - e^{-\sigma_j \delta_j} \approx \sigma_j \delta_j$ for small $\sigma_j \delta_j$.
Identification with splatting
Identifying $\alpha_i G_i^{2D}(\mathbf{p})$ (the product of learnable opacity and projected Gaussian evaluation) with $1 - e^{-\sigma_i \delta_i}$, and $\mathbf{f}_i$ with $\mathbf{c}_i$, yields the 3DGS compositing formula.
Definition: Adaptive Density Control
3DGS uses adaptive density control to add, remove, and split Gaussians during training:
- Densification by cloning: Gaussians in under-reconstructed regions (positional gradient $\|\nabla_{\boldsymbol{\mu}} \mathcal{L}\|$ above a threshold) are cloned --- duplicated with a small offset.
- Densification by splitting: Large Gaussians covering too much area (scale above a threshold $\tau_S$) are split into two smaller Gaussians.
- Pruning: Gaussians with opacity below a threshold $\epsilon_\alpha$ are removed.
- Opacity reset: Periodically, all opacities are reduced to a small value to encourage removing unnecessary Gaussians.
This adaptive scheme is critical: it allows the representation to allocate resolution where the scene is complex and remain sparse elsewhere.
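The control logic above can be sketched as a single densify/prune pass. This is an illustrative sketch, not the reference implementation: the threshold values, the clone offset, and the split displacement are assumptions chosen for readability.

```python
import numpy as np

def adaptive_density_control(mu, scales, alphas, grad_mu,
                             tau_pos=0.0002, tau_scale=0.01, eps_alpha=0.005):
    """One densify/prune pass (illustrative; thresholds are assumptions).
    mu: (N, 3) centres; scales: (N, 3); alphas: (N,); grad_mu: (N, 3)."""
    gnorm = np.linalg.norm(grad_mu, axis=1)
    dense = gnorm > tau_pos                     # under-reconstructed regions
    big = scales.max(axis=1) > tau_scale
    clone = dense & ~big                        # small Gaussians: clone
    split = dense & big                         # large Gaussians: split
    keep = alphas > eps_alpha                   # prune near-transparent ones

    parts = [mu[keep]]
    parts.append(mu[clone & keep] + 0.01 * grad_mu[clone & keep])  # offset clone
    # Split: two children displaced along the parent's scale vector
    parents = mu[split & keep]
    parts += [parents + scales[split & keep], parents - scales[split & keep]]
    return np.concatenate(parts, axis=0)

mu = np.zeros((5, 3))
new_mu = adaptive_density_control(mu, scales=np.full((5, 3), 0.005),
                                  alphas=np.full(5, 0.5),
                                  grad_mu=np.zeros((5, 3)))
# With zero gradients and opaque Gaussians, the set is unchanged
```

In a full system the new Gaussians would also inherit (and, for splits, shrink) the parents' scales, opacities, and features; only the positional bookkeeping is shown here.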
3DGS Training Pipeline
Complexity: $O(N P)$ per iteration in the naive case, where $N$ is the number of Gaussians and $P$ is the number of pixels. Tile-based rasterisation reduces this to $O(N \log N + T \bar{n})$, where $T$ is the number of tiles and $\bar{n}$ the average number of Gaussians per tile. Training is an order of magnitude faster than NeRF because rendering uses rasterisation (a forward pass through sorted tiles) rather than ray marching (hundreds of MLP queries per ray).
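The structure of the training loop (render, compare, backpropagate, update) can be shown on a deliberately tiny 1D analogue: fit the centre and opacity of a single Gaussian "splat" to a target profile by gradient descent. Finite differences stand in for the analytic gradients that the real 3DGS rasteriser backpropagates; everything here is a toy construction, not the paper's pipeline.

```python
import numpy as np

x = np.linspace(-3, 3, 200)
target = 0.8 * np.exp(-0.5 * (x - 0.7) ** 2)   # "observed image"

def render(mu, alpha):
    """Synthesis step: render one 1D Gaussian splat."""
    return alpha * np.exp(-0.5 * (x - mu) ** 2)

def loss(params):
    """Photometric loss between rendered and observed signals."""
    mu, alpha = params
    return np.mean((render(mu, alpha) - target) ** 2)

params = np.array([0.0, 0.5])                   # initial centre and opacity
lr, h = 1.0, 1e-5
for _ in range(1500):
    # Central finite differences approximate the gradient of the loss
    grad = np.array([(loss(params + h * e) - loss(params - h * e)) / (2 * h)
                     for e in np.eye(2)])
    params -= lr * grad                         # gradient descent update

# params converges towards the target (mu near 0.7, alpha near 0.8)
```

The same analysis-through-synthesis loop, scaled up to millions of Gaussians, many parameters per Gaussian, and analytic gradients through the rasteriser, is the whole of 3DGS training.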
Example: 3DGS vs NeRF --- Rendering Speed Comparison
Compare the rendering speed and training time of 3DGS with the original NeRF and Instant-NGP on a standard benchmark (e.g., Mip-NeRF 360 dataset at 1080p resolution).
Rendering speed
- NeRF (original): well below 1 FPS. Each pixel requires hundreds of MLP evaluations along the ray.
- Instant-NGP: on the order of 10 FPS using hash-grid encoding and a small MLP.
- 3DGS: over 100 FPS at 1080p. The tile-based rasteriser processes all Gaussians in parallel without per-ray MLP queries.
The speedup over the original NeRF is roughly three orders of magnitude.
Training time
- NeRF: 24--48 hours on a single GPU.
- Instant-NGP: around 5 minutes.
- 3DGS: 30--50 minutes for a few million Gaussians, competitive with Instant-NGP while achieving higher quality.
Quality
On the Mip-NeRF 360 benchmark, 3DGS achieves a PSNR of roughly 27 dB, comparable to the best NeRF variants. 3DGS handles thin structures and sharp edges better (each Gaussian is independently oriented), while NeRF excels at smooth translucent media.
Historical Note: From EWA Splatting to 3DGS
2000--2023
Point-based rendering and splatting have a long history in computer graphics. Zwicker et al. (2001) introduced EWA (elliptical weighted average) splatting, which projects 3D ellipsoids onto the image plane as 2D ellipses --- essentially the same geometric operation that 3DGS uses. Pfister et al. (2000) developed surfels (surface elements) as oriented disc primitives. What Kerbl et al. (2023) contributed was the combination of (1) differentiable rasterisation enabling gradient-based optimisation, (2) adaptive density control for automatic resolution allocation, and (3) spherical harmonics for view-dependent appearance --- turning a rendering primitive into a learnable scene representation.
The speed advantage of splatting over ray marching was well known in the graphics community. The insight of 3DGS was that this speed advantage could be combined with analysis-through-synthesis optimisation to create an explicit scene representation that rivals the quality of implicit neural representations.
2D Gaussian Splatting Demonstration
Visualise how a collection of 2D Gaussians renders an image through alpha compositing. Adjust the number of Gaussians, their scale, and opacity to see how the representation quality changes.
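A static version of such a demonstration fits in a short NumPy script: composite a handful of random 2D Gaussians into an image, front to back. The function name and the chosen counts, scales, and opacities are illustrative; varying `K`, the scale range, or the `alphas` value reproduces the effect the demo describes.

```python
import numpy as np

def render_2d_gaussians(H, W, mus, Sigmas, alphas, colours):
    """Alpha-composite K 2D Gaussians into an (H, W, 3) image, front to
    back in the order given (a stand-in for depth sorting in 2D)."""
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs, ys], axis=-1).astype(float)      # (H, W, 2)
    img = np.zeros((H, W, 3))
    T = np.ones((H, W))                                   # transmittance
    for mu, Sig, a, c in zip(mus, Sigmas, alphas, colours):
        d = pix - mu
        Sinv = np.linalg.inv(Sig)
        expo = np.einsum('hwi,ij,hwj->hw', d, Sinv, d)    # Mahalanobis term
        g = a * np.exp(-0.5 * expo)                       # per-pixel opacity
        img += (T * g)[..., None] * c
        T *= 1.0 - g
    return img

rng = np.random.default_rng(1)
K = 8
mus = rng.uniform([0, 0], [64, 64], size=(K, 2))
Sigmas = np.array([np.diag(rng.uniform(4, 40, size=2)) for _ in range(K)])
img = render_2d_gaussians(64, 64, mus, Sigmas,
                          alphas=np.full(K, 0.6),
                          colours=rng.uniform(0, 1, size=(K, 3)))
```

With few Gaussians the image is blobby; as `K` grows and scales shrink, the representation can approximate arbitrary images, which is exactly the trade-off adaptive density control automates in 3D.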
Quick Check
In 3D Gaussian Splatting, the covariance matrix is parameterised as $\boldsymbol{\Sigma} = R S S^\top R^\top$. Why is this parameterisation preferred over directly optimising $\boldsymbol{\Sigma}$?
It reduces the number of free parameters from 6 to 3
It guarantees that remains positive semi-definite during optimisation
It allows faster matrix inversion
It enables use of spherical harmonics for view dependence
Any matrix of the form $M M^\top$ (here $M = R S$) is automatically positive semi-definite, since $\mathbf{x}^\top M M^\top \mathbf{x} = \|M^\top \mathbf{x}\|^2 \geq 0$. Directly optimising $\boldsymbol{\Sigma}$ could produce non-PSD matrices, which have no interpretation as a covariance.
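This property holds for any rotation and any positive scales, which a quick numerical check makes concrete (the random-matrix construction here is just for the check):

```python
import numpy as np

# R S S^T R^T has non-negative eigenvalues for any orthogonal R and
# positive diagonal S, i.e. it is PSD by construction.
rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
R, _ = np.linalg.qr(A)                  # random orthogonal matrix
S = np.diag(rng.uniform(0.1, 1.0, 3))   # positive scales
M = R @ S
Sigma = M @ M.T
eigs = np.linalg.eigvalsh(Sigma)
assert (eigs >= -1e-12).all()           # all eigenvalues non-negative
```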
Common Mistake: 3DGS Is Not True Volumetric Rendering
Mistake:
Assuming that 3DGS performs exact volumetric integration along each ray, equivalent to NeRF.
Correction:
3DGS uses a rasterisation-based approximation: each Gaussian is projected to 2D and composited in depth order. This is a Riemann-sum approximation to the volume rendering integral, not an exact integration. The approximation can produce artifacts when Gaussians overlap significantly in depth or when the scene has complex occlusion patterns. For RF applications, where the "scene" is a collection of scatterers rather than a continuous density field, this approximation is often acceptable.
Splatting
A rendering technique where 3D primitives (typically ellipsoids or Gaussians) are projected ("splatted") onto a 2D image plane. Each primitive contributes a weighted footprint to the image, and overlapping contributions are composited. Splatting is the dual of ray casting: instead of shooting rays through pixels and querying the scene, the scene projects itself onto the image.
Related: Differentiable Rendering
Differentiable Rendering
A rendering pipeline designed so that the gradient of a loss function (comparing rendered and observed images) can be computed with respect to all scene parameters via backpropagation. This enables gradient-based optimisation of scene geometry, appearance, and camera parameters from image observations alone.
Related: Splatting, Analysis Through Synthesis
Analysis Through Synthesis
An inverse-problem strategy where scene parameters are estimated by synthesising (rendering) observations from a parameterised model and comparing them to actual measurements. The parameters are then updated to minimise the discrepancy. In the 3DGS context, the Gaussian parameters are the "scene model" and differentiable rasterisation is the "synthesis" step.
Related: Differentiable Rendering
Key Takeaway
3D Gaussian Splatting represents scenes as explicit collections of anisotropic 3D Gaussians, each carrying position, shape, opacity, and appearance attributes. Differentiable rasterisation enables gradient-based optimisation from multi-view images, achieving training times of tens of minutes and rendering above 100 FPS --- orders of magnitude faster than NeRF. The alpha compositing formula is a discretisation of the same volume rendering integral used by NeRF, connecting the two representations theoretically.