Comparison and Open Questions

Choosing a Neural Scene Representation for RF

Chapters 24--26 have presented three families of neural scene representations: signed distance functions (SDF, Chapter 24), neural radiance fields (NeRF, Chapter 25), and 3D Gaussian Splatting (this chapter). Each has been adapted for RF applications. The question a practitioner faces is: which representation should I use for my RF imaging problem? The answer depends on the specific requirements --- speed, accuracy, data availability, interpretability, and scalability. This section provides a systematic comparison.

NeRF vs 3DGS vs SDF for RF Scene Reconstruction

| Property | NeRF / RF-NeRF | 3DGS / RF-3DGS | SDF / GeRaF |
| --- | --- | --- | --- |
| Representation | Implicit (MLP weights) | Explicit ($N$ Gaussians) | Implicit (MLP $\to$ distance) |
| Rendering | Ray marching ($\sim 200$ samples/ray) | Rasterisation (tile-based) | Sphere tracing + volumetric |
| Training time | 5 min--24 h | 15--30 min | 30 min--2 h |
| Rendering speed | 0.03--10 FPS | $> 100$ FPS | 1--10 FPS |
| PSNR (optical) | $\sim 32$ dB | $\sim 33$ dB | $\sim 30$ dB |
| RF power MAE | $\sim 4$ dB | $\sim 4$ dB | $\sim 5$ dB |
| Memory | Low (MLP weights only) | Medium ($\sim 200$ B/Gaussian) | Low (MLP weights only) |
| Thin structures | Poor (volume rendering blur) | Good (oriented Gaussians) | Good (surface-based) |
| Smooth media | Excellent (continuous density) | Adequate (many small Gaussians) | Poor (surfaces only) |
| Interpretability | Low (black-box MLP) | High (each Gaussian is a scatterer) | Medium (surface geometry) |
| Editability | Difficult | Easy (move/add/remove Gaussians) | Moderate (deform surface) |
| Dynamic scenes | Requires D-NeRF extensions | Natural (per-Gaussian velocity) | Difficult |

Definition:

Taxonomy of RF Scene Representations

We classify RF scene representations along two axes:

Axis 1 --- Implicit vs Explicit:

  • Implicit: The scene is encoded in neural network weights. Querying a point requires a forward pass through the network. Examples: NeRF, SDF networks, DeepSDF.
  • Explicit: The scene is represented by a set of primitives with directly interpretable parameters. Querying a point requires evaluating the primitives. Examples: 3DGS, voxel grids, point clouds.

Axis 2 --- Volumetric vs Surface-based:

  • Volumetric: The representation assigns attributes (density, colour, RF power) to every point in 3D space. Examples: NeRF, voxel grids, 3DGS.
  • Surface-based: The representation defines a 2D surface embedded in 3D space. Examples: SDF, meshes, surfels.

For RF imaging, the choice depends on the scattering model: specular-dominant scenarios (metallic environments, mmWave) favour surface-based representations, while diffuse-dominant scenarios (sub-6 GHz indoor, vegetation) favour volumetric representations.
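This rule of thumb can be condensed into a small decision helper. The function name, thresholds, and inputs below are illustrative assumptions for a sketch, not part of any cited system:

```python
def recommend_representation(carrier_ghz: float, scattering: str,
                             realtime: bool) -> str:
    """Hypothetical rule of thumb distilled from the two-axis taxonomy.

    scattering: 'specular' (metallic environments) or 'diffuse'
                (sub-6 GHz indoor, vegetation).
    """
    if scattering == "specular" or carrier_ghz >= 24:
        # Specular-dominant or mmWave scenes favour surface-based models.
        return "SDF"
    if realtime:
        # Explicit Gaussians rasterise fast enough for live digital twins.
        return "3DGS"
    # Volumetric implicit fields capture smooth diffuse media best.
    return "NeRF"

print(recommend_representation(28.0, "specular", realtime=False))  # SDF
```

The branch order encodes the section's priority: the scattering regime constrains physics first, and the speed requirement breaks ties among volumetric options.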


Example: Choosing the Right Representation

For each of the following RF imaging scenarios, recommend the most appropriate neural scene representation and justify the choice.

(a) Real-time digital twin of a factory floor with moving robots.
(b) High-resolution imaging of a metallic aircraft fuselage.
(c) Indoor coverage prediction with 50 power measurements and 1000 camera images.
(d) Large-scale outdoor urban propagation modelling.


Theorem: Representation Complexity for RF Scenes

Consider an RF scene in a bounded target region $\Omega \subset \mathbb{R}^3$ with $S$ distinct scatterers, each with angular complexity bounded by spherical harmonic order $L$. The minimum number of parameters required for an $\epsilon$-accurate power prediction is:

| Representation | Parameters |
| --- | --- |
| Voxel grid | $O((\text{diam}(\Omega)/\epsilon_{\text{spatial}})^3)$ |
| NeRF (MLP) | $O(S \cdot (L+1)^2)$ |
| 3DGS | $O(S \cdot (10 + (L+1)^2))$ |
| SDF (MLP) | $O(S)$ for geometry $+\; O(S \cdot (L+1)^2)$ for reflectance |

The key insight is that 3DGS scales with the number of scatterers $S$, not the volume resolution. For sparse scenes (few scatterers in a large volume), 3DGS is far more efficient than voxel grids. For dense scenes (many small scatterers), the voxel grid or NeRF may be more compact.

3DGS allocates parameters where the scene has structure (at the scatterers) and uses zero parameters in empty space. Voxel grids allocate parameters uniformly in space. NeRFs encode the scene compactly in network weights but require expensive per-point evaluation.
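The scaling argument can be made concrete with a back-of-the-envelope parameter counter. The function and its example numbers are illustrative (constants from the $O(\cdot)$ bounds are dropped):

```python
def param_counts(S, L, diam_m, eps_spatial_m):
    """Order-of-magnitude parameter counts from the complexity table.

    S: number of scatterers; L: spherical-harmonic order;
    diam_m / eps_spatial_m: scene diameter and voxel size in metres.
    """
    ang = (L + 1) ** 2                       # angular (SH) coefficients
    n_vox = int(round(diam_m / eps_spatial_m))  # voxels per dimension
    return {
        "voxel": n_vox ** 3,                 # uniform allocation in space
        "nerf":  S * ang,                    # compact MLP encoding
        "3dgs":  S * (10 + ang),             # ~10 geometric params/Gaussian
        "sdf":   S + S * ang,                # geometry + reflectance
    }

# Sparse scene: 100 scatterers, order-3 harmonics, 20 m room at 5 cm voxels
counts = param_counts(S=100, L=3, diam_m=20.0, eps_spatial_m=0.05)
```

For this example the voxel grid needs $400^3 = 64$ million parameters while 3DGS needs a few thousand, which is the sparsity advantage the theorem formalises.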


NeRF vs 3DGS vs SDF Performance Comparison

Compare the three representations across speed, accuracy, and compactness as a function of scene complexity (number of scatterers).


Common Mistake: Overfitting with Adaptive Density Control

Mistake:

Allowing unrestricted Gaussian densification when RF measurements are sparse ($M < 100$), leading to overfitting where Gaussians cluster around measurement locations.

Correction:

With sparse RF data, adaptive density control must be constrained:

  • Cap the total number of Gaussians: $N_{\max} \leq c \cdot M$ where $c \sim 5$--$10$. This prevents the model from allocating more degrees of freedom than the data can constrain.
  • Increase the pruning threshold: use $\epsilon_\alpha = 0.05$ instead of $0.005$.
  • Add regularisation: Total-variation or Laplacian smoothness penalty on the power map to prevent spatially discontinuous predictions.
  • Use visual priors (RFCanvas approach) to constrain geometry when available.
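The first two constraints can be sketched as a selection step over the current Gaussian set. All names and thresholds here are illustrative, not taken from the RF-3DGS papers:

```python
import numpy as np

def constrained_densify(opacities, grads, M, c=5, eps_alpha=0.05,
                        grad_thresh=2e-4):
    """Sketch of sparse-data density control.

    Returns boolean masks over the Gaussians: which to prune (low opacity)
    and which to split (high view-space gradient), while honouring the
    cap N_max <= c * M on the total Gaussian count.
    """
    n = len(opacities)
    prune = opacities < eps_alpha            # raised threshold for sparse data
    n_max = c * M                            # degrees-of-freedom cap
    budget = max(0, n_max - int((~prune).sum()))
    candidates = np.where(~prune & (grads > grad_thresh))[0]
    # Split only the highest-gradient Gaussians that fit in the budget.
    split = np.zeros(n, dtype=bool)
    split[candidates[np.argsort(grads[candidates])[::-1][:budget]]] = True
    return prune, split
```

Pruning is applied before budgeting so that freed capacity can be reused by splits; a total-variation penalty on the rendered power map would be added separately in the training loss.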

Open Research Questions

Several fundamental questions remain open for Gaussian Splatting in RF imaging:

  1. Coherent vs incoherent rendering. Current RF-3DGS methods primarily predict power (incoherent). Extending to full complex channel prediction requires tracking phase through the splatting pipeline, which is complicated by the non-differentiability of phase wrapping.

  2. Dynamic environments. RF environments change as people move, doors open, and vehicles pass. Dynamic 3DGS (D-3DGS) exists for optics but has not been adapted for RF. The challenge is that RF changes are caused by objects at wavelength scale, not just geometric deformations.

  3. Scalability. Current methods work for single rooms or short road segments. Scaling to building-level or city-level requires hierarchical or tiled representations with efficient inter-tile communication.

  4. Theoretical guarantees. Unlike classical estimators (LASSO, OAMP) for which recovery guarantees exist (Chapters 14, 17), no analogous theory exists for Gaussian splatting reconstruction. When does the optimisation converge? How many measurements are sufficient?

  5. Integration with the forward model. Can the Gaussian representation be integrated directly into the unified forward model $\mathbf{y} = \mathbf{A}\mathbf{c} + \mathbf{w}$ of Chapter 7, replacing the voxel grid? What are the computational and statistical implications?
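For open question 1, a workaround used in other phase-regression settings (a sketch, not an established RF-3DGS technique) is to supervise the unit phasor $e^{j\phi}$ rather than the wrapped angle, which removes the $\pm\pi$ discontinuity from the loss:

```python
import numpy as np

def wrapped_phase_loss(phi_pred, phi_true):
    """Naive loss on wrapped angles: discontinuous wherever the error
    crosses +/- pi, which breaks gradient-based optimisation."""
    err = np.mod(phi_pred - phi_true + np.pi, 2 * np.pi) - np.pi
    return np.mean(err ** 2)

def complex_phase_loss(phi_pred, phi_true):
    """Smooth surrogate: compare unit phasors e^{j phi} instead of angles.
    |e^{ja} - e^{jb}|^2 = 2 - 2 cos(a - b) is differentiable everywhere
    and invariant to 2*pi wrapping."""
    diff = np.exp(1j * phi_pred) - np.exp(1j * phi_true)
    return np.mean(np.abs(diff) ** 2)
```

In an actual splatting pipeline this loss would be written in an autodiff framework; NumPy is used here only to illustrate that the phasor loss is wrap-invariant.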

Historical Note: Timeline of Neural Scene Representations for RF

2020--2024

The application of neural scene representations to RF is a very recent development:

  • 2020: Mildenhall et al. introduce NeRF for novel view synthesis in computer vision.
  • 2022: Zhao et al. adapt NeRF for RF propagation prediction (NeRF2, later RF-NeRF), demonstrating that implicit neural representations can model radio environments.
  • 2023: Kerbl et al. introduce 3D Gaussian Splatting at SIGGRAPH, achieving real-time rendering quality comparable to NeRF.
  • 2024: Zhang et al. (RF-3DGS), Chen et al. (RFCanvas), and Niedermayr et al. (RadarSplat) simultaneously adapt 3DGS for RF power prediction, multi-modal RF mapping, and automotive radar.
  • 2024: Dong et al. (GSpaRC) add physics-based propagation to Gaussian splatting for automotive radar.

The field has moved from concept to multiple practical systems in under two years, driven by the demand for fast, accurate RF environment models for 5G/6G network planning and autonomous driving.


Quick Check

For an indoor mmWave (28 GHz) environment with mostly metallic furniture and glass partitions, which representation would you expect to give the best reconstruction quality?

  • NeRF (RF-NeRF), because it handles complex volumetric scattering
  • 3DGS, because it offers real-time rendering
  • SDF (GeRaF), because specular scattering at mmWave is surface-dominated
  • Voxel grid, because it requires no training

⚠️ Engineering Note

Deployment Considerations for Neural RF Representations

Deploying neural scene representations for RF in production systems requires addressing several practical concerns:

  1. Model update frequency. Indoor environments change on the scale of minutes (people, furniture). Outdoor urban environments change on the scale of hours (traffic, weather). The model must be retrained or fine-tuned at these timescales. 3DGS is advantageous here: adding/removing a few Gaussians is faster than retraining an entire NeRF.

  2. Computational resources. 3DGS rendering requires a GPU. Edge deployment on base stations or vehicles may need inference on embedded GPUs (Jetson, mobile SoCs). Reducing the number of Gaussians (via GSpaRC sparsity) is critical for edge deployment.

  3. Standardisation. There is no standard format for neural RF scene representations. Interoperability between tools (Sionna, WinProp, RF-3DGS) requires either standardised export formats or adapter layers. The Open3D and PLY formats used by optical 3DGS are a starting point.
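As a starting point for interchange, an RF Gaussian set can be serialised to an ASCII PLY file. The attribute names below are an ad-hoc convention invented for this sketch, since no RF standard exists:

```python
import numpy as np

def export_gaussians_ply(path, means, scales, opacities, rf_gain_db):
    """Write RF Gaussians to an ASCII PLY file.

    means, scales: (N, 3) arrays; opacities, rf_gain_db: (N,) arrays.
    The 'rf_gain_db' property is a hypothetical extension of the
    PLY vertex attributes used by optical 3DGS tools.
    """
    n = len(means)
    props = ["x", "y", "z", "scale_x", "scale_y", "scale_z",
             "opacity", "rf_gain_db"]
    header = ["ply", "format ascii 1.0", f"element vertex {n}"]
    header += [f"property float {p}" for p in props]
    header.append("end_header")
    rows = np.hstack([means, scales,
                      opacities[:, None], rf_gain_db[:, None]])
    with open(path, "w") as f:
        f.write("\n".join(header) + "\n")
        for row in rows:
            f.write(" ".join(f"{v:.6f}" for v in row) + "\n")
```

A binary PLY or a compressed tiled format would be preferable at city scale, but ASCII keeps the example inspectable with any text editor.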

Practical Constraints
  • GPU required for real-time rendering (NVIDIA RTX or equivalent)

  • Model retraining needed when environment changes significantly

  • No standardised format for RF scene interchange


Explicit Scene Representation

A scene representation where the scene parameters (positions, shapes, attributes) are stored directly and can be read, modified, and rendered without evaluating a neural network. 3D Gaussian Splatting is an explicit representation; NeRF and SDF networks are implicit representations.

Related: Splatting, Analysis Through Synthesis

Key Takeaway

The choice between NeRF, 3DGS, and SDF for RF imaging depends on the application: 3DGS excels in speed and interpretability, making it ideal for real-time digital twins and dynamic scenes; NeRF handles volumetric diffuse scattering well; SDF is best for specular-dominated mmWave environments. Fundamental open questions remain around coherent rendering, dynamic environments, scalability to large scenes, and theoretical reconstruction guarantees. The field is evolving rapidly, and hybrid approaches that combine strengths of multiple representations are a promising direction.