RF-NeRF Variants

Beyond NeRF$^2$: Specialised RF Architectures

NeRF$^2$ demonstrated that neural radiance fields can model RF propagation, but it left several gaps: no multipath modelling, no explicit material properties, support for RSS only, and slow training. A burst of follow-up work in 2023--2024 addressed these gaps by specialising the architecture for specific RF modalities. This section surveys six important variants.


Definition:

WiNeRT: Wireless Neural Ray Tracing

WiNeRT (Orekondy et al., 2023) extends NeRF for wireless by incorporating multi-bounce ray tracing within the neural rendering framework:

  1. Primary rays are cast from Tx to Rx (as in NeRF$^2$).
  2. Reflection rays are generated at high-density surfaces using learned normal vectors: $\mathbf{d}_r = \mathbf{d}_i - 2(\mathbf{d}_i \cdot \hat{\mathbf{n}})\hat{\mathbf{n}}$.
  3. Diffraction rays are added at edges detected by density gradients.

The total received signal is:

\hat{S} = \sum_{p \in \mathcal{P}} w_p \cdot \hat{S}_p \cdot e^{-j 2\pi f \ell_p / c},

where $\mathcal{P}$ is the set of traced paths, $\ell_p$ the length of path $p$, $\hat{S}_p$ its predicted complex amplitude, and $w_p$ a learned path weight.

WiNeRT achieves 2--3 dB lower RSS prediction error than NeRF$^2$ in multipath-rich environments but increases training time by $3\times$ due to the multi-bounce ray tracing.
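The reflection rule and the coherent path sum above can be sketched in a few lines of NumPy. The path weights, amplitudes, and lengths below are hypothetical placeholders for quantities WiNeRT would learn or trace, not its actual implementation:

```python
import numpy as np

C = 3e8  # speed of light (m/s)

def reflect(d_i, n_hat):
    """Specular reflection of an incident direction about a unit normal:
    d_r = d_i - 2 (d_i . n_hat) n_hat."""
    return d_i - 2.0 * np.dot(d_i, n_hat) * n_hat

def coherent_path_sum(weights, amplitudes, lengths, f):
    """Coherent sum over traced paths with free-space phase e^{-j 2 pi f l / c}."""
    w = np.asarray(weights)
    a = np.asarray(amplitudes, dtype=complex)
    l = np.asarray(lengths)
    return np.sum(w * a * np.exp(-1j * 2 * np.pi * f * l / C))

# Hypothetical example: a ray bouncing off the floor, and a two-path sum at 5 GHz.
d_i = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)
n_hat = np.array([0.0, 0.0, 1.0])
d_r = reflect(d_i, n_hat)  # reflected direction points upward
s_hat = coherent_path_sum([1.0, 0.4], [1.0, 0.8], [10.0, 13.5], 5e9)
```

Because the phase term depends on the path length, even small length differences between paths produce constructive or destructive interference in the summed signal.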

Definition:

R-NeRF: RIS-Enabled RF Neural Fields

R-NeRF extends the RF-NeRF framework to environments with reconfigurable intelligent surfaces (RIS). The RIS is modelled as a controllable reflecting layer with element-wise phase shifts $\boldsymbol{\phi} \in [0, 2\pi)^{N_{\mathrm{RIS}}}$:

\hat{S}_{\mathrm{RIS}} = \sum_{i=1}^{N} T_i \alpha_i \, \rho_i \sum_{n=1}^{N_{\mathrm{RIS}}} e^{j\phi_n}\, G(\mathbf{x}_i, \mathbf{x}_n^{\mathrm{RIS}})\, G(\mathbf{x}_n^{\mathrm{RIS}}, \mathbf{x}_{\mathrm{rx}}),

where $T_i$, $\alpha_i$, and $\rho_i$ are the transmittance, opacity, and reflectivity of scene sample $i$, and $G$ denotes the free-space Green's function. The RIS phase configuration $\boldsymbol{\phi}$ becomes a controllable input to the neural field, enabling joint scene reconstruction and RIS optimisation.
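A direct transcription of the RIS rendering sum, assuming a scalar free-space Green's function $G(r) = e^{-jkr}/(4\pi r)$ and hypothetical scene samples:

```python
import numpy as np

def greens(x_a, x_b, k):
    """Scalar free-space Green's function G(r) = e^{-jkr} / (4 pi r)."""
    r = np.linalg.norm(np.asarray(x_a) - np.asarray(x_b))
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def ris_signal(scene_pts, T, alpha, rho, ris_pts, phi, x_rx, k):
    """Double sum over scene samples i and RIS elements n:
    scene sample -> RIS element -> receiver."""
    s = 0.0 + 0.0j
    for x_i, T_i, a_i, r_i in zip(scene_pts, T, alpha, rho):
        inner = sum(
            np.exp(1j * phi_n) * greens(x_i, x_n, k) * greens(x_n, x_rx, k)
            for x_n, phi_n in zip(ris_pts, phi)
        )
        s += T_i * a_i * r_i * inner
    return s
```

Since the phase vector `phi` enters the expression differentiably, the same forward model supports gradient-based RIS phase optimisation alongside scene reconstruction.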

Definition:

VoxelRF: Voxel-Based Acceleration

VoxelRF replaces the MLP with a dense or sparse voxel grid storing density $\sigma$ and features at each voxel. For a query point $\mathbf{x}$, trilinear interpolation retrieves the local density and feature, which a small MLP (1--2 layers) decodes into the signal contribution.

Advantage: Eliminating the deep MLP evaluation at each sample reduces inference time by $10\times$--$100\times$.

Disadvantage: Memory scales as $\mathcal{O}(N_x N_y N_z)$. For a $50 \times 50 \times 10$ m indoor scene at 5 cm resolution, the grid has $10^6 \times 200 = 2 \times 10^8$ voxels, requiring pruning or sparse storage.
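The trilinear lookup at the heart of VoxelRF can be sketched as follows (dense grid, no pruning; the small decoder MLP is omitted):

```python
import numpy as np

def trilinear(grid, x, voxel_size):
    """Trilinearly interpolate a dense feature grid at a continuous point.
    grid: (Nx, Ny, Nz, F) array of per-voxel features; x: (3,) position in metres."""
    u = np.asarray(x, dtype=float) / voxel_size  # continuous voxel coordinates
    i0 = np.floor(u).astype(int)
    t = u - i0                                   # fractional offsets in [0, 1)
    out = np.zeros(grid.shape[-1])
    # Blend the 8 corner voxels with weights that sum to 1.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                out += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return out
```

A 1--2 layer MLP would then decode the interpolated feature into the per-sample signal contribution; the expensive deep-MLP evaluation is replaced by eight array reads and a handful of multiplies.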

Definition:

NeRF-APT: NeRF for Access Point Localisation

NeRF-APT inverts the typical NeRF$^2$ workflow: instead of predicting RSS at known Rx locations given known Tx positions, it localises unknown Tx positions from RSS measurements at known Rx locations.

The approach treats the Tx position $\mathbf{x}_{\mathrm{tx}}$ as a learnable parameter and jointly optimises:

\min_{\theta, \{\mathbf{x}_{\mathrm{tx},k}\}} \sum_{k,j} \bigl|\hat{P}_{k,j}^{(\mathrm{dB})}(\theta, \mathbf{x}_{\mathrm{tx},k}) - P_{k,j}^{(\mathrm{dB})}\bigr|^2.

The scene representation $\theta$ and Tx positions are learned simultaneously, analogous to how COLMAP jointly estimates camera poses and scene structure in optical NeRF.
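The objective can be illustrated with a toy version that holds the scene model fixed and descends only on the Tx position. The log-distance path-loss model here is a stand-in for the learned scene representation $f_\theta$, not NeRF-APT's actual network:

```python
import numpy as np

def predict_rss_db(x_tx, x_rx, p0=-30.0, n=2.0):
    """Toy log-distance path-loss model standing in for the learned scene
    model (p0 and n are hypothetical parameters)."""
    d = np.linalg.norm(x_tx - x_rx, axis=-1)
    return p0 - 10.0 * n * np.log10(np.maximum(d, 1e-3))

def localise_tx(x_rx, rss_meas, x0, lr=0.01, steps=1000):
    """Gradient descent on the squared dB residual w.r.t. the Tx position
    (scene parameters held fixed in this sketch)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        d = np.linalg.norm(x - x_rx, axis=-1)
        resid = predict_rss_db(x, x_rx) - rss_meas
        # For n = 2: d/dx of -20 log10(d) is -(20 / ln 10) * (x - x_rx) / d^2
        grad_pred = -20.0 / np.log(10.0) * (x - x_rx) / (d ** 2)[:, None]
        x -= lr * 2.0 * np.sum(resid[:, None] * grad_pred, axis=0)
    return x
```

In the full method, the same residual also backpropagates into $\theta$, so scene structure and transmitter positions improve together.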

Definition:

DART: Doppler-Aided Radiance Transfer

DART (Huang et al., 2024) extends NeRF for radar by incorporating the Doppler dimension:

  1. Input: Radar returns $y(r, \theta, f_D)$ in range--angle--Doppler space, from multiple viewpoints.

  2. Scene MLP: Predicts radar cross-section $\sigma_{\mathrm{RCS}}(\mathbf{x})$ and velocity $\mathbf{v}(\mathbf{x})$ at each 3D point.

  3. Doppler rendering: The Doppler shift at point $\mathbf{x}$ along ray direction $\hat{\mathbf{d}}$ is:

f_D(\mathbf{x}) = \frac{2\, \mathbf{v}(\mathbf{x}) \cdot \hat{\mathbf{d}}}{\lambda},

and the rendered radar cube is:

\hat{y}(r, \theta, f_D) = \sum_{i: t_i \approx r} T_i \alpha_i \, \sigma_{\mathrm{RCS},i} \, \delta\!\bigl(f_D - f_{D,i}\bigr).

DART is particularly suited for automotive radar, where targets have distinct velocities. The Doppler dimension provides discrimination absent in communication-focused RF-NeRF variants.
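The Doppler rendering step can be sketched as a discrete accumulation into a range-Doppler grid; the nearest-bin assignment stands in for the delta function, and the sample data are hypothetical:

```python
import numpy as np

def doppler_shift(v, d_hat, wavelength):
    """Two-way Doppler shift f_D = 2 (v . d_hat) / lambda."""
    return 2.0 * np.dot(v, d_hat) / wavelength

def render_range_doppler(samples, r_bins, fd_bins, wavelength):
    """Accumulate T_i * alpha_i * sigma_RCS_i into the nearest
    (range, Doppler) cell. samples: list of (t_i, (T, alpha, rcs, v, d_hat))."""
    cube = np.zeros((len(r_bins), len(fd_bins)))
    for t_i, (T_i, a_i, rcs_i, v_i, d_hat) in samples:
        f_d = doppler_shift(v_i, d_hat, wavelength)
        ri = int(np.argmin(np.abs(r_bins - t_i)))   # nearest range bin
        fi = int(np.argmin(np.abs(fd_bins - f_d)))  # nearest Doppler bin
        cube[ri, fi] += T_i * a_i * rcs_i
    return cube
```

At a typical automotive wavelength of a few millimetres, a 10 m/s closing target lands several kilohertz away from a stationary one in the Doppler axis, which is the discrimination the text describes.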

Definition:

ISAR-NeRF: Neural Fields for Synthetic Aperture Radar

ISAR-NeRF adapts neural fields for coherent SAR / inverse SAR imaging. The forward model maps the 3D scene reflectivity $\rho(\mathbf{x})$ to the measured phase history:

s(u, f) = \int \rho(\mathbf{x})\, \exp\!\bigl(-j \tfrac{4\pi f}{c} \|\mathbf{x}_{\mathrm{ap}}(u) - \mathbf{x}\|\bigr)\, d\mathbf{x},

where $u$ is the aperture coordinate and $\mathbf{x}_{\mathrm{ap}}(u)$ the corresponding antenna position. ISAR-NeRF parameterises $\rho(\mathbf{x}) = f_\theta(\gamma(\mathbf{x}))$, with $\gamma$ a positional encoding, and minimises:

\mathcal{L}(\theta) = \sum_{u, f} \bigl|\hat{s}(u, f; \theta) - s_{\mathrm{meas}}(u, f)\bigr|^2 + \lambda_{\mathrm{TV}} \|\nabla \rho_\theta\|_1.

The MLP's spectral bias acts as an implicit regulariser, suppressing grating-lobe artifacts in sparse-aperture regimes.
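A discretised version of the phase-history forward model, with point scatterers replacing the continuous integral (the TV-regularised optimisation loop is omitted):

```python
import numpy as np

C = 3e8  # speed of light (m/s)

def phase_history(rho, pts, apertures, freqs):
    """Discretised forward model: point scatterers with reflectivity rho at
    positions pts, observed from each aperture position at each frequency.
    s(u, f) = sum_x rho(x) exp(-j 4 pi f / c * |x_ap(u) - x|)."""
    s = np.zeros((len(apertures), len(freqs)), dtype=complex)
    for ui, x_ap in enumerate(apertures):
        r = np.linalg.norm(pts - x_ap, axis=1)  # range to each scatterer
        for fi, f in enumerate(freqs):
            s[ui, fi] = np.sum(rho * np.exp(-1j * 4 * np.pi * f / C * r))
    return s
```

Training would compare this prediction against the measured phase history under the squared-error plus TV loss given above, with $\rho$ produced by the MLP rather than stored per point.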

Theorem: Material-Aware RF-NeRF

By parameterising the attenuation as $\alpha(\mathbf{x}, f) = \alpha_0(\mathbf{x}) + \alpha_1(\mathbf{x})\, f$, the learned coefficients can be interpreted as material properties. Define the complex permittivity map:

Ξ΅^r(x)=Ξ΅β€²(x)βˆ’jΞ±1(x)2πΡ0,\hat{\varepsilon}_r(\mathbf{x}) = \varepsilon'(\mathbf{x}) - j\frac{\alpha_1(\mathbf{x})}{2\pi \varepsilon_0},

where $\varepsilon'$ is related to $\alpha_0$ via the plane-wave attenuation formula. If the learned attenuation satisfies $\alpha(\mathbf{x}, f) \geq 0$ for all $f$ in the training bandwidth, then $\hat{\varepsilon}_r$ corresponds to a physically realisable (passive) material.

Example: Material Classification from Learned Attenuation

An RF-NeRF trained on 5 GHz Wi-Fi data learns attenuation coefficients at three locations. Classify the materials.

Location | $\alpha_0$ (Np/m) | $\alpha_1$ (Np/m/GHz)
A        | 0.1               | 0.005
B        | 2.5               | 0.15
C        | 0.8               | 0.04
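One way to work the example is a simple threshold rule on $\alpha_0$. The cut-offs and material labels below are illustrative assumptions for this exercise, not values given in the text:

```python
def classify_material(alpha0, alpha1):
    """Illustrative threshold rule; cut-offs and labels are assumptions."""
    if alpha0 < 0.3:
        return "air / thin drywall (low loss)"
    if alpha0 < 1.5:
        return "wood / glass (moderate loss)"
    return "concrete / brick (high loss, strong frequency dependence)"

# Applying the rule to the table:
# A (0.1, 0.005) -> low loss; B (2.5, 0.15) -> high loss; C (0.8, 0.04) -> moderate loss.
```

Under these assumed thresholds, location A behaves like a nearly transparent material, B like heavy masonry, and C like an intermediate dielectric.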

RF-NeRF Variant Comparison

Method    | Input Data          | Multipath          | Special Feature             | Training Time
NeRF$^2$  | RSS                 | No (single ray)    | Foundational method         | $\sim 30$ min
WiNeRT    | RSS/CSI             | Yes (multi-bounce) | Differentiable ray tracing  | $\sim 90$ min
R-NeRF    | RSS                 | Partial            | RIS-aware rendering         | $\sim 45$ min
VoxelRF   | RSS/CSI             | No                 | $10\times$ faster inference | $\sim 10$ min
NeRF-APT  | RSS                 | No                 | Joint Tx localisation       | $\sim 60$ min
DART      | Range-Doppler cubes | Learned correction | RCS + velocity fields       | $\sim 20$ min
ISAR-NeRF | Phase history       | Not modelled       | Implicit regularisation     | $\sim 15$ min

Hash Grid Encoding for Faster RF-NeRF Training

Instant-NGP's multi-resolution hash encoding, when applied to RF-NeRF, provides three specific benefits:

  1. Large spatial extent: RF scenes (rooms, buildings) span meters to tens of meters, requiring many frequency bands in positional encoding. Hash encoding handles large scenes natively.

  2. Tolerable collisions: RF propagation is smoother than optical radiance (fewer high-frequency details), so hash collisions are less damaging.

  3. Memory efficiency: Storage scales as $\mathcal{O}(T)$ where $T$ is the hash table size, independent of scene volume.

Typical speedup: $10\times$--$50\times$ compared to MLP-based positional encoding, with comparable or better accuracy.
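A minimal multi-resolution hash encoding in the spirit of Instant-NGP: each level hashes the eight corner voxel indices into a feature table and blends them trilinearly. The level count, table size, and use of arbitrary-precision Python integers (rather than uint64 wraparound) are simplifications for illustration:

```python
import numpy as np

def spatial_hash(idx, table_size):
    """XOR-of-primes spatial hash of integer voxel indices (Instant-NGP style)."""
    h = 0
    for i, prime in zip(idx, (1, 2654435761, 805459861)):
        h ^= int(i) * prime
    return h % table_size

def encode(x, tables, base_res=16, growth=2.0):
    """Concatenate trilinearly blended hashed features across resolution levels.
    x is assumed normalised to [0, 1)^3; tables is a list of (T, F) arrays."""
    feats = []
    for lvl, table in enumerate(tables):
        res = int(base_res * growth ** lvl)   # grid resolution at this level
        u = np.asarray(x, dtype=float) * res
        i0 = np.floor(u).astype(int)
        t = u - i0
        f = np.zeros(table.shape[1])
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    w = ((t[0] if dx else 1 - t[0]) *
                         (t[1] if dy else 1 - t[1]) *
                         (t[2] if dz else 1 - t[2]))
                    corner = (i0[0] + dx, i0[1] + dy, i0[2] + dz)
                    f += w * table[spatial_hash(corner, len(table))]
        feats.append(f)
    return np.concatenate(feats)
```

Memory is fixed by the number of levels, table size, and feature width, independent of how many metres the scene spans, which is exactly the property that suits room- and building-scale RF scenes.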

Historical Note: The Rapid Evolution of RF-NeRF (2023--2025)


The adaptation of NeRF for RF propagation happened remarkably quickly. NeRF$^2$ appeared in early 2023, WiNeRT later that year, and by 2024 a dozen specialised variants existed for channel modelling, localisation, radar, and SAR. This speed mirrors the optical NeRF explosion of 2020--2022 and reflects the machine learning community's ability to rapidly port successful ideas across domains.

The key enabler was the realisation that volume rendering --- the mathematical core of NeRF --- is agnostic to the physical quantity being rendered. Replacing colour with complex RF reflectivity is a change to the output layer and loss function, not to the fundamental architecture.


Common Mistake: Ignoring Multipath in Dense Indoor Environments

Mistake:

Using a single-ray RF-NeRF (NeRF$^2$ or VoxelRF) for indoor environments with significant non-line-of-sight propagation.

Correction:

In indoor environments, multipath contributes 30--50% of received power. For NLOS scenarios, use multi-bounce methods like WiNeRT or add a learned multipath correction network. Alternatively, accept the accuracy trade-off and use NeRF$^2$ only for LOS-dominated scenarios (outdoor urban, corridors).

Quick Check

What additional physical quantity does DART's scene MLP predict compared to NeRF$^2$?

Surface normal vectors

Object velocity $\mathbf{v}(\mathbf{x})$

Material permittivity

Temperature

Key Takeaway

The RF-NeRF ecosystem has diversified rapidly: WiNeRT captures multipath via multi-bounce ray tracing; R-NeRF incorporates controllable RIS elements; DART adds Doppler for radar; ISAR-NeRF enables sparse-aperture SAR reconstruction. The common thread is differentiable volume rendering with physics-specific output layers (complex reflectivity, velocity, material properties). Hash grid encoding provides $10\times$--$50\times$ training speedup across all variants.