GeRaF: Geometry from mmWave Radar

From NeRF to GeRaF: The Lensless Imaging Challenge

NeRF (Chapter 24) reconstructs 3D scenes from posed camera images by differentiating through volumetric rendering. The key insight is that each pixel in a camera image corresponds to a single ray through the scene --- the camera lens provides a one-to-one mapping from directions to pixels.

Radar has no lens. Each radar measurement integrates contributions from all scatterers within the antenna beam, weighted by the round-trip propagation delay and beamforming gain. This lensless imaging model means that NeRF's ray-based sampling is not directly applicable. GeRaF (Lu et al., 2024) resolves this by working with matched-filter (MF) power images instead of raw measurements: the MF concentrates energy at the correct spatial locations, producing a pseudo-image that can be rendered differentiably through the neural SDF.


Definition:

Matched-Filter Power Image

Given radar measurements $\mathbf{y}$ and the sensing matrix $\mathbf{A}$, the matched-filter (MF) image is (cf. Chapter 13):

$$\hat{\mathbf{c}}^{\text{BP}} = \mathbf{D}^{-1} \mathbf{A}^{H} \mathbf{y},$$

where $\mathbf{D} = \text{diag}(\mathbf{A}^{H} \mathbf{A})$ normalises the columns. The MF power image is the squared magnitude:

$$P_{\text{MF}}(\mathbf{p}_{q}) = |\hat{c}^{\text{BP}}_q|^2, \qquad q = 1, \ldots, Q.$$

The MF power image concentrates energy at scatterer locations but has limited resolution (governed by the point spread function) and contains sidelobes. GeRaF treats $P_{\text{MF}}$ as the "observation" that the neural SDF must explain.
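The MF image is a single normalised back-projection, so it is cheap to compute. A minimal numpy sketch, with an illustrative random sensing matrix and a single scatterer (sizes and the scatterer index are assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

M, Q = 64, 100  # measurements, voxels (illustrative sizes)
A = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))  # sensing matrix

# Ground truth: a single unit scatterer at voxel 42
c_true = np.zeros(Q, dtype=complex)
c_true[42] = 1.0
y = A @ c_true  # noiseless measurements

# MF image: c_hat_q = a_q^H y / ||a_q||^2, i.e. D^{-1} A^H y with D = diag(A^H A)
D = np.real(np.sum(np.conj(A) * A, axis=0))  # column energies ||a_q||^2
c_hat = (A.conj().T @ y) / D
P_MF = np.abs(c_hat) ** 2  # MF power image

print(int(np.argmax(P_MF)))  # energy concentrates at the true voxel: 42
```

The column normalisation makes the peak value exactly 1 at the true scatterer; the remaining entries are the sidelobes of the point spread function.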

Definition:

GeRaF Scene Model

GeRaF represents the scene with two neural networks:

  1. Geometry network (neural SDF): $f_\theta(\mathbf{p}) \in \mathbb{R}$, with the surface at $\{\mathbf{p} : f_\theta(\mathbf{p}) = 0\}$.

  2. Reflectivity network: $\Gamma_\phi(\mathbf{p}) \in \mathbb{R}_+$, predicting the surface reflectivity at each point.

Both networks share a positional encoding $\gamma(\mathbf{p})$ and are trained jointly. The SDF determines where scattering occurs (the surface); the reflectivity network determines how strongly the surface scatters.
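A toy sketch of this structure: a shared sin/cos positional encoding feeding two heads, one scalar SDF head and one non-negative reflectivity head. The encoding length, the single linear layer per head, and the softplus non-negativity are illustrative simplifications, not the architecture from the paper:

```python
import numpy as np

L = 4  # number of frequency bands in the positional encoding (assumed)

def gamma(p):
    """Shared positional encoding gamma(p): sin/cos at octave frequencies."""
    freqs = 2.0 ** np.arange(L) * np.pi
    angles = np.outer(freqs, p).ravel()  # (L*3,) for a 3D point
    return np.concatenate([np.sin(angles), np.cos(angles)])

rng = np.random.default_rng(1)
d = 2 * L * 3  # encoding dimension
W_geo = rng.standard_normal((1, d))
W_ref = rng.standard_normal((1, d))

def f_theta(p):
    """Geometry head: unconstrained scalar (candidate SDF value)."""
    return float(W_geo @ gamma(p))

def Gamma_phi(p):
    """Reflectivity head: non-negative via softplus."""
    return float(np.log1p(np.exp(W_ref @ gamma(p))))
```

In practice both heads would be multi-layer MLPs; the point here is only the shared-encoding, two-output structure.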

Definition:

RF Volumetric Rendering for Lensless Imaging

GeRaF renders the MF power image at voxel $\mathbf{p}_{q}$ by integrating along the line from each Tx--Rx pair through $\mathbf{p}_{q}$. The predicted MF power is:

$$\hat{P}_{\text{MF}}(\mathbf{p}_{q}; \theta, \phi) = \sum_{m=1}^{N_{\text{Tx}}} \sum_{n=1}^{N_{\text{Rx}}} w_{mn}(\mathbf{p}_{q})\,\Gamma_\phi(\mathbf{p}_{q})\,\delta_\sigma(f_\theta(\mathbf{p}_{q})),$$

where:

  • $w_{mn}(\mathbf{p}_{q})$ encodes the beamforming gain and propagation loss for the $(m,n)$ Tx--Rx pair at voxel $q$,
  • $\delta_\sigma(s) = \frac{1}{\sigma}\exp(-s^2/(2\sigma^2))$ is a smooth approximation to the Dirac delta, concentrating the scattering contribution near the surface $f_\theta = 0$.

The parameter $\sigma$ controls the "surface thickness": as $\sigma \to 0$, scattering concentrates on the zero level set.
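The rendering equation can be sketched directly. The following uses the exact sphere SDF as a stand-in for $f_\theta$ and uniform unit weights $w_{mn}$ (both simplifying assumptions) to show how the smooth delta concentrates predicted power on the surface:

```python
import numpy as np

def delta_sigma(s, sigma=0.05):
    """Smooth Dirac approximation: (1/sigma) exp(-s^2 / (2 sigma^2))."""
    return np.exp(-s**2 / (2 * sigma**2)) / sigma

def sdf_sphere(p, R=0.1):
    """Stand-in for the geometry network f_theta: exact sphere SDF."""
    return np.linalg.norm(p) - R

def render_mf_power(p_q, Gamma=1.0, n_tx=4, n_rx=4, sigma=0.05):
    """Predicted MF power at voxel p_q, with uniform unit weights w_mn."""
    s = sdf_sphere(p_q)
    return n_tx * n_rx * Gamma * delta_sigma(s, sigma)

on_surface = render_mf_power(np.array([0.1, 0.0, 0.0]))   # on the zero level set
off_surface = render_mf_power(np.array([0.3, 0.0, 0.0]))  # 0.2 m off the surface
print(on_surface > off_surface)  # power concentrates on the surface
```

Shrinking `sigma` sharpens this concentration, at the cost of a harder optimisation landscape early in training.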

The key difference from NeRF: in NeRF, each camera pixel corresponds to one ray, and density $\sigma(\mathbf{p})$ is integrated along that ray. In GeRaF, each "pixel" in the MF power image already aggregates contributions from all Tx--Rx pairs, and the rendering maps the neural SDF to this aggregated quantity.

Theorem: GeRaF Training Objective

The GeRaF training loss is:

$$\mathcal{L}(\theta, \phi) = \frac{1}{Q}\sum_{q=1}^{Q} \bigl(P_{\text{MF}}(\mathbf{p}_{q}) - \hat{P}_{\text{MF}}(\mathbf{p}_{q}; \theta, \phi)\bigr)^2 + \lambda_{\text{eik}}\,\mathcal{L}_{\text{eik}}(\theta),$$

where the Eikonal regularizer is

$$\mathcal{L}_{\text{eik}}(\theta) = \frac{1}{|\mathcal{P}|} \sum_{\mathbf{p} \in \mathcal{P}} \bigl(\|\nabla f_\theta(\mathbf{p})\| - 1\bigr)^2,$$

with $\mathcal{P}$ a set of random 3D sample points.
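A Monte-Carlo sketch of the Eikonal loss, using central finite differences in place of automatic differentiation (an assumption for self-containedness). A true SDF has unit gradient norm everywhere, so its loss is near zero; a scaled field violates the constraint:

```python
import numpy as np

def grad_norm(f, p, h=1e-4):
    """||grad f(p)|| via central finite differences."""
    g = np.array([(f(p + h * e) - f(p - h * e)) / (2 * h) for e in np.eye(3)])
    return np.linalg.norm(g)

def eikonal_loss(f, points):
    """Mean squared deviation of the gradient norm from 1."""
    return np.mean([(grad_norm(f, p) - 1.0) ** 2 for p in points])

rng = np.random.default_rng(2)
pts = rng.uniform(-1, 1, size=(256, 3))  # random sample set P

sdf = lambda p: np.linalg.norm(p) - 0.1          # true sphere SDF: ||grad|| = 1
field = lambda p: 3.0 * (np.linalg.norm(p) - 0.1)  # scaled field: ||grad|| = 3

print(eikonal_loss(sdf, pts))    # ~0: valid SDF
print(eikonal_loss(field, pts))  # ~(3-1)^2 = 4: not a distance function
```

In a real training loop the gradient $\nabla f_\theta$ comes from autodiff, but the loss itself is exactly this scalar penalty averaged over random points.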

The first term enforces data fidelity: the rendered power must match the observed MF power. The second term enforces geometric validity: the neural network must output a valid SDF.

Without the Eikonal regulariser, the network could satisfy the data term by producing an arbitrary scalar field that happens to match the power image but has no geometric meaning. The Eikonal loss constrains the output to be a distance function, ensuring that the zero level set defines a meaningful surface.

GeRaF Pipeline: MF Power to SDF Reconstruction

Visualise the GeRaF reconstruction pipeline. The left panel shows the input MF power image; the middle panel shows the evolving neural SDF during training; the right panel shows the extracted zero level set. Adjust the number of training iterations and the Eikonal regularisation weight.

Parameters: training iterations = 200, Eikonal weight $\lambda_{\text{eik}} = 0.1$.

MF Power Image Resolution vs. SDF Reconstruction

Compare the MF power image (limited by the PSF resolution) with the SDF-based surface reconstruction. Observe how the neural SDF recovers sharper geometry than the MF image alone by leveraging the Eikonal constraint and the surface prior.


Example: GeRaF Reconstruction of a Spherical Scatterer

A spherical scatterer of radius $R = 0.1$ m is located at the origin. A $4 \times 4$ mmWave MIMO array at distance $2$ m operates at $f_0 = 60$ GHz with $W = 4$ GHz. Compute the MF power image and show how the GeRaF loss drives the neural SDF toward the true sphere SDF.
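Some back-of-envelope numbers for this example: the MF range resolution $c/(2W)$ sets the scale of the point spread function the neural SDF must sharpen, and the target the geometry network should converge to is the exact sphere SDF $\|\mathbf{p}\| - R$:

```python
import numpy as np

c0 = 3e8       # speed of light (m/s)
W = 4e9        # bandwidth (Hz)
f0 = 60e9      # carrier frequency (Hz)
R = 0.1        # sphere radius (m)

range_res = c0 / (2 * W)   # MF range resolution: 3.75 cm, comparable to R
wavelength = c0 / f0       # carrier wavelength: 5 mm

def sdf_true(p):
    """Exact SDF of the sphere: the target for f_theta."""
    return np.linalg.norm(p) - R

print(range_res)                             # 0.0375
print(sdf_true(np.array([0.1, 0.0, 0.0])))   # 0.0 on the surface
```

Since the 3.75 cm range cell is not much smaller than the 10 cm sphere, the raw MF image blurs the surface; the Eikonal-regularised SDF is what recovers the sharper geometry.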

🎓 CommIT Contribution (2024)

GeRaF: Geometry Reconstruction from mmWave Radar

Y. Lu, G. Caire β€” IEEE International Conference on Communications (ICC)

Lu and Caire introduced GeRaF, the first method to reconstruct 3D geometry from mmWave radar using neural signed distance functions. The key insight is that matched-filter power images --- readily computable from standard radar measurements --- serve as a differentiable "pseudo-image" that bridges the gap between lensless radar and lens-based camera imaging (NeRF).

GeRaF jointly estimates geometry (via a neural SDF) and surface reflectivity from multi-view radar data, demonstrating that sub-wavelength surface recovery is possible when the Eikonal constraint is enforced. This work establishes the SDF representation as the foundation for neural RF scene reconstruction, extended in Chapters 26 (Gaussian splatting) and later chapters.

⚠️ Engineering Note

Computational Cost of GeRaF Training

GeRaF training involves: (1) evaluating the neural SDF at sampled 3D points, (2) computing the rendered MF power via the lensless rendering equation, (3) computing the Eikonal loss via automatic differentiation of the SDF gradient, and (4) backpropagating through all three.

For a scene discretised into $Q = 128^3 \approx 2 \times 10^6$ voxels, each training iteration requires $O(Q)$ forward passes through the MLP plus $O(Q)$ gradient computations for the Eikonal term. On a single GPU (NVIDIA A100), training typically converges in $\sim 5$ minutes for simple scenes and $\sim 30$ minutes for complex indoor environments.

Practical Constraints

  • GPU memory: the Eikonal gradient computation requires storing intermediate activations.
  • Batch size: limited by GPU memory; a typical batch is 4096 points.
  • MLP depth: deeper networks (8 layers) improve quality but slow training.

Common Mistake: Omitting Eikonal Regularization in GeRaF

Mistake:

Training the GeRaF model without the Eikonal regulariser ($\lambda_{\text{eik}} = 0$), expecting the data fidelity term alone to produce a valid SDF.

Correction:

Without Eikonal regularisation, the geometry network produces an arbitrary scalar field that fits the MF power but has no distance-function structure. The zero level set becomes noisy and disconnected, with self-intersecting surfaces. Even a small regularisation weight ($\lambda_{\text{eik}} = 0.01$) dramatically improves surface quality.

Why This Matters: Matched-Filter Power Images in ISAC Systems

In an ISAC system (Chapter 29), the communication waveform doubles as a sensing waveform. The matched-filter power image can be computed as a byproduct of channel estimation, with no additional pilot overhead. GeRaF's ability to reconstruct 3D geometry from these MF power images means that ISAC systems can provide environment maps --- enabling beam prediction, blockage forecasting, and handover optimisation --- using only the standard communication signalling.

See full treatment in ISAC Fundamentals

Key Takeaway

GeRaF bridges the gap between NeRF (camera-based) and radar imaging by using matched-filter power images as a differentiable pseudo-observation. Joint optimisation of a neural SDF (geometry) and a reflectivity network, regularised by the Eikonal constraint, recovers 3D surfaces from multi-view mmWave radar data.

Sphere Tracing on an SDF for RF Reconstruction

The sphere-tracing algorithm marches along a ray, stepping by the SDF value at each point. Circles show the guaranteed empty balls: each step is the largest safe advance without crossing the surface. This is the core rendering primitive in GeRaF --- converting an SDF into rendered power images that are compared against matched-filter observations during end-to-end training.
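The marching loop described above is a few lines. A minimal sketch against the exact sphere SDF (the ray origin and direction are chosen to match the spherical-scatterer example; step count and tolerance are illustrative):

```python
import numpy as np

def sdf(p, R=0.1):
    """Sphere SDF: distance to a sphere of radius R at the origin."""
    return np.linalg.norm(p) - R

def sphere_trace(origin, direction, tol=1e-6, max_steps=128):
    """March along the ray, stepping by the SDF value at each point."""
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < tol:
            return t   # hit: within tol of the surface
        t += d         # safe step: a ball of radius d is guaranteed empty
    return None        # miss: ray never approached the surface

# Ray from the array position (2 m away) straight at the sphere
t_hit = sphere_trace(np.array([2.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]))
print(t_hit)  # ~1.9: the surface sits at x = R = 0.1
```

Each iteration advances by exactly the current SDF value, so the march can never overshoot the surface of a valid distance function; this is why the Eikonal constraint matters for rendering, not just for geometry.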