Physics-Informed and Equivariant Networks

Why Embed Physics in Neural Networks?

The networks we have seen so far (U-Nets, unfolded algorithms, diffusion models) learn from data alone. When data is scarce or the problem has known physical structure, we can do better by embedding the governing equations directly into the network architecture or loss function. This section covers three approaches:

  1. PINNs: Enforce PDEs (Helmholtz, Maxwell) as soft constraints in the loss.
  2. Equivariant networks: Build symmetries (rotation, translation) into the architecture.
  3. Fourier Neural Operators: Learn resolution-independent PDE solution operators in the Fourier domain.

Definition:

Physics-Informed Neural Network (PINN)

A Physics-Informed Neural Network trains a neural network $u_\theta(\mathbf{r}, t)$ to approximate the solution of a PDE by incorporating the PDE residual into the loss:

$$\mathcal{L}(\theta) = \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d} \left|u_\theta(\mathbf{r}_i, t_i) - u_i^{\text{data}}\right|^2}_{\text{data loss}} + \underbrace{\frac{\lambda_{\text{PDE}}}{N_c}\sum_{j=1}^{N_c} \left|\mathcal{F}[u_\theta](\mathbf{r}_j, t_j)\right|^2}_{\text{PDE residual loss}},$$

where $\mathcal{F}[u] = 0$ is the PDE (e.g., the Helmholtz equation), $\{(\mathbf{r}_i, t_i, u_i^{\text{data}})\}$ are the observed data points, and $\{(\mathbf{r}_j, t_j)\}$ are collocation points where the PDE is enforced.

The PDE residual is computed using automatic differentiation: spatial and temporal derivatives of $u_\theta$ are exact, not finite-difference approximations.

PINNs are mesh-free: no discretisation grid is needed. The collocation points can be sampled adaptively, concentrating in regions where the residual is large.
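
To make the autodiff point concrete, here is a minimal PyTorch sketch (the network shape and sizes are arbitrary illustrative choices) of computing exact derivatives of $u_\theta$ at freely sampled collocation points, here for a 1D homogeneous Helmholtz residual:

```python
import torch
import torch.nn as nn

# Illustrative smooth network u_theta(x); tanh keeps higher derivatives well-behaved
u_theta = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

# Mesh-free collocation points: sampled anywhere in the domain, no grid required
x = torch.rand(256, 1, requires_grad=True)
u = u_theta(x)

# Exact first derivative du/dx (create_graph=True allows differentiating again)
(u_x,) = torch.autograd.grad(u.sum(), x, create_graph=True)

# Exact second derivative d2u/dx2, as needed for a Helmholtz residual
(u_xx,) = torch.autograd.grad(u_x.sum(), x, create_graph=True)

kappa = 2 * torch.pi                # one wavelength across the unit domain
residual = u_xx + kappa**2 * u      # 1D homogeneous Helmholtz residual
loss_pde = (residual ** 2).mean()   # differentiable with respect to theta
```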

Definition:

PINN for the Helmholtz Equation

For time-harmonic RF wave propagation, the governing equation is the Helmholtz equation:

$$\nabla^2 u(\mathbf{r}) + \kappa^2\,\epsilon_r(\mathbf{r})\,u(\mathbf{r}) = -s(\mathbf{r}),$$

where $u$ is the total field, $\kappa = 2\pi/\lambda$ is the wavenumber, $\epsilon_r$ is the relative permittivity, and $s$ is the source term.

A PINN for the Helmholtz equation parameterises $u_\theta(\mathbf{r})$ (complex-valued) and minimises

$$\mathcal{L} = \frac{1}{N_d}\sum_i \left|u_\theta(\mathbf{r}_i) - u_i^{\text{meas}}\right|^2 + \frac{\lambda}{N_c}\sum_j \left|\nabla^2 u_\theta(\mathbf{r}_j) + \kappa^2 \epsilon_r(\mathbf{r}_j)\,u_\theta(\mathbf{r}_j) + s(\mathbf{r}_j)\right|^2.$$

For the inverse problem (unknown $\epsilon_r$), both $u_\theta$ and $\epsilon_r$ (or a neural parameterisation thereof) are jointly optimised.
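
A minimal PyTorch sketch of this residual (the function names, the two-channel real/imaginary field representation, and the real-valued $\epsilon_r$ are all illustrative assumptions, not a fixed API):

```python
import torch

def laplacian(u, r):
    """Exact Laplacian of a scalar field u at points r, via nested autodiff."""
    (grad_u,) = torch.autograd.grad(u.sum(), r, create_graph=True)
    lap = torch.zeros_like(u)
    for d in range(r.shape[1]):
        lap = lap + torch.autograd.grad(grad_u[:, d].sum(), r, create_graph=True)[0][:, d]
    return lap

def helmholtz_residual(u_net, eps_net, r, kappa, source):
    """Residual of nabla^2 u + kappa^2 * eps_r * u + s at collocation points r.

    r must have requires_grad=True; the complex field u is stored as two real
    channels [Re u, Im u], and eps_r is assumed real (lossless medium).
    """
    u = u_net(r)                   # shape (N, 2)
    eps = eps_net(r).squeeze(-1)   # shape (N,)
    s = source(r)                  # shape (N, 2)
    res = [laplacian(u[:, c], r) + kappa**2 * eps * u[:, c] + s[:, c] for c in (0, 1)]
    return torch.stack(res, dim=-1)
```

For the inverse problem, `eps_net` is a second trainable network; a full training loop built on this residual is sketched in the algorithm at the end of this section.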

Example: PINN for 2D Inverse Scattering

Set up a PINN to recover the permittivity distribution $\epsilon_r(\mathbf{r})$ of an unknown object from scattered-field measurements at $N_d = 50$ receiver locations, using $N_{\text{inc}} = 8$ incident plane waves at frequency $f = 3$ GHz.

Common Mistake: PINN Spectral Bias

Mistake:

Using a standard MLP (without positional encoding or Fourier features) for a PINN solving a high-frequency Helmholtz equation, and finding the network learns only the low-frequency components.

Correction:

Standard MLPs suffer from spectral bias: they learn low-frequency components much faster than high-frequency ones. For the Helmholtz equation at frequency $f$, the solution oscillates at the spatial scale $\lambda = c/f$; if $\lambda$ is small relative to the domain, a plain MLP fails to learn the oscillations in any practical training time.

Solutions:

  • Fourier feature encoding: $\gamma(\mathbf{r}) = [\sin(\mathbf{B}\mathbf{r}), \cos(\mathbf{B}\mathbf{r})]$ with $\mathbf{B}$ sampled from $\mathcal{N}(0, \sigma^2)$, where $\sigma \propto 1/\lambda$ (see the sketch after this list).
  • Multi-scale architecture: Separate networks for different frequency bands.
  • Curriculum training: Start with low frequency and gradually increase.
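
A minimal sketch of the first remedy (hyperparameter values are illustrative; the construction follows the random Fourier features defined above):

```python
import torch
import torch.nn as nn

class FourierFeatures(nn.Module):
    """Encoding gamma(r) = [sin(Br), cos(Br)] with fixed random B ~ N(0, sigma^2)."""
    def __init__(self, in_dim, n_features, sigma):
        super().__init__()
        # B is sampled once and frozen; sigma sets the bandwidth (sigma ~ 1/lambda)
        self.register_buffer("B", sigma * torch.randn(n_features, in_dim))

    def forward(self, r):
        proj = r @ self.B.T                                   # (N, n_features)
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)

# Prepend the encoding so the MLP sees oscillatory features at the right scale
encoding = FourierFeatures(in_dim=2, n_features=128, sigma=10.0)
u_theta = nn.Sequential(encoding,
                        nn.Linear(256, 64), nn.Tanh(),
                        nn.Linear(64, 2))   # two channels: [Re u, Im u]
```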

Definition:

Equivariance and Equivariant Networks

A function $f \colon \mathcal{X} \to \mathcal{Y}$ is equivariant to a group $G$ of transformations if applying $T \in G$ to the input produces a corresponding transformation $\rho_T$ of the output:

$$f(T \cdot \mathbf{x}) = \rho_T \cdot f(\mathbf{x}) \quad \text{for all } T \in G.$$

Special case (invariance): $\rho_T = \mathrm{id}$ for all $T$, i.e., the output is unchanged by the input transformation.

Equivariant neural networks build symmetries into the architecture:

| Symmetry group | Transformation | Architecture |
|---|---|---|
| $\mathbb{Z}^2$ | Spatial shifts | Standard CNN |
| $SO(2)$ | 2D rotation | Steerable CNN |
| $SO(3)$ | 3D rotation | Spherical CNN |
| $E(3)$ | Euclidean (rotation + translation) | EGNN, PaiNN |

The key idea: replace the standard convolution kernel (defined on $\mathbb{R}^2$) with a kernel on the group $G$, ensuring that the output transforms predictably under group actions.
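
The $\mathbb{Z}^2$ row of the table can be verified numerically in a few lines; the sketch below assumes circular padding so that shifts wrap around and the equivariance is exact, and checks that shifting the input of a convolution shifts its output identically:

```python
import torch
import torch.nn as nn

# Standard conv layer; circular padding makes translation equivariance exact
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1, padding_mode="circular")

x = torch.randn(1, 1, 32, 32)
shift = (5, -3)

shift_then_conv = conv(torch.roll(x, shifts=shift, dims=(-2, -1)))
conv_then_shift = torch.roll(conv(x), shifts=shift, dims=(-2, -1))

# f(T x) == T f(x): here rho_T is the same shift T applied to the output
print(torch.allclose(shift_then_conv, conv_then_shift, atol=1e-5))  # expected: True
```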

For RF imaging, the relevant symmetries include rotation equivariance for SAR (the scene reflectivity should not depend on the orientation of the imaging geometry) and translation equivariance (which standard CNNs already provide).

Definition:

Fourier Neural Operator (FNO)

The Fourier Neural Operator learns a mapping between function spaces, $\mathcal{G}_\theta \colon a(\mathbf{r}) \mapsto u(\mathbf{r})$, by parameterising the integral kernel in the Fourier domain:

$$(\mathcal{K}_\theta v)(\mathbf{r}) = \mathcal{F}^{-1}\!\left[\mathbf{R}_\theta \cdot \mathcal{F}[v]\right](\mathbf{r}),$$

where $\mathcal{F}$ is the Fourier transform and $\mathbf{R}_\theta \in \mathbb{C}^{d_v \times d_v \times k_{\max}}$ is a learnable weight tensor applied to the first $k_{\max}$ Fourier modes. The full FNO layer is

$$v^{(\ell+1)}(\mathbf{r}) = \sigma\!\left(\mathbf{W}^{(\ell)} v^{(\ell)}(\mathbf{r}) + \bigl(\mathcal{K}_\theta^{(\ell)} v^{(\ell)}\bigr)(\mathbf{r})\right).$$

Cost: $O(N \log N)$ per layer, dominated by the FFT. Resolution independence: the Fourier modes are truncated at $k_{\max}$, independent of the discretisation size $N$.

The FNO is the neural-operator analogue of a spectral method in numerical PDEs. For the Helmholtz equation, where convolution with the Green's function becomes a simple multiplication in the Fourier domain, the FNO is a particularly natural architecture.
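
A minimal PyTorch sketch of the spectral convolution $\mathcal{K}_\theta$ and the full layer (the layer sizes and initialisation scale are illustrative choices; grids with $H, W \ge 2k_{\max}$ are assumed):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralConv2d(nn.Module):
    """Mix the lowest k_max Fourier modes with learned complex weights (K_theta)."""
    def __init__(self, channels, k_max):
        super().__init__()
        scale = 1.0 / channels
        # One complex channels-by-channels matrix per retained 2D mode; two blocks
        # cover the positive and negative frequencies along the first axis of rfft2
        self.w_pos = nn.Parameter(scale * torch.randn(channels, channels, k_max, k_max, dtype=torch.cfloat))
        self.w_neg = nn.Parameter(scale * torch.randn(channels, channels, k_max, k_max, dtype=torch.cfloat))
        self.k_max = k_max

    def forward(self, v):                       # v: (batch, channels, H, W)
        H, W = v.shape[-2:]
        v_hat = torch.fft.rfft2(v)              # (batch, channels, H, W//2 + 1)
        out = torch.zeros_like(v_hat)
        k = self.k_max
        out[:, :, :k, :k] = torch.einsum("bixy,ioxy->boxy", v_hat[:, :, :k, :k], self.w_pos)
        out[:, :, -k:, :k] = torch.einsum("bixy,ioxy->boxy", v_hat[:, :, -k:, :k], self.w_neg)
        return torch.fft.irfft2(out, s=(H, W))  # back to the spatial grid

class FNOLayer(nn.Module):
    """One FNO layer: v -> sigma(W v + K_theta v), with W a pointwise linear map."""
    def __init__(self, channels, k_max):
        super().__init__()
        self.spectral = SpectralConv2d(channels, k_max)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, v):
        return F.gelu(self.pointwise(v) + self.spectral(v))
```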

Theorem: Universal Approximation for Neural Operators

Let $\mathcal{G}^\dagger \colon \mathcal{A} \to \mathcal{U}$ be a continuous operator between Banach spaces of functions on a bounded domain $\Omega$. For any compact set $K \subset \mathcal{A}$ and any $\epsilon > 0$, there exists a neural operator $\mathcal{G}_\theta$ with finitely many parameters such that

$$\sup_{a \in K} \|\mathcal{G}_\theta(a) - \mathcal{G}^\dagger(a)\|_{\mathcal{U}} < \epsilon.$$

Specifically, for an FNO with $L$ layers and $k_{\max}$ Fourier modes, the approximation error scales as $O(k_{\max}^{-s})$ for operators $\mathcal{G}^\dagger$ with Sobolev regularity $s$.

Just as standard neural networks are universal approximators for functions (vectors $\to$ vectors), neural operators are universal approximators for operators (functions $\to$ functions). The FNO's Fourier truncation provides spectral convergence for smooth operators.

FNO vs. CNN Convergence on Helmholtz Inverse Problem

Compare the test error of an FNO and a standard U-Net CNN on the Helmholtz inverse problem (permittivity $\to$ scattered field) as a function of training-set size and number of Fourier modes. The FNO converges faster and generalises better, especially at higher frequencies, where the solution has more spatial oscillations. The FNO's resolution independence means it can be trained on a $64 \times 64$ grid and evaluated on a $128 \times 128$ grid without retraining.
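
Resolution independence can be checked directly with the `FNOLayer` sketch from the FNO definition above: the same weights apply at any grid size, because only the lowest $k_{\max}$ modes carry parameters (the sizes below are illustrative):

```python
layer = FNOLayer(channels=32, k_max=12)

v_coarse = torch.randn(1, 32, 64, 64)    # training resolution
v_fine = torch.randn(1, 32, 128, 128)    # evaluation resolution

print(layer(v_coarse).shape)   # torch.Size([1, 32, 64, 64])
print(layer(v_fine).shape)     # torch.Size([1, 32, 128, 128])
```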


Example: FNO as a Differentiable Forward Model for RF Imaging

Describe how to use an FNO to replace the full-wave solver in an iterative reconstruction algorithm for RF imaging.
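
One possible sketch of such a loop, assuming a hypothetical `fno_forward` surrogate already trained to map permittivity maps to measured fields (the regulariser and the constraint are illustrative choices):

```python
import torch

def reconstruct(fno_forward, y_meas, grid_shape, n_iters=200, lr=1e-2):
    """Gradient-based reconstruction through a frozen, differentiable FNO surrogate."""
    eps = torch.ones(1, 1, *grid_shape, requires_grad=True)  # initial guess: free space
    opt = torch.optim.Adam([eps], lr=lr)
    for _ in range(n_iters):
        y_pred = fno_forward(eps)               # O(N log N) stand-in for the full-wave solver
        loss = ((y_pred - y_meas) ** 2).mean()  # data misfit
        # Crude smoothness prior on the permittivity map (illustrative)
        loss = loss + 1e-3 * (eps.diff(dim=-1).abs().mean() + eps.diff(dim=-2).abs().mean())
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            eps.clamp_(min=1.0)                 # physical constraint: eps_r >= 1
    return eps.detach()
```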

Why This Matters: Physics-Informed Methods for RF Imaging

Physics-informed approaches address key challenges in RF imaging:

  • PINNs for inverse scattering: Recover $\epsilon_r(\mathbf{r})$ from sparse scattering data by enforcing the Helmholtz equation as a soft constraint; this is particularly valuable when the measurement geometry is irregular and standard discretisation is awkward.

  • Equivariant networks for SAR: Steerable CNNs achieve rotation-equivariant SAR target recognition without data augmentation — the network generalises to unseen azimuth angles by construction.

  • FNO as fast forward model: Replace the $O(N^2)$ matrix-vector product $\mathbf{A}\mathbf{c}$ with an $O(N \log N)$ FNO evaluation, enabling real-time iterative reconstruction.

  • FNO for resolution transfer: Train on coarse grids (fast), evaluate on fine grids (detailed) — exploiting the FNO's resolution independence.


Quick Check

What is the key advantage of an FNO over a PINN for solving multiple instances of the Helmholtz equation with different permittivity distributions?

  • FNO is more accurate for a single instance
  • FNO learns the solution operator, amortising the cost across instances
  • FNO does not require training data
  • FNO has fewer parameters

Common Mistake: FNO Out-of-Distribution Generalisation

Mistake:

Training an FNO on simple permittivity distributions (circles, rectangles) and expecting it to generalise to complex real-world scenes (buildings, furniture) without testing.

Correction:

Neural operators, like all learned models, generalise only within the distribution of the training data. An FNO trained on simple geometries may fail dramatically on complex scenes with sharp corners, thin structures, or high-contrast interfaces.

Remedies: (i) Train on a diverse dataset that covers the expected scene complexity. (ii) Use physics-based fine-tuning: initialise with the FNO prediction, then refine with a few iterations of the physics-based solver. (iii) Hybrid approaches: FNO for the smooth part, physics solver for the high-contrast details.

Historical Note: From Galerkin to PINNs

1998--2021

The idea of using neural networks to solve PDEs dates to Lagaris et al. (1998), who used small feedforward networks with boundary-condition constraints. The approach remained niche until Raissi, Perdikaris, and Karniadakis (2019) demonstrated that modern deep networks with automatic differentiation could solve complex forward and inverse PDE problems, coining the term "Physics-Informed Neural Networks." The Fourier Neural Operator (Li et al., 2021) shifted the paradigm from solving individual PDE instances to learning the solution operator, enabling real-time PDE solving with spectral accuracy.


PINN Training for Helmholtz Inverse Scattering

Complexity: $O(N_{\text{epochs}} \cdot (N_d + N_c) \cdot P)$, where $P$ is the network parameter count. The second-order autodiff for the Laplacian roughly doubles the backpropagation cost.
Input:  scattered field data {(r_i, E^s_{k,i})}, incident fields {E^i_k},
        collocation points {r_j}, wavenumber kappa
Output: permittivity map eps_r(r)
 1. Initialise field networks u_theta^(k) and permittivity network eps_phi
 2. for epoch = 1, ..., N_epochs do
 3.     Sample a mini-batch of data points and collocation points
 4.     for each incident wave k do
 5.         Evaluate u_theta^(k)(r_j) at the collocation points
 6.         Compute the Laplacian via autodiff: nabla^2 u_theta^(k)(r_j)
 7.         PDE residual: R_j^(k) = nabla^2 u_theta^(k)(r_j) + kappa^2 * eps_phi(r_j) * u_theta^(k)(r_j) + s_k(r_j)
 8.     end for
 9.     L_data = (1/N_d) * sum_k sum_i |u_theta^(k)(r_i) - E^s_{k,i}|^2
10.     L_pde  = (lambda/N_c) * sum_k sum_j |R_j^(k)|^2
11.     L = L_data + L_pde
12.     Update theta, phi via Adam(grad L)
13. end for
14. return eps_phi(r) for r in the domain

Curriculum scheduling of $\lambda$ (low initially, increasing over epochs) helps avoid getting trapped in a data-only minimum.
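
A compact PyTorch sketch of this algorithm, reusing the `helmholtz_residual` sketch from earlier in this section (one field network per incident wave and the linear curriculum ramp on $\lambda$ are assumptions of this sketch):

```python
import torch

def train_pinn(u_nets, eps_net, r_data, Es_data, sources, r_domain, kappa,
               n_epochs=5000, n_coll=1024, lam_max=1.0, warmup=1000):
    """Jointly optimise field networks u_nets[k] and the permittivity network eps_net."""
    params = list(eps_net.parameters())
    for net in u_nets:
        params += list(net.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    for epoch in range(n_epochs):
        # Curriculum scheduling: ramp the PDE weight up over the first `warmup` epochs
        lam = lam_max * min(1.0, (epoch + 1) / warmup)
        # Mini-batch of collocation points, with gradients w.r.t. positions enabled
        idx = torch.randint(0, r_domain.shape[0], (n_coll,))
        r_c = r_domain[idx].requires_grad_(True)
        loss_data, loss_pde = 0.0, 0.0
        for k, (net, s_k) in enumerate(zip(u_nets, sources)):
            R = helmholtz_residual(net, eps_net, r_c, kappa, s_k)   # PDE residual
            loss_pde = loss_pde + (R ** 2).mean()
            loss_data = loss_data + ((net(r_data[k]) - Es_data[k]) ** 2).mean()
        loss = loss_data + lam * loss_pde
        opt.zero_grad()
        loss.backward()
        opt.step()
    return eps_net   # evaluate eps_net over the domain for the permittivity map
```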

PINN (Physics-Informed Neural Network)

A neural network trained with a loss function that includes the residual of a governing PDE, evaluated at collocation points via automatic differentiation. Enables mesh-free PDE solving and inverse problems.

Related: Physics-Informed Neural Network (PINN)

FNO (Fourier Neural Operator)

A neural operator architecture that parameterises integral kernels in the Fourier domain, achieving $O(N \log N)$ cost and resolution-independent PDE solving.

Related: Fourier Neural Operator (FNO)

Key Takeaway

PINNs embed PDE constraints (Helmholtz, Maxwell) directly into the neural network loss, enabling mesh-free inverse scattering but suffering from spectral bias at high frequencies. Equivariant networks build physical symmetries into the architecture, improving data efficiency for SAR and 3D imaging. The Fourier Neural Operator learns the PDE solution operator in the Fourier domain, enabling real-time, resolution-independent forward modelling that can replace expensive full-wave solvers in iterative reconstruction algorithms.