Score-Based Models and Flow Matching

Definition:

Score Function and Score Matching

The score function is the gradient of the log-density with respect to the input; a score model $\mathbf{s}_\theta$ approximates it:

$$\mathbf{s}_\theta(\mathbf{x}) \approx \nabla_\mathbf{x} \log p(\mathbf{x})$$
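As a concrete case, the score of an isotropic Gaussian has a closed form. The sketch below compares it with a finite-difference gradient of the log-density. The Gaussian, the helper names, and the test point are assumptions chosen for illustration only.

```python
import numpy as np

def log_density(x, mu, sigma):
    """Log-density of an isotropic Gaussian N(mu, sigma^2 I)."""
    d = x.shape[-1]
    quad = -0.5 * np.sum((x - mu) ** 2) / sigma**2
    return quad - 0.5 * d * np.log(2 * np.pi * sigma**2)

def score(x, mu, sigma):
    """Analytic score: gradient of log_density with respect to x."""
    return -(x - mu) / sigma**2

def numerical_grad(f, x, eps=1e-5):
    """Central finite-difference gradient, used only as a sanity check."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

mu, sigma = np.array([1.0, -2.0]), 0.7
x = np.array([0.3, 0.5])
analytic = score(x, mu, sigma)
numeric = numerical_grad(lambda z: log_density(z, mu, sigma), x)
```

The score points from `x` toward the mode `mu`, scaled by the inverse variance, which is why following the score moves samples toward regions of high density.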

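Score matching trains $\mathbf{s}_\theta$ without knowing $p(\mathbf{x})$. Denoising score matching adds noise of scale $\sigma$ to the data, giving $\tilde{\mathbf{x}}$, and learns the score of the noised density $p_\sigma$: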

$$\mathbf{s}_\theta(\tilde{\mathbf{x}}, \sigma) \approx \nabla_{\tilde{\mathbf{x}}} \log p_\sigma(\tilde{\mathbf{x}})$$
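A minimal sketch of why this works, assuming Gaussian noise $\tilde{\mathbf{x}} = \mathbf{x} + \sigma\boldsymbol{\epsilon}$: the regression target is the score of the perturbation kernel, $-(\tilde{\mathbf{x}} - \mathbf{x})/\sigma^2$, which can be computed without access to $p$. The 1D toy data and the linear score model `a * x_tilde` below are assumptions, chosen because the true noised score is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5
n = 200_000

# Assumed toy setup: 1D data x ~ N(0, 1), perturbed as x_tilde = x + sigma * eps.
x = rng.standard_normal(n)
eps = rng.standard_normal(n)
x_tilde = x + sigma * eps

# Denoising target: score of the Gaussian perturbation kernel,
# grad_{x_tilde} log N(x_tilde; x, sigma^2) = -(x_tilde - x) / sigma^2.
target = -(x_tilde - x) / sigma**2

# Linear score model s(x_tilde) = a * x_tilde. The denoising loss
# mean((a * x_tilde - target)^2) is least squares in a, solved in closed form.
a_hat = np.sum(x_tilde * target) / np.sum(x_tilde**2)

# Here p_sigma = N(0, 1 + sigma^2), so the true score is -x_tilde / (1 + sigma^2).
a_true = -1.0 / (1.0 + sigma**2)
dsm_loss = np.mean((a_hat * x_tilde - target) ** 2)
```

In this setup the fitted slope `a_hat` should land close to `a_true`, even though the target was built only from clean/noisy pairs and never from the noised density itself.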

Definition:

Flow Matching

Flow matching learns a velocity field $\mathbf{v}_\theta(\mathbf{x}_t, t)$ that transports noise $\mathbf{x}_0 \sim \mathcal{N}(0, I)$ to data $\mathbf{x}_1 \sim p_{\text{data}}$ along straight paths:

$$\mathbf{x}_t = (1-t)\mathbf{x}_0 + t\mathbf{x}_1$$

$$L = \mathbb{E}_{t, \mathbf{x}_0, \mathbf{x}_1}\left[\|\mathbf{v}_\theta(\mathbf{x}_t, t) - (\mathbf{x}_1 - \mathbf{x}_0)\|^2\right]$$
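The regression target in the loss is the time derivative of the straight path for a fixed pair $(\mathbf{x}_0, \mathbf{x}_1)$. As a short worked step:

$$\frac{d\mathbf{x}_t}{dt} = \frac{d}{dt}\left[(1-t)\mathbf{x}_0 + t\mathbf{x}_1\right] = \mathbf{x}_1 - \mathbf{x}_0$$

Because many pairs can pass through the same point $\mathbf{x}_t$, the minimizer of the squared loss is the average of these per-pair velocities at that point:

$$\mathbf{v}^*(\mathbf{x}, t) = \mathbb{E}\left[\mathbf{x}_1 - \mathbf{x}_0 \mid \mathbf{x}_t = \mathbf{x}\right]$$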

Sampling: solve the ODE $d\mathbf{x}/dt = \mathbf{v}_\theta(\mathbf{x}, t)$ from $t=0$ to $t=1$.
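The sampling ODE can be integrated with any standard solver. The sketch below uses fixed-step Euler, the simplest choice. The function `euler_sample` and the hand-written velocity field are assumptions for illustration: the field is the exact straight-path velocity for data concentrated at a single point `mu`, standing in for a trained $\mathbf{v}_\theta$.

```python
import numpy as np

def euler_sample(velocity, x0, n_steps=100):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h                      # left endpoint of the current step
        x = x + h * velocity(x, t)
    return x

# Illustrative velocity field (an assumption, not a trained model): if all data
# sat at a single point mu, the straight-path velocity is (mu - x) / (1 - t).
mu = np.array([2.0, -1.0])

def point_mass_velocity(x, t):
    return (mu - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.standard_normal((5, 2))    # x_0 ~ N(0, I)
samples = euler_sample(point_mass_velocity, noise, n_steps=20)
```

Euler evaluates the field only at $t = 0, h, \ldots, 1 - h$, so the singularity of this toy field at $t = 1$ is never reached. With a learned field, a higher-order solver or more steps may be needed when the paths are not close to straight.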

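Flow matching is often simpler to train than score-based SDE models, since the loss is a direct regression onto the velocity $\mathbf{x}_1 - \mathbf{x}_0$, and the straight paths often allow sampling with fewer ODE solver steps.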

Example: Flow Matching Training

Implement flow matching training for 2D data.
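One possible solution sketch, written in plain NumPy so it stays self-contained. Everything specific here is an assumption rather than part of the exercise: the two-blob toy dataset, the one-hidden-layer tanh network with manual backpropagation, plain SGD, and the hyperparameters. A deep learning framework with automatic differentiation would normally replace the hand-written gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_data(n):
    """Hypothetical 2D toy dataset: Gaussian blobs at (-2, 0) and (+2, 0)."""
    centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
    idx = rng.integers(0, 2, size=n)
    return centers[idx] + 0.3 * rng.standard_normal((n, 2))

# One-hidden-layer MLP v_theta(x_t, t): input (x, y, t) -> 2D velocity.
hidden = 64
params = {
    "W1": rng.standard_normal((3, hidden)) * 0.5,
    "b1": np.zeros(hidden),
    "W2": rng.standard_normal((hidden, 2)) * 0.1,
    "b2": np.zeros(2),
}

def velocity(params, x, t):
    """Evaluate the field; x has shape (n, 2), t has shape (n, 1)."""
    inp = np.concatenate([x, t], axis=1)
    h = np.tanh(inp @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], (inp, h)

def loss_and_grads(params, x0, x1, t):
    """Flow matching loss and its gradients via manual backprop."""
    xt = (1.0 - t) * x0 + t * x1           # point on the straight path
    target = x1 - x0                       # per-pair velocity
    pred, (inp, h) = velocity(params, xt, t)
    err = pred - target
    loss = np.mean(np.sum(err ** 2, axis=1))
    d_pred = 2.0 * err / x0.shape[0]       # d loss / d pred
    d_pre = (d_pred @ params["W2"].T) * (1.0 - h ** 2)  # through tanh
    grads = {
        "W2": h.T @ d_pred, "b2": d_pred.sum(axis=0),
        "W1": inp.T @ d_pre, "b1": d_pre.sum(axis=0),
    }
    return loss, grads

def train(params, steps=2000, batch=256, lr=0.01):
    """Plain SGD on freshly sampled (noise, data, time) triples."""
    losses = []
    for _ in range(steps):
        x1 = sample_data(batch)
        x0 = rng.standard_normal((batch, 2))
        t = rng.uniform(size=(batch, 1))
        loss, grads = loss_and_grads(params, x0, x1, t)
        for name in params:
            params[name] -= lr * grads[name]
        losses.append(loss)
    return losses

def sample(params, n=500, n_steps=50):
    """Euler integration of dx/dt = v_theta(x, t) from t=0 to t=1."""
    x = rng.standard_normal((n, 2))
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((n, 1), k * h)
        v, _ = velocity(params, x, t)
        x = x + h * v
    return x

losses = train(params)
samples = sample(params)
```

The loss does not go to zero even for a perfect model, because different $(\mathbf{x}_0, \mathbf{x}_1)$ pairs passing through the same $\mathbf{x}_t$ have different velocities. A falling loss curve and samples that separate toward the two blobs are the signs to look for.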