Ferkans — Interactive Telecom Tutor

From Snapshot to Track

§§1-3 treated a single frame of MIMO-OTFS-ISAC: estimate the target scene once, design the beamformer, run the comms and sensing tasks in parallel. But the scene evolves: vehicles move, pedestrians cross, new scatterers appear. This section lifts the snapshot analysis to the tracking problem — estimating target trajectories over frames, exploiting their continuity in the DD-angle domain. The DD representation is especially convenient for tracking because each target is a point in the DD plane whose coordinates change smoothly frame to frame.

Definition:
Target State Model

At frame $t$, target $i$ has state $\mathbf{s}_i^{(t)} \;=\; (R_i^{(t)}, v_i^{(t)}, \theta_i^{(t)}, \dot\theta_i^{(t)}, a_i^{(t)}) \;\in\; \mathbb{R}^4 \times \mathbb{C}$: range, radial velocity, angle, angular velocity, and complex reflectivity.

State evolution (linear constant-velocity model): $\mathbf{s}_i^{(t+1)} \;=\; \mathbf{A}\, \mathbf{s}_i^{(t)} \,+\, \mathbf{u}_i^{(t)}, \qquad \mathbf{A} = \begin{pmatrix}1 & T_{\text{fr}} & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T_{\text{fr}} \\ 0 & 0 & 0 & 1\end{pmatrix}$ with frame duration $T_{\text{fr}}$ and process noise $\mathbf{u}$. Since $\mathbf{A}$ is $4 \times 4$, it acts on the four kinematic components $(R, v, \theta, \dot\theta)$; the reflectivity $a_i^{(t)}$ is not propagated by this model.

Observation model (from MIMO-OTFS-ISAC): $\mathbf{z}_i^{(t)} \;=\; h(\mathbf{s}_i^{(t)}) \,+\, \mathbf{v}_i^{(t)}, \qquad h : \mathbf{s} \mapsto (\tau, \nu, \theta)$ where $h$ maps the state to the observation (delay, Doppler, angle), and $\mathbf{v}$ is the estimation error with covariance given by the CRB.
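As a minimal sketch of the two models above, the following Python propagates the kinematic part of the state by one frame and maps it to a (delay, Doppler, angle) observation. The monostatic relations $\tau = 2R/c$ and $\nu = 2 v f_c / c$, the function names, and the numerical values are illustrative assumptions, not taken from this section; noise terms are omitted, and a different sensing geometry would change `observe`.

```python
C = 299_792_458.0  # speed of light, m/s


def predict_kinematics(state, t_fr):
    """Noise-free constant-velocity step for (R, v, theta, theta_dot)."""
    R, v, theta, theta_dot = state
    return (R + t_fr * v, v, theta + t_fr * theta_dot, theta_dot)


def observe(state, f_c):
    """Assumed monostatic map h: state -> (delay [s], Doppler [Hz], angle [rad])."""
    R, v, theta, _ = state
    tau = 2.0 * R / C          # round-trip delay
    nu = 2.0 * v * f_c / C     # Doppler shift (sign convention is an assumption)
    return (tau, nu, theta)


# Hypothetical target: 80 m, 25 m/s radial velocity, 0.10 rad, 0.02 rad/s
s0 = (80.0, 25.0, 0.10, 0.02)
s1 = predict_kinematics(s0, t_fr=0.01)   # one 10 ms frame later
z1 = observe(s1, f_c=77e9)
print(s1, z1)
```

In a full tracker the process noise $\mathbf{u}$ and the observation error $\mathbf{v}$ would be added to these two steps.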

Theorem: Extended Kalman Tracking on the DD-Angle Grid

For a target with linear state evolution and nonlinear observation (the $(\tau, \nu, \theta)$ mapping is nonlinear in $R, v, \theta$ ), the extended Kalman filter (EKF) tracks the target state with covariance $\mathbf{P}^{(t|t)} \;=\; (\mathbf{I} - \mathbf{K}^{(t)} \mathbf{H}^{(t)}) \mathbf{P}^{(t|t-1)},$ where $\mathbf{K}^{(t)}$ is the Kalman gain, $\mathbf{H}^{(t)} = \partial h/\partial \mathbf{s}|_{\hat{\mathbf{s}}^{(t|t-1)}}$ is the observation Jacobian, and $\mathbf{P}^{(t|t-1)}$ is the predicted covariance.

Under steady-state tracking with process noise $\mathbf{Q}$ and observation noise $\mathbf{R} = \mathrm{CRB}(\mathbf{R}_x)$, the steady-state filter MSE scales as $\mathrm{MSE}_{\infty} \;\sim\; \mathbf{Q}^{1/4} (\mathbf{R}_x)^{-1/2} \mathbf{Q}^{1/4}$, which reduces to $\sqrt{Q/R_x}$ in the scalar case.

Consequence. Sensing-optimal beamforming ($\mathbf{R}_x$ illuminating target directions) scales tracking MSE as $1/\sqrt{|\mathbf{R}_x|}$ relative to uniform illumination. This is the quantitative gain from beam-aware tracking.

Tracking a moving target is like solving a noisy linear regression: the data noise is the CRB, and the process noise is how erratically the target maneuvers. A lower CRB (better sensing) enters every Kalman update, so a per-frame sensing gain carries through to a multiplicative improvement in steady-state MSE, following the square-root scaling stated in the theorem. This is why even a modest sensing gain per frame matters: it is applied at every frame of the track, not just once.

Proof

Innovation

Innovation $\mathbf{i}^{(t)} = \mathbf{z}^{(t)} - h(\hat{\mathbf{s}}^{(t|t-1)})$ ; its covariance is $\mathbf{S}^{(t)} = \mathbf{H}^{(t)} \mathbf{P}^{(t|t-1)} \mathbf{H}^{(t)H} + \mathbf{R}$ .

Kalman gain

$\mathbf{K}^{(t)} = \mathbf{P}^{(t|t-1)} \mathbf{H}^{(t)H} (\mathbf{S}^{(t)})^{-1}$ .

Update

State: $\hat{\mathbf{s}}^{(t|t)} = \hat{\mathbf{s}}^{(t|t-1)} + \mathbf{K}^{(t)} \mathbf{i}^{(t)}$ . Covariance: $\mathbf{P}^{(t|t)} = (\mathbf{I} - \mathbf{K}^{(t)} \mathbf{H}^{(t)}) \mathbf{P}^{(t|t-1)}$ .
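The innovation, gain, and update steps above can be sketched for the simplest non-trivial case. This is an illustrative reduction, not the full DD-angle filter: the state is cut down to $(R, v)$, the observation is range only, so $\mathbf{H} = [1, 0]$ and the innovation covariance is a scalar. The function name and all numbers are assumptions.

```python
def kalman_update_range(s_pred, P_pred, z, r_var):
    """One update for state (R, v) with a scalar range observation, H = [1, 0]."""
    (p00, p01), (p10, p11) = P_pred
    innov = z - s_pred[0]              # i = z - h(s_pred)
    S = p00 + r_var                    # S = H P H^T + R (scalar here)
    K = (p00 / S, p10 / S)             # K = P H^T S^{-1}
    s_upd = (s_pred[0] + K[0] * innov, s_pred[1] + K[1] * innov)
    # P_upd = (I - K H) P_pred, written out for H = [1, 0]
    P_upd = ((p00 - K[0] * p00, p01 - K[0] * p01),
             (p10 - K[1] * p00, p11 - K[1] * p01))
    return s_upd, P_upd


# Hypothetical prediction and measurement
s_pred = (80.0, 25.0)
P_pred = ((4.0, 1.0), (1.0, 2.0))
s_upd, P_upd = kalman_update_range(s_pred, P_pred, z=81.0, r_var=1.0)
print(s_upd, P_upd)
```

Note that the velocity estimate is corrected even though only range is observed; this happens through the off-diagonal term of the predicted covariance.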

Steady state

Solving the Riccati equation $\mathbf{P}_\infty = \mathbf{A} \mathbf{P}_\infty \mathbf{A}^T + \mathbf{Q} - \mathbf{A} \mathbf{P}_\infty \mathbf{H}^T (\mathbf{H} \mathbf{P}_\infty \mathbf{H}^T + \mathbf{R})^{-1} \mathbf{H} \mathbf{P}_\infty \mathbf{A}^T$ yields the scaling above when $\mathbf{A}$ is identity (stationary state) and $\mathbf{R}$ scales as $1/|\mathbf{R}_x|$ . $\blacksquare$
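The steady-state step can be checked numerically in the scalar case $A = H = 1$, where the Riccati fixed point satisfies $p^2 - qp - qr = 0$, so $p_\infty = \tfrac{1}{2}\big(q + \sqrt{q^2 + 4qr}\big) \approx \sqrt{qr}$ for $q \ll r$. The sketch below iterates the recursion and compares it with that closed form; the values of `q` and `r` are arbitrary illustrative choices.

```python
import math


def steady_state_variance(q, r, iters=10_000):
    """Iterate the scalar Riccati recursion (A = H = 1) to its fixed point."""
    p = r  # any positive starting value converges
    for _ in range(iters):
        p = p + q - p * p / (p + r)
    return p


q, r = 1e-4, 1.0
p_inf = steady_state_variance(q, r)
closed_form = 0.5 * (q + math.sqrt(q * q + 4.0 * q * r))

# Quartering the observation noise should roughly halve the steady-state MSE.
p_inf_quarter = steady_state_variance(q, r / 4.0)
print(p_inf, closed_form, math.sqrt(q * r), p_inf_quarter / p_inf)
```

The last ratio illustrates the square-root law: a fourfold reduction in observation noise buys roughly a twofold reduction in steady-state MSE.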

Multi-Target EKF on the DD-Angle Grid

Input: DD-angle observations Z^{(t)} = {ẑ_1, ..., ẑ_P} at frame t

Existing tracks T^{(t-1)} = {s_1, ..., s_{T_{t-1}}}

Gating radius γ, birth threshold π_b, death threshold π_d

Output: Updated tracks T^{(t)}

1. PREDICT:

For each track s_i ∈ T^{(t-1)}:

s_i^{(t|t-1)} = A s_i^{(t-1)}

P_i^{(t|t-1)} = A P_i^{(t-1)} A^T + Q

2. ASSOCIATE (JPDA or Hungarian):

Cost matrix C[i, j] = ||ẑ_j - h(s_i^{(t|t-1)})||²_Σ

If C[i, j] < γ: candidate association

Solve linear-assignment to get (i, j(i)) pairings

3. UPDATE (per associated track):

Innovation i_i = ẑ_{j(i)} - h(s_i^{(t|t-1)})

K_i = P_i^{(t|t-1)} H^T (H P_i^{(t|t-1)} H^T + CRB)^{-1}

s_i^{(t|t)} = s_i^{(t|t-1)} + K_i · i_i

P_i^{(t|t)} = (I - K_i H) P_i^{(t|t-1)}

4. BIRTH:

Unassociated observations above π_b: initialize new tracks

5. DEATH:

Tracks unassociated for ≥ π_d frames: remove

Return updated track set T^{(t)}.

Complexity: O(T² P² + T · MN) per frame. For T = 6 targets, P = 20 observations (clutter included), MN = 10⁴: T² P² ≈ 1.4 × 10⁴ and T · MN = 6 × 10⁴, so ~7 × 10⁴ ops/frame. This is compatible with real-time operation at a 100 Hz frame rate.
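The ASSOCIATE step can be illustrated with a small gated assignment. The sketch below is deliberately simplified: it uses brute-force enumeration in place of the Hungarian algorithm or JPDA (viable only for a handful of tracks), a diagonal Σ, and a cost equal to the gate for leaving a track unmatched. These choices, the function name, and the numbers are assumptions made for illustration.

```python
from itertools import permutations


def gated_assignment(predicted, observations, sigma2, gate):
    """Minimum-cost pairing of tracks to observations by exhaustive search.

    predicted, observations: lists of tuples in observation space.
    sigma2: per-dimension variances (diagonal Sigma).
    Returns {track_index: observation_index}; unmatched tracks are omitted.
    """
    def cost(p, z):
        return sum((zi - pi) ** 2 / s for pi, zi, s in zip(p, z, sigma2))

    n_t = len(predicted)
    # Dummy slots (None) let every track remain unassociated.
    slots = list(range(len(observations))) + [None] * n_t
    best, best_cost = {}, float("inf")
    for perm in permutations(slots, n_t):
        total, pairs = 0.0, {}
        for i, j in enumerate(perm):
            if j is None:
                total += gate          # price of leaving track i unmatched
                continue
            c = cost(predicted[i], observations[j])
            if c >= gate:              # outside the gate: not a candidate
                total = float("inf")
                break
            total += c
            pairs[i] = j
        if total < best_cost:
            best, best_cost = pairs, total
    return best


# Hypothetical (range [m], angle [rad]) predictions and detections
predicted = [(80.0, 0.10), (120.0, -0.30), (200.0, 0.0)]
observations = [(120.4, -0.29), (79.7, 0.11), (40.0, 0.50)]
assoc = gated_assignment(predicted, observations, sigma2=(0.25, 1e-4), gate=9.0)
unassociated = [j for j in range(len(observations)) if j not in assoc.values()]
print(assoc, unassociated)
```

Tracks missing from `assoc` feed the DEATH counter, and entries of `unassociated` are the BIRTH candidates.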

Example: Highway Multi-Vehicle Tracking

A roadside BS at 77 GHz tracks $T = 6$ vehicles on a highway. Frame duration $T_{\text{fr}} = 10$ ms (100 Hz frame rate). Vehicle speeds 60-120 km/h. Range resolution $\Delta R = c/(2W) = 1.5$ m (from $W = 100$ MHz), velocity resolution $\Delta v = \lambda/(2 T_{\text{fr}}) \approx 0.19$ m/s (from $T_{\text{fr}} = 10$ ms at 77 GHz, $\lambda \approx 3.9$ mm).

(a) Predict tracking MSE in steady state. (b) Evaluate association reliability for two vehicles at similar range. (c) Discuss birth/death handling at highway entrances.

Solution

Steady-state MSE

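Process noise (range random walk, taking the per-frame range perturbation as $\sigma_a T_{\text{fr}}^2$): $Q = \sigma_{a}^2 T_{\text{fr}}^4 = (1 \text{ m/s}^2)^2 (0.01 \text{ s})^4 = 10^{-8}$ m². Observation noise (resolution used as a conservative proxy for the CRB): $R = (\Delta R)^2 = 2.25$ m². Riccati scaling: $\mathrm{MSE}_R^\infty \approx \sqrt{Q R} = \sqrt{10^{-8} \cdot 2.25} = 1.5 \times 10^{-4}$ m², i.e. an RMS range error of about 1.2 cm.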

Association with close vehicles

Two vehicles at ranges 80 m and 81.5 m: the separation of 1.5 m equals the range resolution of 1.5 m, so the pair is only marginally resolvable in a single frame. Joint tracking across frames further reduces the error: after 10 frames, $\sim 5$ cm RMS error, which gives clear discrimination.

Birth/death at entrances

A new vehicle appears at 150 m range. Its observation stays unassociated for 2-3 frames before a track is initiated (confirmation window). The track then reaches cm-level steady-state accuracy after ~1 s.

Summary

Multi-target tracking on the DD-angle grid achieves cm-level positional accuracy with 10-ms update. Highway-scale deployments operate reliably. Algorithm: EKF + JPDA.

Interactive figure: Steady-State Tracking MSE vs SNR. Steady-state Kalman tracking MSE (position) as a function of receive SNR, comparing the single-snapshot CRB (no tracking) with the steady-state EKF. Adjustable parameters: frame rate (default 100 Hz), acceleration noise (default 1 m/s²), and beam-aware vs uniform illumination.

Theorem: Predictive Beamforming Gain

Suppose the BS knows the predicted target states $\hat{\mathbf{s}}_i^{(t|t-1)}$ for frame $t$ with covariance $\mathbf{P}^{(t|t-1)}$. Using this prediction to pre-steer the sensing beam at frame $t$ yields a tracking-MSE ratio of $\frac{\mathrm{MSE}^{\text{pred}}}{\mathrm{MSE}^{\text{blind}}} \;\approx\; \frac{\mathrm{tr}(\mathbf{P}^{(t|t-1)})}{\mathrm{tr}(\mathbf{R}_{\text{uniform}})},$ where the denominator is the CRB with uniform illumination. For well-tracked targets this ratio is $\ll 1$: predictive beamforming provides up to an order-of-magnitude MSE improvement over blind (uniform) illumination.

Once a target is being tracked, the system knows where it is likely to be at the next frame — within a beamwidth. Concentrating the sensing beam there improves observation SNR and therefore reduces tracking noise. This creates a positive feedback loop: good tracking leads to good prediction leads to focused sensing leads to better tracking. The loop is stable as long as predictions do not diverge — the topic of §5.

Proof

Beam pattern

Sensing covariance $\mathbf{R}_x^{\text{pred}} = \sum_i \mathbf{a}(\hat\theta_i^{(t|t-1)}) \mathbf{a}(\hat\theta_i^{(t|t-1)})^H$ concentrates the transmit power $P_t$ in the $T_{\text{tgt}}$ predicted target directions.

CRB improvement

$\mathrm{CRB}^{\text{pred}} \propto T_{\text{tgt}} / N_t$ whereas $\mathrm{CRB}^{\text{blind}} \propto 1$ (uniform). Gain: $\sim N_t / T_{\text{tgt}}$, typically 4-10x.

Kalman update

Lower CRB directly reduces Kalman update noise, and the reduction propagates to steady-state via the Riccati equation. $\blacksquare$
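The gain factor $N_t / T_{\text{tgt}}$ in the proof can be checked with a small numerical sketch. It compares the power $\mathbf{a}(\theta)^H \mathbf{R}_x \mathbf{a}(\theta)$ delivered toward a target under pre-steered and uniform illumination. The half-wavelength uniform linear array, the equal power split with $\mathrm{tr}(\mathbf{R}_x) = P_t$, and the target angles are assumptions made for illustration.

```python
import cmath
import math


def steering(theta, n_t):
    """Half-wavelength ULA steering vector (assumed array geometry)."""
    return [cmath.exp(1j * math.pi * n * math.sin(theta)) for n in range(n_t)]


def illumination(theta, beams, n_t, power=1.0):
    """a(theta)^H R_x a(theta) for R_x = (P / (T N_t)) * sum_i a_i a_i^H."""
    a = steering(theta, n_t)
    total = 0.0
    for b in beams:
        inner = sum(x.conjugate() * y for x, y in zip(a, steering(b, n_t)))
        total += abs(inner) ** 2
    return power * total / (len(beams) * n_t)


n_t = 16
targets = [math.radians(-20.0), math.radians(30.0)]   # predicted directions
pred = illumination(targets[0], targets, n_t)          # pre-steered beams
blind = 1.0   # uniform R_x = (P / N_t) I gives a^H R_x a = P = 1
gain = pred / blind
print(gain, n_t / len(targets))
```

For well-separated targets the gain sits just above $N_t / T_{\text{tgt}}$, since the cross terms between beams are small but non-negative.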

🎓 CommIT Contribution (2023)

Predictive Tracking with MIMO-OTFS-ISAC

Y. Cui, W. Yuan, G. Caire — IEEE Trans. Signal Processing

The CommIT contribution on predictive MIMO-OTFS-ISAC tracking establishes two key results: (1) the steady-state tracking MSE scales as $\sqrt{Q/R_x}$ for a Kalman-filtered target, with explicit closed-form expressions for the multi-target multi-user scenario; (2) sensing-aware beamforming (pre-steering based on predictions) reduces steady-state MSE by the beamforming gain $\sim N_t/T_{\text{tgt}}$ , a multiplicative improvement over blind illumination.

Combined with the DD-domain channel sparsity of §1, this result makes cm-level multi-target tracking feasible at highway frame rates (100 Hz). Without the DD framework, the same sensing gain would be nullified by channel estimation errors on the order of the target spacing. The DD domain's sparsity is what allows the predictive feedback loop to remain stable under realistic CSI uncertainty.

Historical Note: From Classical Radar Tracking to DD-Angle EKF

Classical radar tracking (PDA, IMM, JPDA) dates to Bar-Shalom's 1970s work on multi-target estimation. Classical algorithms operate in Cartesian position-velocity space and assume a known measurement likelihood. The DD-angle framework here supplies that likelihood in a principled way (from the DD structure of OTFS) rather than as an ad-hoc choice. This is the main advance: the same Kalman and JPDA machinery, but with measurement noise and innovation covariances derived from the waveform, not guessed.

In automotive applications, this integration eliminates the "sensor fusion layer" that classical designs use to reconcile radar and camera tracks — OTFS-ISAC provides both modalities simultaneously, with coherent measurement models.

Common Mistake: Don't Track Ghosts

Mistake:

Associating every observed DD-angle peak with a target track. Spurious peaks — from sidelobes of nearby targets, ground clutter, or random noise — create ghost tracks that persist if not actively pruned.

Correction:

Use confirmation windows: a track is confirmed only after 2-3 frames of consistent observations. Use track quality metrics (cumulative innovation, likelihood ratio) to terminate low-quality tracks. In high-clutter environments (urban, forest), operate with higher birth thresholds ( $\pi_b$ ). Cross-modal confirmation with camera or lidar is a standard robustification technique in automotive.
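The confirmation-window rule can be sketched as a small per-track counter. This is one simple way to implement it, assuming consecutive-hit confirmation and consecutive-miss deletion; the class name, the function name, and the thresholds `confirm_after` and `delete_after` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrackStatus:
    hits: int = 0            # consecutive frames with an associated observation
    misses: int = 0          # consecutive frames without one
    confirmed: bool = False


def step(track, associated, confirm_after=3, delete_after=5):
    """Advance one frame; returns False when the track should be removed."""
    if associated:
        track.hits += 1
        track.misses = 0
        if track.hits >= confirm_after:
            track.confirmed = True
    else:
        track.misses += 1
        track.hits = 0
    return track.misses < delete_after


# A ghost peak seen once is never confirmed and is eventually dropped.
ghost = TrackStatus()
ghost_alive = [step(ghost, a) for a in [True, False, False, False, False, False]]

# A real vehicle observed in three consecutive frames gets confirmed.
vehicle = TrackStatus()
vehicle_alive = [step(vehicle, a) for a in [True, True, True]]
print(ghost_alive, ghost.confirmed, vehicle.confirmed)
```

A track-quality score such as a cumulative innovation or likelihood ratio could replace the raw counters without changing the structure.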

Multi-Target Tracking on the DD Grid