State-Space Models
Why State-Space Models?
So far our estimation machinery has assumed a joint distribution of signal and observation that was given and static. In reality, most interesting estimation problems evolve: a target moves, a channel fades, a bearing bounces, a user drifts. The signal we want has dynamics of its own, and observations accumulate over time. The question is how to combine a model of the dynamics with a model of the measurements to produce estimates that improve as observations arrive.
The state-space model is the cleanest answer available. It says: keep a compact summary of "everything you need to know about the past" in a vector, the state, such that the future evolves as a Markov chain driven by noise, and the observations depend only on the current state. This separation of concerns is the single most useful idea in estimation for dynamical systems, and the Kalman filter is what it buys us once the model is linear and Gaussian.
Definition: Discrete-Time Linear Gaussian State-Space Model
Discrete-Time Linear Gaussian State-Space Model
A discrete-time linear Gaussian state-space model (LGSS) consists of two coupled equations: the state equation
$$x_{k+1} = F_k x_k + G_k w_k$$
and the observation equation
$$y_k = H_k x_k + v_k$$
for $k \ge 0$, together with the distributional assumptions
$$x_0 \sim \mathcal{N}(m_0, P_0), \qquad w_k \sim \mathcal{N}(0, Q_k), \qquad v_k \sim \mathcal{N}(0, R_k),$$
with $x_0$, $\{w_k\}$, and $\{v_k\}$ mutually independent, and both noise sequences white. The matrices $F_k$, $G_k$, $H_k$, $Q_k$, $R_k$ are deterministic and, in the time-invariant case, independent of $k$.
The positive-definite assumption $R_k \succ 0$ is there to avoid singular innovations; $Q_k$ may be only positive semidefinite, since some state coordinates (e.g. constant velocity in a constant-velocity model) are not driven by noise at all.
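The two equations can be sampled directly, which is often the quickest sanity check on a model. A minimal sketch for the time-invariant case (the function name and signature are illustrative, not from the text):

```python
import numpy as np

def simulate_lgss(F, G, H, Q, R, m0, P0, n_steps, rng):
    """Draw one trajectory of states and observations from a time-invariant LGSS."""
    ny = H.shape[0]
    x = rng.multivariate_normal(m0, P0)                       # x_0 ~ N(m_0, P_0)
    xs, ys = [x], []
    for _ in range(n_steps):
        w = rng.multivariate_normal(np.zeros(G.shape[1]), Q)  # white process noise
        x = F @ x + G @ w                                     # state equation
        v = rng.multivariate_normal(np.zeros(ny), R)          # white observation noise
        ys.append(H @ x + v)                                  # observation equation
        xs.append(x)
    return np.array(xs), np.array(ys)

# Constant-velocity model with a position-only sensor (T = 1)
F = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]])
Q = np.array([[0.1]]); R = np.array([[1.0]])
xs, ys = simulate_lgss(F, G, H, Q, R, np.zeros(2), np.eye(2),
                       n_steps=50, rng=np.random.default_rng(1))
```

Note that the noise sequences are drawn fresh at every step and never reused: that is exactly the whiteness assumption of the definition.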
Definition: Filtering, Prediction, Smoothing
Filtering, Prediction, Smoothing
Let $Y_n = \{y_1, \dots, y_n\}$ denote the observations available at horizon $n$. Three classical estimation tasks are distinguished by the relation between the target time index $k$ and the observation horizon $n$:
- Filtering ($k = n$): compute $\hat{x}_{k|k} = \mathbb{E}[x_k \mid Y_k]$, the best estimate of the current state given all observations up to and including now.
- Prediction ($k > n$): compute $\hat{x}_{k|n} = \mathbb{E}[x_k \mid Y_n]$, the best forecast of a future state.
- Smoothing ($k < n$): compute $\hat{x}_{k|n} = \mathbb{E}[x_k \mid Y_n]$ with $k < n$, a retrospective estimate using both past and future observations.
The Kalman filter addresses filtering and one-step prediction jointly. Smoothing requires a backward recursion and is treated in Exercise 14.
Definition: Innovation Sequence
Innovation Sequence
The innovation at time $k$ is the one-step prediction residual of the observation:
$$e_k = y_k - \hat{y}_{k|k-1} = y_k - H_k \hat{x}_{k|k-1}.$$
The innovation covariance is
$$S_k = \operatorname{Cov}(e_k) = H_k P_{k|k-1} H_k^{\mathsf T} + R_k.$$
The innovations play the role that the whitened observations played in Chapter 9: they form an uncorrelated basis for the linear span of the observations, and conditioning on $e_1, \dots, e_n$ is equivalent to conditioning on $y_1, \dots, y_n$.
Definition: Controllability and Observability
Controllability and Observability
For the time-invariant pair $(F, G)$, the controllability Gramian is
$$\mathcal{C} = \sum_{j=0}^{n-1} F^j G G^{\mathsf T} (F^{\mathsf T})^j.$$
The pair is controllable (or, more precisely, stabilizable if $F$ is unstable) when $\mathcal{C}$ has full rank $n$.
For $(F, H)$, the observability Gramian is
$$\mathcal{O} = \sum_{j=0}^{n-1} (F^{\mathsf T})^j H^{\mathsf T} H F^j,$$
and the pair is observable (or detectable) when $\mathcal{O}$ has full rank.
These two conditions will reappear in Section 10.3 as the hypotheses under which the Riccati recursion has a unique stabilizing fixed point.
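Both full-rank conditions are easy to check numerically. It is standard that the Gramians above have the same rank as the stacked controllability and observability matrices, which are cheaper to form; a sketch (helper names are illustrative):

```python
import numpy as np

def controllability_matrix(F, G):
    """[G, F G, ..., F^{n-1} G]; its rank equals the rank of the controllability Gramian."""
    n = F.shape[0]
    return np.hstack([np.linalg.matrix_power(F, j) @ G for j in range(n)])

def observability_matrix(F, H):
    """[H; H F; ...; H F^{n-1}]; its rank equals the rank of the observability Gramian."""
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, j) for j in range(n)])

# Constant-velocity model: acceleration noise reaches both coordinates,
# and position-only sensing still observes the full state through the dynamics.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]])
assert np.linalg.matrix_rank(controllability_matrix(F, G)) == 2
assert np.linalg.matrix_rank(observability_matrix(F, H)) == 2
```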
Definition: Markov Property of the State
Markov Property of the State
Conditional on $x_k$, the future $\{x_{k+1}, x_{k+2}, \dots\}$ is independent of the past states $\{x_0, \dots, x_{k-1}\}$ and of the past observations $\{y_1, \dots, y_{k-1}\}$. Equivalently,
$$p(x_{k+1} \mid x_0, \dots, x_k,\; y_1, \dots, y_k) = p(x_{k+1} \mid x_k).$$
This is the structural reason the Kalman filter is recursive: all the information that past observations contain about the future is compressed into the current state estimate and its covariance. Nothing else needs to be remembered.
Theorem: Propagation of State Moments
Let $m_k = \mathbb{E}[x_k]$ and $P_k = \operatorname{Cov}(x_k)$ for the LGSS model of Definition 10.1. Then for all $k \ge 0$:
$$m_{k+1} = F_k m_k, \qquad P_{k+1} = F_k P_k F_k^{\mathsf T} + G_k Q_k G_k^{\mathsf T}.$$
Moreover, $x_k$ is Gaussian for every $k$, so $x_k \sim \mathcal{N}(m_k, P_k)$ in distribution.
The mean evolves deterministically through the noise-free dynamics: zero-mean driving noise cannot shift the mean. The covariance grows by two mechanisms: the old uncertainty rotates through $F_k$, and fresh process noise is injected at every step. This is the Lyapunov-type recursion that underlies Kalman prediction.
Mean recursion
Take expectations on both sides of the state equation:
$$m_{k+1} = \mathbb{E}[F_k x_k + G_k w_k] = F_k m_k + G_k \mathbb{E}[w_k] = F_k m_k,$$
using $\mathbb{E}[w_k] = 0$.
Covariance recursion
Define $\tilde{x}_k = x_k - m_k$. Then
$$\tilde{x}_{k+1} = F_k \tilde{x}_k + G_k w_k,$$
so
$$P_{k+1} = \mathbb{E}\!\left[\tilde{x}_{k+1} \tilde{x}_{k+1}^{\mathsf T}\right] = F_k P_k F_k^{\mathsf T} + G_k Q_k G_k^{\mathsf T},$$
where the cross term $F_k \mathbb{E}[\tilde{x}_k w_k^{\mathsf T}] G_k^{\mathsf T}$ vanishes because $w_k$ is independent of $x_k$ (and therefore of $\tilde{x}_k$). Substituting gives the claim.
Gaussianity
Each $x_k$ is an affine function of $x_0$ and $w_0, \dots, w_{k-1}$, all of which are jointly Gaussian and mutually independent. Affine transformations of a Gaussian vector are Gaussian, so $x_k$ is Gaussian with the computed mean and covariance.
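The two recursions of Theorem 10.1 amount to a few lines of code, and a Monte Carlo draw of trajectories confirms them. A sketch for the time-invariant constant-velocity case (function name and parameter values are illustrative):

```python
import numpy as np

def propagate_moments(F, G, Q, m0, P0, n_steps):
    """Run the mean and covariance recursions of Theorem 10.1."""
    m, P = m0.copy(), P0.copy()
    for _ in range(n_steps):
        m = F @ m                          # zero-mean noise leaves the mean alone
        P = F @ P @ F.T + G @ Q @ G.T      # rotate old uncertainty, inject fresh noise
    return m, P

# Constant-velocity example: T = 1, sigma_a = 0.5, perfectly known initial state
T, sig_a = 1.0, 0.5
F = np.array([[1.0, T], [0.0, 1.0]])
G = np.array([[T**2 / 2], [T]])
Q = np.array([[sig_a**2]])
m5, P5 = propagate_moments(F, G, Q, np.zeros(2), np.zeros((2, 2)), 5)

# Monte Carlo cross-check of the covariance recursion
rng = np.random.default_rng(0)
samples = np.zeros((20000, 2))
for i in range(20000):
    x = np.zeros(2)
    for _ in range(5):
        x = F @ x + G @ rng.normal(0.0, sig_a, size=1)
    samples[i] = x
assert np.allclose(np.cov(samples.T), P5, rtol=0.1)
```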
Example: Constant-Velocity Target in One Dimension
A particle moves along a line. Let the state be $x_k = (p_k, s_k)^{\mathsf T}$, with position $p_k$ and velocity $s_k$. Over a sampling interval $T$, the kinematic model
$$p_{k+1} = p_k + T s_k + \tfrac{T^2}{2} a_k, \qquad s_{k+1} = s_k + T a_k,$$
with random acceleration $a_k \sim \mathcal{N}(0, \sigma_a^2)$, yields the matrices of an LGSS. The sensor measures only the position, corrupted by noise $v_k \sim \mathcal{N}(0, \sigma_v^2)$: $y_k = p_k + v_k$. Write the LGSS matrices and compute the two-step predicted covariance $P_2$ starting from $P_0 = 0$ (perfectly known initial state).
State and observation equations
Stacking the kinematic equations gives
$$F = \begin{pmatrix} 1 & T \\ 0 & 1 \end{pmatrix}, \qquad G = \begin{pmatrix} T^2/2 \\ T \end{pmatrix}, \qquad H = \begin{pmatrix} 1 & 0 \end{pmatrix}.$$
The noise covariances are scalars here: $Q = \sigma_a^2$ and $R = \sigma_v^2$.
Process-noise covariance in state coordinates
The effective state-noise covariance is
$$G Q G^{\mathsf T} = \sigma_a^2 \begin{pmatrix} T^4/4 & T^3/2 \\ T^3/2 & T^2 \end{pmatrix}.$$
Notice that position and velocity noise are positively correlated: a random kick to the acceleration moves both.
One-step prediction
From $P_0 = 0$ and Theorem 10.1,
$$P_1 = F P_0 F^{\mathsf T} + G Q G^{\mathsf T} = \sigma_a^2 \begin{pmatrix} T^4/4 & T^3/2 \\ T^3/2 & T^2 \end{pmatrix}.$$
Two-step prediction
Applying the recursion once more,
$$P_2 = F P_1 F^{\mathsf T} + G Q G^{\mathsf T}.$$
Carrying out the matrix products (reader should verify),
$$P_2 = \sigma_a^2 \begin{pmatrix} 5T^4/2 & 2T^3 \\ 2T^3 & 2T^2 \end{pmatrix}.$$
After $k$ steps the position variance grows roughly like $k^3$ and the velocity variance like $k$: unobserved motion becomes increasingly uncertain, fast.
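The matrix products the reader is asked to verify can also be checked numerically; a quick sketch using the example's matrices (the particular values of $T$ and $\sigma_a$ are illustrative):

```python
import numpy as np

T, sig_a = 0.1, 1.0
F = np.array([[1.0, T], [0.0, 1.0]])
G = np.array([[T**2 / 2], [T]])
GQGt = sig_a**2 * (G @ G.T)            # effective state-noise covariance

P1 = GQGt                              # one step from P_0 = 0
P2 = F @ P1 @ F.T + GQGt               # two steps

expected = sig_a**2 * np.array([[2.5 * T**4, 2.0 * T**3],
                                [2.0 * T**3, 2.0 * T**2]])
assert np.allclose(P2, expected)       # matches the hand computation
```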
Realizations of a Linear Gaussian State-Space Model
Sample trajectories of the state and observations for the constant-velocity model. As $\sigma_a$ increases, trajectories diverge faster; as $\sigma_v$ increases, observations become noisier around the true path.
Common Mistake: Coloured Noise Is Not Allowed (Directly)
Mistake:
Students often apply the Kalman filter with process or observation noise that is correlated across time, e.g. generated as a filtered version of a white sequence.
Correction:
The derivation requires $\{w_k\}$ and $\{v_k\}$ to be white (temporally uncorrelated). If the noise is coloured, the standard recursion is no longer optimal. The standard fix is to augment the state: model the coloured noise itself as the output of a linear system driven by white noise, append its state to $x_k$, and rewrite the model. The filter then runs on the augmented state. This is a routine manoeuvre, but forgetting to do it silently destroys optimality.
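The augmentation is mechanical. A sketch for AR(1)-coloured process noise entering the state equation additively; the specific model $x_{k+1} = F x_k + c_k$, $c_{k+1} = a c_k + w_k$ with white $w_k$, and the helper name, are assumptions chosen for illustration:

```python
import numpy as np

def augment_ar1_process_noise(F, H, a):
    """Given x_{k+1} = F x_k + c_k with AR(1)-coloured c_{k+1} = a c_k + w_k
    (w_k white), build matrices for the augmented state z_k = [x_k; c_k]."""
    nx = F.shape[0]
    F_aug = np.block([[F,                  np.eye(nx)],
                      [np.zeros((nx, nx)), a * np.eye(nx)]])
    G_aug = np.vstack([np.zeros((nx, nx)), np.eye(nx)])  # white w_k drives only c_k
    H_aug = np.hstack([H, np.zeros((H.shape[0], nx))])   # sensor still sees only x_k
    return F_aug, G_aug, H_aug

F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
F_aug, G_aug, H_aug = augment_ar1_process_noise(F, H, a=0.9)
```

The augmented model is again an LGSS driven by white noise, so the standard filter applies to $z_k$ unchanged.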
Block Diagram of a Linear Gaussian State-Space Model
Key Takeaway
A linear Gaussian state-space model is a Markov chain whose transition is a linear Gaussian kernel and whose observations are linear Gaussian functions of the state. Two recursions, one for the mean and one for the covariance, fully describe its marginal moments. The Kalman filter is what you get when you condition those moments on observations.