The Viterbi Algorithm for MLSE
From Brute Force to Dynamic Programming
Theorem TMarkov Structure of the ISI Likelihood established that the ML metric is additive along paths through a finite state graph. Whenever an optimization objective decomposes this way, dynamic programming applies: at each stage, discard any partial path that cannot possibly be extended to the global optimum. For finite-memory channels the state graph is called a trellis, and the dynamic-programming procedure is the Viterbi algorithm. The same machinery appears in convolutional decoding, hidden Markov models, and speech recognition; it is one of the most widely applied structural algorithms in all of engineering.
The payoff is dramatic. Brute-force MLSE examines $M^N$ sequences; Viterbi examines $N M^{L+1}$ state transitions. For $M = 2$, $L = 2$, $N = 100$, the ratio is $2^{100} \approx 10^{30}$ versus $800$: about $10^{27}$ fewer operations, and the answer is identical.
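These counts can be checked in a few lines (alphabet size, memory, and block length chosen here purely for illustration):

```python
# Operation counts: brute-force MLSE vs. the Viterbi forward sweep.
# Illustrative values: binary alphabet, memory 2, block length 100.
M, L, N = 2, 2, 100

brute_force = M ** N            # candidate sequences to score
viterbi = N * M ** (L + 1)      # branch-metric evaluations overall

print(f"brute force: {brute_force:.2e} sequences")
print(f"Viterbi:     {viterbi} branch evaluations")
print(f"savings:     ~{brute_force // viterbi:.1e}x")
```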
Definition: Trellis
Trellis
The trellis of an ISI channel with alphabet $\mathcal{A}$ of size $M$ and memory $L$ is the directed graph whose vertices are pairs $(k, \sigma)$ with $\sigma \in \mathcal{A}^L$, and whose edges connect $(k, \sigma)$ to $(k+1, \sigma')$ whenever the transition is consistent with shifting in a new symbol $a \in \mathcal{A}$:
$$\sigma = (s_{k-1}, \dots, s_{k-L}) \;\longrightarrow\; \sigma' = (s_k, s_{k-1}, \dots, s_{k-L+1}), \qquad s_k = a.$$
Each vertex has in-degree $M$ (one predecessor per choice of the symbol $s_{k-L}$ shifted out) and out-degree $M$ (one successor per choice of $a$). An edge carries the branch metric $\lambda_k(\sigma, \sigma') = \left| y[k] - \sum_{\ell=0}^{L} h_\ell \, s_{k-\ell} \right|^2$.
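A minimal sketch of this definition in code (the function names, the BPSK alphabet, and the taps passed to the metric are illustrative choices, not from the text):

```python
from itertools import product

def build_trellis(alphabet, L):
    """Enumerate trellis states and the shift-register transitions."""
    states = list(product(alphabet, repeat=L))     # sigma = (s[k-1], ..., s[k-L])
    edges = {}                                     # (sigma, a) -> sigma'
    for sigma in states:
        for a in alphabet:
            edges[(sigma, a)] = (a,) + sigma[:-1]  # shift in the new symbol a
    return states, edges

def branch_metric(y_k, a, sigma, h):
    """Squared-error branch metric |y[k] - sum_l h[l] * s[k-l]|^2."""
    recent = (a,) + sigma                          # (s[k], s[k-1], ..., s[k-L])
    y_hat = sum(h_l * s for h_l, s in zip(h, recent))
    return abs(y_k - y_hat) ** 2

# BPSK, memory 1: two states, each with in-degree and out-degree M = 2.
states, edges = build_trellis((+1, -1), L=1)
```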
Definition: Path Metric and Survivor
Path Metric and Survivor
The path metric at node $(k, \sigma)$ is the minimum cumulative branch cost along any path from the known initial state $\sigma_0$ to state $\sigma$ at time $k$:
$$\Gamma_k(\sigma) = \min_{\text{paths } \sigma_0 \to \sigma} \; \sum_{j=0}^{k-1} \lambda_j(\sigma_j, \sigma_{j+1}).$$
The survivor at $(k, \sigma)$ is the unique path attaining this minimum (ties broken arbitrarily). The surviving predecessor of $(k, \sigma)$ is the state $\pi_k(\sigma) = \sigma'$ such that $\Gamma_k(\sigma) = \Gamma_{k-1}(\sigma') + \lambda_{k-1}(\sigma', \sigma)$.
Only one survivor is kept per state per time step. The $M$ edges arriving at each state are compared, the best is retained, and the other $M - 1$ are discarded. This pruning is what keeps the algorithm's memory bounded: at any instant only $M^L$ survivors are alive, regardless of how far back the trellis extends.
Theorem: Viterbi Correctness: Dynamic-Programming Recursion
Given the additive branch metrics of TMarkov Structure of the ISI Likelihood, the path metrics satisfy the recursion
$$\Gamma_{k+1}(\sigma) = \min_{\sigma' \in \mathcal{P}(\sigma)} \left[ \Gamma_k(\sigma') + \lambda_k(\sigma', \sigma) \right],$$
where $\mathcal{P}(\sigma)$ is the set of predecessors of state $\sigma$. The state $\hat{\sigma} = \arg\min_{\sigma} \Gamma_N(\sigma)$ together with the chain of surviving predecessors yields $\hat{\mathbf{s}}$, the exact solution of the MLSE problem.
Bellman's principle of optimality: an optimal path through the trellis has the property that every sub-path (from the start to any intermediate node) is itself optimal for reaching that node. Hence if two partial paths end at the same state, only the one with smaller cumulative metric can ever be a prefix of the global optimum; the other can be discarded without loss.
Use induction on $k$, assuming $\Gamma_k(\sigma')$ is the cost of the best length-$k$ path to $\sigma'$ for every $\sigma'$.
Any length-$(k+1)$ path to $\sigma$ decomposes into a length-$k$ path to some predecessor $\sigma' \in \mathcal{P}(\sigma)$ followed by the edge $\sigma' \to \sigma$.
Minimize over $\sigma'$ to obtain $\Gamma_{k+1}(\sigma) = \min_{\sigma'} \left[ \Gamma_k(\sigma') + \lambda_k(\sigma', \sigma) \right]$.
Base case
At time $k = 0$ the only state with finite cost is the known initial state $\sigma_0$, and $\Gamma_0(\sigma_0) = 0$ by the empty-sum convention. All other states have $\Gamma_0(\sigma) = +\infty$, reflecting that no trellis path has yet reached them.
Inductive hypothesis
Assume that at time $k$, for every $\sigma$, $\Gamma_k(\sigma)$ equals the cumulative branch-metric cost of the best length-$k$ path from $\sigma_0$ to $\sigma$.
Path decomposition
Any length-$(k+1)$ trellis path ending at state $\sigma$ has a unique predecessor state $\sigma'$ visited at time $k$. Its cumulative cost is the cost of the length-$k$ prefix plus the edge metric $\lambda_k(\sigma', \sigma)$.
Minimizing over predecessors
Minimizing the cumulative cost over all length-$(k+1)$ paths to $\sigma$ therefore decomposes as
$$\Gamma_{k+1}(\sigma) = \min_{\sigma' \in \mathcal{P}(\sigma)} \left[ \Gamma_k(\sigma') + \lambda_k(\sigma', \sigma) \right],$$
which is the stated recursion.
Global optimality and traceback
By induction $\Gamma_N(\sigma)$ equals the minimum cumulative cost of any length-$N$ path to $\sigma$. The minimum over terminal states, $\hat{\sigma} = \arg\min_{\sigma} \Gamma_N(\sigma)$, identifies the optimal path's endpoint. Walking backward along surviving predecessors recovers the entire optimal state sequence, from which $\hat{\mathbf{s}}$ is read off. Because the negative log-likelihood equals the cumulative metric plus a constant (theorem TMarkov Structure of the ISI Likelihood), minimizing the cumulative metric is equivalent to maximizing the likelihood.
Key Takeaway
The Viterbi algorithm converts the exponential-in-$N$ MLSE search into a forward sweep with $M^{L+1}$ additions and comparisons per time step. Its correctness follows directly from the Markov factorization of the log-likelihood and Bellman's principle.
Viterbi Algorithm for MLSE
Complexity: $O(N M^{L+1})$ operations, $O(N M^L)$ memory (or $O(D M^L)$ with a sliding-window traceback of depth $D$).
In practice the full traceback array is prohibitive for long blocks. A sliding window of $D$ columns is maintained; paths older than $D$ time steps are deemed to have merged (an empirical fact for most channels). This converts the memory cost to $O(D M^L)$ with negligible loss in performance.
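The forward sweep and traceback can be rendered as a short reference implementation. This is a sketch in plain Python: the squared-error branch metric matches the one used in this chapter, while the function name and interface are my own.

```python
import math
from itertools import product

def viterbi_mlse(y, h, alphabet, s_init):
    """Exact MLSE over an ISI channel with taps h = (h[0], ..., h[L]).

    y        : observed samples y[0..N-1]
    alphabet : symbol alphabet (size M)
    s_init   : known initial state (s[-1], ..., s[-L]), newest first
    Returns (best_metric, best_symbol_sequence).
    """
    L = len(h) - 1
    states = list(product(alphabet, repeat=L))
    Gamma = {s: (0.0 if s == tuple(s_init) else math.inf) for s in states}
    survivors = {s: [] for s in states}

    for y_k in y:
        new_Gamma = {s: math.inf for s in states}
        new_surv = {s: [] for s in states}
        for sigma in states:                     # predecessor state
            if math.isinf(Gamma[sigma]):
                continue                         # unreachable so far
            for a in alphabet:                   # hypothesized new symbol
                succ = (a,) + sigma[:-1]         # shift-register transition
                y_hat = h[0] * a + sum(h_l * s for h_l, s in zip(h[1:], sigma))
                cand = Gamma[sigma] + abs(y_k - y_hat) ** 2
                if cand < new_Gamma[succ]:       # keep only the survivor
                    new_Gamma[succ] = cand
                    new_surv[succ] = survivors[sigma] + [a]
        Gamma, survivors = new_Gamma, new_surv

    best = min(states, key=lambda s: Gamma[s])   # terminal minimum
    return Gamma[best], survivors[best]
```

For instance, `viterbi_mlse([1.2, -0.7, -1.4], (1.0, 0.5), (+1, -1), (+1,))` runs a two-state BPSK trellis with assumed taps $(1, 0.5)$; note that only $M^L$ survivor lists are alive at any instant, exactly the pruning described above.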
Viterbi Trellis Traversal on a Two-Tap BPSK Channel
Watch the algorithm build path metrics and highlight the maximum-likelihood path through a two-state trellis. Adjust the SNR and the channel tap to see how noise and ISI strength shape the surviving paths.
The Viterbi Algorithm Step by Step
Example: Complete Viterbi Trace for $N = 3$
Using the same channel $h = (h_0, h_1) = (1, 0.5)$, BPSK alphabet $\{+1, -1\}$, and initial state $s_{-1} = +1$ as in EBranch Metrics for a Two-Tap Channel with BPSK, now suppose the receiver observes $y[0] = 1.2$, $y[1] = -0.7$, $y[2] = -1.4$. Run the Viterbi forward recursion to completion, record all surviving path metrics, and perform the traceback to recover $\hat{\mathbf{s}}$.
Initialization
The state space is $\{+1, -1\}$ (labelled by the most recent symbol). Set $\Gamma_0(+1) = 0$, $\Gamma_0(-1) = +\infty$.
Step $k=0$, observation $y[0]=1.2$
From the known state $+1$ only, two outgoing branches exist:
- $s_0 = +1$: $\hat{y} = h_0 + h_1 = 1.5$, $\lambda = (1.2 - 1.5)^2 = 0.09$, so $\Gamma_1(+1) = 0.09$
- $s_0 = -1$: $\hat{y} = -h_0 + h_1 = -0.5$, $\lambda = (1.2 + 0.5)^2 = 2.89$, so $\Gamma_1(-1) = 2.89$
Surviving predecessors: $\pi_1(+1) = \pi_1(-1) = +1$.
Step $k=1$, observation $y[1]=-0.7$
Evaluate all four incoming branches at each state:
To $+1$ (i.e. $s_1 = +1$):
- from $+1$: $\hat{y} = 1.5$, $\lambda = (-0.7 - 1.5)^2 = 4.84$, cand $0.09 + 4.84 = 4.93$
- from $-1$: $\hat{y} = 0.5$, $\lambda = (-0.7 - 0.5)^2 = 1.44$, cand $2.89 + 1.44 = 4.33$
To $-1$ (i.e. $s_1 = -1$):
- from $+1$: $\hat{y} = -0.5$, $\lambda = (-0.7 + 0.5)^2 = 0.04$, cand $0.09 + 0.04 = 0.13$
- from $-1$: $\hat{y} = -1.5$, $\lambda = (-0.7 + 1.5)^2 = 0.64$, cand $2.89 + 0.64 = 3.53$
Therefore $\Gamma_2(+1) = 4.33$ with $\pi_2(+1) = -1$, and $\Gamma_2(-1) = 0.13$ with $\pi_2(-1) = +1$.
Step $k=2$, observation $y[2]=-1.4$
To $+1$ (i.e. $s_2 = +1$):
- from $+1$: $\hat{y} = 1.5$, $\lambda = (-1.4 - 1.5)^2 = 8.41$, cand $4.33 + 8.41 = 12.74$
- from $-1$: $\hat{y} = 0.5$, $\lambda = (-1.4 - 0.5)^2 = 3.61$, cand $0.13 + 3.61 = 3.74$
To $-1$ (i.e. $s_2 = -1$):
- from $+1$: $\hat{y} = -0.5$, $\lambda = (-1.4 + 0.5)^2 = 0.81$, cand $4.33 + 0.81 = 5.14$
- from $-1$: $\hat{y} = -1.5$, $\lambda = (-1.4 + 1.5)^2 = 0.01$, cand $0.13 + 0.01 = 0.14$
Thus $\Gamma_3(+1) = 3.74$ with $\pi_3(+1) = -1$, and $\Gamma_3(-1) = 0.14$ with $\pi_3(-1) = -1$.
Traceback
The terminal minimum is $\Gamma_3(-1) = 0.14$, so $\hat{s}_2 = -1$. Following the surviving predecessors: $\pi_3(-1) = -1$ gives $\hat{s}_1 = -1$, $\pi_2(-1) = +1$ gives $\hat{s}_0 = +1$, and the known $s_{-1} = +1$.
Reading off the symbols (each state equals the most recent symbol): $\hat{\mathbf{s}} = (\hat{s}_0, \hat{s}_1, \hat{s}_2) = (+1, -1, -1)$.
Compare to the brute-force total from EBranch Metrics for a Two-Tap Channel with BPSK: Viterbi arrived at the identical answer without enumerating all $2^3 = 8$ candidate paths of length $N = 3$.
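A brute-force cross-check of the trace above, assuming (as in the worked numbers) taps $h = (1, 0.5)$ and known initial symbol $s_{-1} = +1$: enumerate all $2^3 = 8$ candidate sequences and confirm the same winner.

```python
from itertools import product

y = [1.2, -0.7, -1.4]
h0, h1 = 1.0, 0.5        # assumed two-tap channel from the earlier example
s_prev0 = +1             # known initial symbol s[-1]

best = None
for s in product((+1, -1), repeat=3):      # all 2^3 candidate sequences
    total, s_prev = 0.0, s_prev0
    for y_k, s_k in zip(y, s):
        total += (y_k - h0 * s_k - h1 * s_prev) ** 2  # cumulative metric
        s_prev = s_k
    if best is None or total < best[0]:
        best = (total, s)

print(best)   # the ML sequence and its cumulative metric
```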
Brute-Force MLSE vs Viterbi
| Quantity | Brute-force MLSE | Viterbi |
|---|---|---|
| Number of sequences / paths examined | $M^N$ | $M^L$ survivors ($N M^{L+1}$ branch evaluations) |
| Memory | $O(N)$ per path, $M^N$ paths | $O(D M^L)$ for traceback window |
| Output | Exact ML sequence | Exact ML sequence |
| Dependence on block length $N$ | Exponential | Linear |
| Dependence on channel memory $L$ | None (in complexity) | Exponential ($M^L$ states) |
| Dependence on alphabet size $M$ | Exponential ($M^N$) | Linear in $M$ per state, exponential in $L$ |
Common Mistake: Infinite Traceback Is Unnecessary, but Don't Cut It Too Short
Mistake:
Readers sometimes either (a) store the entire traceback matrix, wasting memory for long blocks, or (b) pick a traceback depth that is too short, causing noisy or incorrect decisions near the current time.
Correction:
A rule of thumb is to use a sliding traceback window of $D \approx 5L$ to $10L$. Empirically this is enough for surviving paths to have merged into a common prefix with probability close to one, so the decision at time $k - D$ is nearly as good as with full traceback. Cutting $D$ below about $5L$ produces noticeable BER degradation.
Quick Check
A system uses 16-QAM modulation ($M = 16$) over a channel with memory $L = 2$. Roughly how many multiply-accumulates does the Viterbi algorithm require per transmitted symbol?
$16^3 = 4096$ per symbol ($\approx 4.1 \times 10^5$ for block length 100)
The trellis has $16^2 = 256$ states, and at each time step each state has $16$ incoming branches to compare; roughly $16^3 = 4096$ branch metric evaluations per received sample. This is orders of magnitude larger than a linear equalizer, which is why Viterbi for 16-QAM is usually avoided for $L \ge 2$.
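The arithmetic behind this estimate, with the memory-2 assumption made explicit:

```python
M, L = 16, 2                       # 16-QAM alphabet over a memory-2 channel
states = M ** L                    # trellis states: 16^2 = 256
branches = states * M              # branch metrics per sample: 16^3 = 4096
print(states, branches)
```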
Historical Note: Ungerboeck's Whitened-Matched-Filter Receiver (1974)
Two years after Forney, Gottfried Ungerboeck showed in a 1974 paper that the Viterbi algorithm can be applied after a whitened-matched-filter front end, producing branch metrics of a particularly simple form that directly expose the channel's minimum-phase equivalent. Ungerboeck's formulation is the one implemented in most wireline modem receivers and in the GSM/EDGE equalizer; it also provides the conceptual bridge from MLSE to the MMSE-DFE that closes this chapter. Together, Forney (1972) and Ungerboeck (1974) defined the modern textbook treatment of sequence estimation over ISI channels.
Viterbi algorithm
A dynamic-programming procedure that computes the minimum-cost path through a trellis by propagating surviving path metrics forward one stage at a time, pruning dominated partial paths. Applied to the ISI trellis, it produces the maximum-likelihood symbol sequence in time linear in the block length.
Related: Maximum-likelihood sequence estimation (MLSE), Trellis
Trellis
The time-unrolled state graph of a finite-memory channel or code. Vertices are (time, state) pairs; edges correspond to state transitions driven by the input symbol; each edge carries a branch metric used by the Viterbi algorithm to rank candidate paths.
Related: Viterbi algorithm
Why This Matters: Viterbi-MLSE in GSM
The GSM receiver runs a Viterbi equalizer on a channel of nominal memory $L = 4$ with a binary-like Gaussian MSK alphabet. Because the alphabet is effectively binary ($M = 2$) after pre-processing, the trellis has $2^4 = 16$ states; very small. This is exactly the regime where MLSE is cheap and optimal. When LTE moved to wider bandwidths and much longer delay spreads, the trellis became infeasibly large and OFDM was adopted precisely because its per-subcarrier processing side-steps the trellis entirely (see Book 1, Chapter 14).