Maximum-Likelihood Sequence Estimation

Optimal Detection on ISI Channels

Both linear equalizers and the DFE make decisions symbol by symbol, without considering the entire received sequence. The optimal approach --- maximum-likelihood sequence estimation (MLSE) --- searches over all possible transmitted sequences to find the one most likely to have produced the observed output. MLSE is implemented efficiently using the Viterbi algorithm, the same dynamic programming technique used for convolutional code decoding. The price is exponential complexity in channel memory.

Definition:

Maximum-Likelihood Sequence Estimation (MLSE)

Given a received sequence $\mathbf{y} = [y_0, y_1, \ldots, y_{K-1}]$ and channel $h[n],\; n = 0, \ldots, L$, the MLSE detector finds

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} \; p(\mathbf{y} \mid \mathbf{a})$$

For AWGN, this is equivalent to minimising the sequence metric:

$$\hat{\mathbf{a}} = \arg\min_{\mathbf{a}} \sum_{k=0}^{K-1} \left| y_k - \sum_{n=0}^{L} h[n]\, a_{k-n} \right|^2$$

The minimisation is over all $M^K$ possible symbol sequences, where $M$ is the alphabet size and $K$ is the sequence length. Direct enumeration is infeasible, but the ISI channel has a trellis structure that enables the Viterbi algorithm.
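Before turning to the trellis, the sequence metric can be made concrete with a deliberately naive brute-force search over all $M^K$ candidates (a sketch for illustration only; the function and variable names are mine, not from the text):

```python
import itertools

def mlse_brute_force(y, h, alphabet=(-1.0, 1.0)):
    """Exhaustive MLSE: score every candidate sequence, keep the minimum.

    Implements the sequence metric directly; feasible only for tiny K.
    Assumes a zero initial state (a_k = 0 for k < 0).
    """
    L = len(h) - 1
    best_seq, best_cost = None, float("inf")
    for a in itertools.product(alphabet, repeat=len(y)):
        cost = 0.0
        for k in range(len(y)):
            # Predicted channel output: sum_n h[n] * a_{k-n}
            pred = sum(h[n] * a[k - n] for n in range(L + 1) if k - n >= 0)
            cost += abs(y[k] - pred) ** 2
        if cost < best_cost:
            best_seq, best_cost = list(a), cost
    return best_seq, best_cost
```

With a noiseless observation such as $\mathbf{y} = [1, -0.3, 0.3]$ for $h = [1, 0.7]$, the search recovers the transmitted BPSK sequence with (near-)zero metric; the cost of scoring $2^K$ sequences is exactly the blow-up the Viterbi algorithm avoids.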


Definition:

Branch Metric

In the Viterbi algorithm for MLSE, the branch metric at time $k$ for a state transition from $\sigma_{k-1}$ to $\sigma_k$ (corresponding to input symbol $a_k$) is

$$\lambda_k(\sigma_{k-1}, \sigma_k) = \left| y_k - \sum_{n=0}^{L} h[n]\, a_{k-n} \right|^2$$

The path metric is the cumulative sum of branch metrics along a path through the trellis. The Viterbi algorithm retains only the minimum-metric survivor path entering each state.
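Each branch metric is a single squared-error term per trellis edge. A minimal helper illustrates this (the names and the `recent` layout are my own convention, assuming the state stores the previous $L$ symbols):

```python
def branch_metric(y_k, h, recent):
    """Squared-error branch metric |y_k - sum_n h[n] a_{k-n}|^2.

    `recent` = (a_k, a_{k-1}, ..., a_{k-L}): the hypothesised input followed
    by the L symbols that define the predecessor state (names illustrative).
    """
    predicted = sum(h_n * a for h_n, a in zip(h, recent))
    return abs(y_k - predicted) ** 2
```

For example, with $h = [1, 0.7]$ and hypothesis $(a_k, a_{k-1}) = (+1, +1)$, the predicted output is $1.7$, so a received sample of $1.7$ yields a zero branch metric.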

Theorem: Optimality of MLSE

For an ISI channel with AWGN, the MLSE detector minimises the sequence error probability $P(\hat{\mathbf{a}} \neq \mathbf{a})$ among all detectors. The pairwise error probability between sequences $\mathbf{a}$ and $\mathbf{a}'$ is

$$P(\mathbf{a} \to \mathbf{a}') = Q\!\left( \sqrt{\frac{\|\mathbf{H}(\mathbf{a} - \mathbf{a}')\|^2}{2 N_0}} \right)$$

where $\mathbf{H}$ is the channel convolution matrix. The performance is dominated by the minimum distance between valid sequences:

$$d_{\text{free}}^2 = \min_{\mathbf{a} \neq \mathbf{a}'} \|\mathbf{H}(\mathbf{a} - \mathbf{a}')\|^2$$

MLSE is optimal because it considers the entire received sequence jointly, exploiting the trellis structure of ISI. The Viterbi algorithm makes this tractable by discarding suboptimal paths at each trellis stage. The free distance $d_{\text{free}}$ plays the same role as $d_{\min}$ in memoryless detection --- it determines the asymptotic BER.
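One way to see $d_{\text{free}}$ concretely is to enumerate short BPSK error events $\mathbf{e} = \mathbf{a} - \mathbf{a}'$ and minimise $\|\mathbf{H}\mathbf{e}\|^2$ directly. A small search sketch (truncated to short events, so it is an estimate rather than a proof; names are mine):

```python
import itertools

def d_free_sq(h, max_len=6):
    """Estimate d_free^2 over BPSK error events by direct search.

    Error symbols e_k = a_k - a'_k lie in {-2, 0, +2}; events are scanned up
    to length `max_len`, an illustrative truncation of the true minimisation.
    """
    best = float("inf")
    for K in range(1, max_len + 1):
        for e in itertools.product((-2.0, 0.0, 2.0), repeat=K):
            if e[0] == 0.0:
                continue  # canonical events start with a nonzero symbol
            # ||H e||^2: squared norm of the full convolution h * e
            dist = sum(
                abs(sum(h[n] * e[k - n]
                        for n in range(len(h)) if 0 <= k - n < K)) ** 2
                for k in range(K + len(h) - 1)
            )
            best = min(best, dist)
    return best
```

For $h = [1, 0.7]$ the search returns $5.96$, attained by a single-symbol error event: $\|\mathbf{H}\mathbf{e}\|^2 = 4\sum_n |h[n]|^2 = 4(1 + 0.49)$.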

Viterbi Algorithm for ISI Channels

Complexity: Time: $O(K \cdot M^L \cdot M)$ per sequence, where $M^L$ is the number of trellis states and $M$ transitions leave each state. Space: $O(K \cdot M^L)$ for storing survivor paths. For binary signalling ($M = 2$) and channel memory $L$: $2^L$ states, $2^{L+1}$ branch metric computations per time step.
Input: Received sequence $\{y_0, \ldots, y_{K-1}\}$, channel taps $h[0], \ldots, h[L]$, alphabet $\mathcal{A}$
Output: ML sequence estimate $\hat{\mathbf{a}}$
1. Initialise: Set path metric $\Gamma_0(\sigma) = 0$ for the all-zero state, $\infty$ otherwise.
2. For $k = 0, 1, \ldots, K-1$:
a. For each state $\sigma_k$ (representing the $L$ most recent symbols $(a_k, \ldots, a_{k-L+1})$):
- For each predecessor state $\sigma_{k-1}$ with transition symbol $a_k$:
$$\lambda_k = \left| y_k - \sum_{n=0}^{L} h[n]\, a_{k-n} \right|^2$$
$$\Gamma_k^{\text{cand}}(\sigma_k) = \Gamma_{k-1}(\sigma_{k-1}) + \lambda_k$$
- Select survivor: $\Gamma_k(\sigma_k) = \min_{\sigma_{k-1}} \Gamma_k^{\text{cand}}(\sigma_k)$
- Store the predecessor that achieved the minimum (survivor path).
3. Traceback: Starting from the state with minimum $\Gamma_{K-1}(\sigma)$, trace back through the stored predecessors to recover $\hat{\mathbf{a}}$.

The complexity grows exponentially with channel memory $L$. For $L > 5$--$6$, the Viterbi algorithm becomes impractical and reduced-state approaches (RSSE, delayed decision-feedback sequence estimation) are needed.
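The steps above can be sketched directly in code. A compact dictionary-based implementation, assuming a zero initial state (illustrative and unoptimised; survivor paths are stored whole rather than traced back):

```python
def viterbi_mlse(y, h, alphabet=(-1.0, 1.0)):
    """Viterbi MLSE following the algorithm steps (zero initial state).

    States are tuples (a_k, ..., a_{k-L+1}); path metrics and survivor
    paths live in dictionaries keyed by state.
    """
    L = len(h) - 1
    Gamma = {(0.0,) * L: 0.0}      # step 1: all-zero state has metric 0
    survivors = {(0.0,) * L: []}   # (unreachable states simply never appear)
    for y_k in y:                  # step 2: one trellis stage per sample
        new_Gamma, new_surv = {}, {}
        for prev, g in Gamma.items():
            for a_k in alphabet:
                recent = (a_k,) + prev               # (a_k, ..., a_{k-L})
                pred = sum(h[n] * recent[n] for n in range(L + 1))
                cand = g + abs(y_k - pred) ** 2      # path + branch metric
                nxt = recent[:L]                     # shift-register update
                if nxt not in new_Gamma or cand < new_Gamma[nxt]:
                    new_Gamma[nxt] = cand            # survivor selection
                    new_surv[nxt] = survivors[prev] + [a_k]
        Gamma, survivors = new_Gamma, new_surv
    best = min(Gamma, key=Gamma.get)  # step 3: best terminal state
    return survivors[best], Gamma[best]
```

Using dictionaries means only reachable states are ever stored, so the all-zero initial state needs no special $\infty$ initialisation; for BPSK and $L = 2$ the inner loops visit at most $2^2 \times 2 = 8$ branches per stage.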


BER Comparison: Linear EQ vs. DFE vs. MLSE

Compare the BER performance of four receivers on a multipath channel: no equalization, linear MMSE, MMSE-DFE, and MLSE. The matched-filter bound is shown as a reference.


Viterbi Algorithm Step by Step

Watch the Viterbi algorithm process received samples on a 2-state ISI trellis. Survivor paths (green) are retained; suboptimal paths (red) are discarded at each stage.
The Viterbi algorithm retains one survivor per state, pruning the exponentially growing set of candidate sequences down to $M^L$ survivors, so the work per time step stays constant and the total complexity grows only linearly with sequence length.

Example: MLSE via Viterbi on a Two-Tap Channel

Apply the Viterbi algorithm to detect the BPSK sequence transmitted over the channel $h = [1,\; 0.7]$, given the noisy received samples $\mathbf{y} = [0.8,\; 1.5,\; -0.4,\; -1.6]$. Assume a zero initial state.
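Since there are only $2^4 = 16$ candidate BPSK sequences here, the trellis result can be sanity-checked by brute force (a throwaway script; names are mine):

```python
from itertools import product

# Exhaustive check of the example: h = [1, 0.7], y = [0.8, 1.5, -0.4, -1.6].
h = [1.0, 0.7]
y = [0.8, 1.5, -0.4, -1.6]

def seq_metric(a):
    """Sequence metric sum_k |y_k - (h[0] a_k + h[1] a_{k-1})|^2, zero initial state."""
    total = 0.0
    for k in range(len(y)):
        pred = sum(h[n] * a[k - n] for n in range(len(h)) if k - n >= 0)
        total += (y[k] - pred) ** 2
    return total

# Only 2^4 = 16 BPSK candidates, so score them all.
best = min(product((-1.0, 1.0), repeat=4), key=seq_metric)
print(list(best), round(seq_metric(best), 4))
```

The minimum-metric sequence is $[+1, +1, -1, -1]$ with metric $\approx 0.10$; a Viterbi search over the 2-state trellis reaches the same answer while scoring only $2 \times 2 = 4$ branches per stage.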

Quick Check

For a channel with memory $L = 4$ and 4-QAM modulation ($M = 4$), how many trellis states does the Viterbi MLSE detector have?

$4^4 = 256$ states

$2^4 = 16$ states

$4 \times 4 = 16$ states

$4 + 4 = 8$ states

Why This Matters: MLSE in GSM

The GSM cellular standard specifies a channel equalizer based on the Viterbi algorithm. The GSM channel is modelled with a typical memory of $L = 4$--$5$ symbol periods. With GMSK modulation (approximately binary), the trellis has $2^5 = 32$ states --- computationally feasible for a mobile handset. Each GSM burst includes a 26-bit training sequence (midamble) used to estimate the channel taps before running the Viterbi equalizer. This combination of training-based channel estimation and MLSE equalization was a key innovation that enabled reliable communication over the highly dispersive urban radio channel.

See full treatment in Chapter 14

Historical Note: Viterbi Algorithm

1967--1974

Andrew Viterbi published his algorithm in 1967 for decoding convolutional codes, but it was soon recognised as a general solution for maximum-likelihood detection on any system with a finite-state trellis representation. Forney (1972) showed that the ISI channel has a natural trellis structure and that the Viterbi algorithm provides MLSE. Ungerboeck (1974) combined this with adaptive channel estimation, creating the first practical MLSE receiver for data modems. The Viterbi algorithm remains one of the most widely used algorithms in communications, with applications in equalization, decoding, speech recognition, and bioinformatics.

⚠️ Engineering Note

MLSE in Wireless Standards β€” Where It Works and Where It Fails

MLSE via the Viterbi algorithm has been deployed in real systems, but only when the state count is manageable:

  • GSM (2G): GMSK modulation ($M \approx 2$), $L = 5$ gives $2^5 = 32$ states. Feasible on 1990s hardware. Every GSM phone contains a Viterbi equalizer.
  • EDGE (2.5G): 8-PSK ($M = 8$), $L = 4$ gives $8^4 = 4096$ states. Reduced-state approaches (M-BCJR, DDFSE) are used to bring complexity to $\sim 128$ effective states.
  • 3G WCDMA: Replaced equalization with RAKE receivers and later chip-level equalizers (linear/DFE), avoiding MLSE entirely.
  • 4G LTE / 5G NR: OFDM eliminates the need for time-domain equalization. The only "equalization" is per-subcarrier MMSE.
  • Practical limit: $M^L \lesssim 256$ states is the rough boundary for real-time MLSE on current hardware. Beyond this, reduced-state or turbo equalization is required.
Practical Constraints
  • Real-time MLSE feasible for $M^L \lesssim 256$ trellis states
  • GSM: 32 states (GMSK, $L = 5$); EDGE: reduced from 4096 to ~128 effective states
  • OFDM in 4G/5G eliminates the need for MLSE entirely

📋 Ref: GSM 05.05 (equalization), 3GPP TS 45.004 (EDGE)

Key Takeaway

MLSE is the optimal receiver for ISI channels but its complexity grows as $O(M^L)$ --- exponential in channel memory. This makes it practical only for low-order modulation on short channels (e.g., GSM). For modern wideband systems, OFDM avoids the problem entirely by converting the frequency-selective channel into flat subchannels.

Maximum-Likelihood Sequence Estimation (MLSE)

An optimal detection strategy that finds the transmitted symbol sequence maximising the likelihood of the entire received sequence, implemented via the Viterbi algorithm on the channel trellis.

Related: Viterbi Algorithm, Equalization

Viterbi Algorithm

A dynamic programming algorithm that finds the minimum-cost path through a trellis by retaining only one survivor path per state at each time step.

Related: Maximum-Likelihood Sequence Estimation (MLSE)

Branch Metric

The cost assigned to a single state transition in the Viterbi trellis, typically the squared Euclidean distance between the received sample and the predicted output for that transition.

Related: Viterbi Algorithm, Maximum-Likelihood Sequence Estimation (MLSE)