Examples of Factor Graphs
One Language, Many Models
The power of factor graphs is their universality. In this section we draw the factor graphs of six canonical problems from communications and signal processing. The reader will notice a striking pattern: the algorithms we developed separately for each (Viterbi for ISI, iterative decoding for LDPC, Kalman filtering for state-space models, MMSE for MIMO) are all the same algorithm applied to different factor graphs.
This is the payoff of the abstraction. Once we understand message passing on a factor graph (Chapter 18), we have understood all of these classical algorithms.
Definition: Markov Chain as a Factor Graph
Markov Chain as a Factor Graph
A Markov chain with transition probabilities $p(x_t \mid x_{t-1})$ and initial distribution $p(x_1)$ has joint
$$p(x_1, \dots, x_n) = p(x_1) \prod_{t=2}^{n} p(x_t \mid x_{t-1}).$$
Its factor graph is a chain: variable nodes $x_1, \dots, x_n$ alternating with factor nodes
- $f_1(x_1) = p(x_1)$ (prior),
- $f_t(x_{t-1}, x_t) = p(x_t \mid x_{t-1})$ for $t = 2, \dots, n$.
A chain is a tree, so exact marginals can be computed in $O(nS^2)$ time, where $S$ is the number of states. The two-pass algorithm is the forward-backward algorithm: message passing from left to right, then from right to left.
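As a concrete sketch, the two-pass computation on a small chain can be written in a few lines and checked against brute-force enumeration. The chain length, state count, and distributions below are all illustrative choices, not values from the text.

```python
import numpy as np
from itertools import product

# Hypothetical 3-state Markov chain of length 4; the prior and transition
# matrix are randomly generated purely for illustration.
rng = np.random.default_rng(0)
n, S = 4, 3
prior = rng.dirichlet(np.ones(S))          # f_1(x_1) = p(x_1)
T = rng.dirichlet(np.ones(S), size=S)      # T[i, j] = p(x_t = j | x_{t-1} = i)

# Forward pass: left-to-right messages along the chain.
alpha = np.zeros((n, S))
alpha[0] = prior
for t in range(1, n):
    alpha[t] = alpha[t - 1] @ T

# Backward pass: right-to-left messages.
beta = np.ones((n, S))
for t in range(n - 2, -1, -1):
    beta[t] = T @ beta[t + 1]

# The marginal at each variable is the normalized product of the two messages.
marginals = alpha * beta
marginals /= marginals.sum(axis=1, keepdims=True)

# Brute-force check: sum the joint over all S^n configurations.
brute = np.zeros((n, S))
for x in product(range(S), repeat=n):
    p = prior[x[0]] * np.prod([T[x[t - 1], x[t]] for t in range(1, n)])
    for t in range(n):
        brute[t, x[t]] += p
assert np.allclose(marginals, brute / brute.sum(axis=1, keepdims=True))
```

The two passes cost $O(nS^2)$, while the brute-force check costs $O(S^n)$; the assertion confirms they agree on this toy chain.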
Definition: Hidden Markov Model as a Factor Graph
Hidden Markov Model as a Factor Graph
An HMM has latent states $x_t$ (a Markov chain) and observations $y_t$ generated via emission distributions $p(y_t \mid x_t)$. Joint:
$$p(x_{1:n}, y_{1:n}) = p(x_1) \prod_{t=2}^{n} p(x_t \mid x_{t-1}) \prod_{t=1}^{n} p(y_t \mid x_t).$$
Factor graph: a horizontal chain of state variables and transition factors, with an emission factor (and observation variable) hanging off each state.
Clamping each $y_t$ to its observed value turns the emission factors into unary factors on the states. The graph is still a tree; forward-backward computes the posterior marginals exactly.
Factor Graph of a Hidden Markov Model
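A minimal numeric sketch of this clamping, with made-up transition, emission, and observation values: the observed sequence turns each emission factor into a unary factor `E[:, y[t]]`, and the same two-pass recursion as for the plain chain yields the posterior state marginals.

```python
import numpy as np

# Hypothetical 2-state HMM with binary observations; all numbers illustrative.
prior = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3],
              [0.2, 0.8]])                  # T[i, j] = p(x_t = j | x_{t-1} = i)
E = np.array([[0.9, 0.1],
              [0.3, 0.7]])                  # E[i, y] = p(y | x = i)
y = [0, 0, 1, 0]                            # observed sequence (clamped)

n, S = len(y), 2

# Forward pass: each step multiplies in the unary factor E[:, y[t]].
alpha = np.zeros((n, S))
alpha[0] = prior * E[:, y[0]]
for t in range(1, n):
    alpha[t] = (alpha[t - 1] @ T) * E[:, y[t]]

# Backward pass.
beta = np.ones((n, S))
for t in range(n - 2, -1, -1):
    beta[t] = T @ (E[:, y[t + 1]] * beta[t + 1])

# Posterior marginals p(x_t | y_1, ..., y_n).
posterior = alpha * beta
posterior /= posterior.sum(axis=1, keepdims=True)
print(posterior)
```

The only change from the plain Markov chain is the extra elementwise product with the clamped emission factor at each step, which is exactly what the factor graph picture predicts.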
Example: LDPC Code Tanner Graph
Construct the factor graph of a rate-1/2 LDPC code with a $3 \times 6$ parity-check matrix $H$. Interpret the role of the variable and factor nodes.
Variable nodes
Six variable nodes $x_1, \dots, x_6$, one per code bit.
Factor nodes
Three factor nodes $f_1, f_2, f_3$, one per parity check:
- Each factor is an indicator that its parity check is satisfied: $f_j(\mathbf{x}) = \mathbb{1}\big[\textstyle\bigoplus_{i \in N(j)} x_i = 0\big]$, where $N(j)$ is the set of bits participating in check $j$.
Code distribution
For a uniform prior over codewords, $p(\mathbf{x}) \propto \prod_j f_j(\mathbf{x})$. Decoding from channel observations multiplies these factors by channel likelihoods $p(y_i \mid x_i)$; these become additional unary factors on each $x_i$.
The Tanner graph
This bipartite graph is called the Tanner graph of the code. Variable nodes = code bits; check nodes = parity constraints. Each bit participates in multiple checks and each check covers only a handful of bits; hence "low-density".
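The structure is easy to make concrete in code. The matrix below is one illustrative choice of a full-rank $3 \times 6$ parity-check matrix, not the specific one from the example; the edges of the Tanner graph are read directly off its nonzero entries.

```python
import numpy as np
from itertools import product

# Illustrative 3x6 parity-check matrix of a rate-1/2 code (full rank over GF(2)).
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

def check_factors(x):
    """Each indicator factor f_j(x) = 1[bits in check j sum to 0 mod 2]."""
    return (H @ x) % 2 == 0

def is_codeword(x):
    # p(x) > 0 iff every parity-check factor is satisfied.
    return bool(check_factors(x).all())

# Tanner graph edges: (check j, bit i) whenever H[j, i] = 1.
edges = [(j, i) for j in range(H.shape[0]) for i in range(H.shape[1]) if H[j, i]]

# Enumerate the code: a full-rank 3x6 H leaves 2^(6-3) = 8 codewords.
codewords = [x for x in product([0, 1], repeat=6) if is_codeword(np.array(x))]
print(len(codewords))  # 8
```

Each edge list entry corresponds to one line drawn between a check node and a bit node in the Tanner graph; the low-density property means `edges` stays short relative to the full $3 \times 6 = 18$ possible edges.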
Tanner Graph of a Random $(d_v, d_c)$-Regular LDPC Code
[Interactive figure: Tanner graph of a random $(d_v, d_c)$-regular LDPC code; the variable- and check-node degrees control the graph topology.]
Definition: Convolutional Code Trellis as a Factor Graph
Convolutional Code Trellis as a Factor Graph
A rate-$k/n$ convolutional code with memory $m$ has state $\sigma_t$ collecting the last $m$ input blocks. The joint distribution over state and output sequences factorizes as
$$p(\sigma_{0:N}, y_{1:N}) \propto \prod_{t=1}^{N} T_t(\sigma_{t-1}, \sigma_t, y_t),$$
where each trellis factor encodes the allowed state transitions and the likelihood of the corresponding outputs. Factor graph: a chain of state variables $\sigma_t$ and output variables $y_t$, connected by trellis factors $T_t(\sigma_{t-1}, \sigma_t, y_t)$.
The Viterbi algorithm is max-product message passing on this factor graph. The BCJR algorithm is sum-product message passing on the same graph.
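The only difference between the two algorithms is the semiring: sum-product marginalizes, max-product maximizes. A sketch of max-product on a generic chain of trellis factors, with made-up unary and pairwise weights, checked against exhaustive search:

```python
import numpy as np
from itertools import product

# Max-product (Viterbi) on a chain factor graph; all weights are illustrative
# random values standing in for channel likelihoods and trellis factors.
rng = np.random.default_rng(1)
n, S = 5, 3
g = rng.random((n, S))       # unary factors (e.g. per-step likelihoods)
T = rng.random((S, S))       # pairwise trellis factors

# Forward max-product pass with back-pointers.
m = g[0].copy()
back = np.zeros((n, S), dtype=int)
for t in range(1, n):
    scores = m[:, None] * T * g[t][None, :]   # scores[i, j]: from state i to j
    back[t] = scores.argmax(axis=0)
    m = scores.max(axis=0)

# Backtrack the highest-weight path.
path = [int(m.argmax())]
for t in range(n - 1, 0, -1):
    path.append(int(back[t, path[-1]]))
path.reverse()

# Brute-force check over all S^n paths.
def weight(x):
    w = np.prod([g[t, x[t]] for t in range(n)])
    return w * np.prod([T[x[t - 1], x[t]] for t in range(1, n)])

best = max(product(range(S), repeat=n), key=weight)
assert list(best) == path
```

Replacing `max`/`argmax` with sums (and dropping the back-pointers) turns this exact code skeleton into the sum-product BCJR recursion on the same graph.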
Example: MIMO Detection as a Factor Graph
Construct the factor graph for the detection problem $\mathbf{y} = H\mathbf{x} + \mathbf{n}$ with $x_k \in \{-1, +1\}$ (BPSK), $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2 I)$, and uniform prior on $\mathbf{x}$. What is the structure when $H$ is full (all entries nonzero)?
Factorize the posterior
$$p(\mathbf{x} \mid \mathbf{y}) \propto \prod_{m=1}^{M} f_m(\mathbf{x}),$$
where $f_m(\mathbf{x}) = \exp\!\big(-(y_m - \mathbf{h}_m^\top \mathbf{x})^2 / (2\sigma^2)\big)$ and $\mathbf{h}_m^\top$ is the $m$-th row of $H$, for $m = 1, \dots, M$.
Factor graph structure
- Variable nodes: $x_1, \dots, x_K$ (each binary)
- Factor nodes: $f_1, \dots, f_M$ (observation likelihoods)
- Edges: an edge joins $f_m$ and $x_k$ iff $H_{mk} \neq 0$
Dense case
When $H$ is full, every factor connects to every variable: a complete bipartite graph $K_{M,K}$. This is extremely loopy; short cycles of length 4 abound. Loopy BP on this graph does not converge well, which is why MIMO detection uses Gaussian BP, expectation propagation, or AMP (Chapters 18-20).
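What the approximations are trying to match is exact MAP detection, which for tiny systems can still be done by enumerating all $2^K$ hypotheses. The sizes, noise level, and channel below are illustrative; the point is the exponential cost that rules this out at scale.

```python
import numpy as np
from itertools import product

# Exact MAP detection by enumeration for a small dense BPSK MIMO system.
# M, K, sigma, and the random H are illustrative choices.
rng = np.random.default_rng(2)
M, K, sigma = 4, 4, 0.5
H = rng.standard_normal((M, K))              # full matrix: every factor touches every bit
x_true = rng.choice([-1, 1], size=K)
y = H @ x_true + sigma * rng.standard_normal(M)

# With a uniform prior, MAP detection minimizes ||y - Hx||^2 over 2^K hypotheses.
candidates = [np.array(c) for c in product([-1, 1], repeat=K)]
x_map = min(candidates, key=lambda x: float(np.sum((y - H @ x) ** 2)))
print(x_map, x_true)
```

For $K = 4$ this is 16 hypotheses; for $K = 64$ it is $2^{64}$, which is the gap that Gaussian BP, EP, and AMP are designed to close.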
Sparse (ISI) case
For an ISI channel, $H$ is banded Toeplitz with bandwidth $L$ (the channel memory). Each $f_m$ touches only $L+1$ consecutive variables, so the graph is a chain-like structure with tree-width $L$. This is what makes BCJR exact and efficient.
Definition: ISI Channel as a Factor Graph
ISI Channel as a Factor Graph
An ISI channel $y_t = \sum_{l=0}^{L} h_l x_{t-l} + n_t$ corresponds to a factor graph where each observation $y_t$ imposes a factor connecting the $L+1$ consecutive transmitted symbols $x_{t-L}, \dots, x_t$. Prior factors enforce constellation constraints.
The tree-width of this graph is $L$. When $L$ is small (say, 2-5), exact BCJR detection is feasible ($|\mathcal{X}|^L$ states). For larger $L$, turbo equalization with soft approximations takes over.
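The sliding-window structure can be verified mechanically: build the banded Toeplitz channel matrix and check that each observation factor touches at most $L+1$ consecutive variables. The taps and block length below are made-up values for illustration.

```python
import numpy as np

# Banded Toeplitz channel matrix for an ISI channel with memory L = 2;
# the taps h are illustrative, not from the text.
h = np.array([1.0, 0.5, 0.25])   # h_0 .. h_L, so L = 2
n = 8
H = np.zeros((n, n))
for t in range(n):
    for l in range(len(h)):
        if t - l >= 0:
            H[t, t - l] = h[l]   # y_t = sum_l h_l * x_{t-l} + noise

# Each row of H is one observation factor; its nonzeros are the variables
# it connects to in the factor graph.
for t in range(n):
    touched = np.nonzero(H[t])[0]
    assert touched.size <= len(h)                        # at most L+1 variables
    assert touched.max() - touched.min() <= len(h) - 1   # and they are consecutive
```

Because every factor's neighborhood is a window of at most $L+1$ consecutive symbols, eliminating variables left to right never couples more than $L$ of them at once, which is exactly the tree-width-$L$ statement above.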
Canonical Factor Graph Structures
| Model | Factor graph topology | Tree-width | Exact inference cost |
|---|---|---|---|
| Markov chain / HMM | Chain | 1 | $O(nS^2)$, $S$ = number of states |
| Convolutional code | Trellis (chain of states) | $m$ (memory) | $O(n \cdot 2^m)$ |
| LDPC code | Sparse bipartite (Tanner graph) | Large (loopy) | Intractable; use loopy BP |
| ISI channel, memory $L$ | Sliding window | $L$ | $O(n S^{L+1})$ |
| MIMO detection ($H$ full) | Complete bipartite | Large | Intractable; use approximations |
| Kalman filter (state space) | Chain (linear Gaussian) | 1 | $O(n d^3)$, $d$ = state dimension |
| Compressed sensing (sparse $A$) | Sparse bipartite | Depends on sparsity | AMP / loopy BP |
Why This Matters: Turbo Codes as Coupled Factor Graphs
The turbo principle from Book 1 Chapter 16 is a direct factor graph statement: two convolutional code graphs share variable nodes through an interleaver. Decoding iterates message passing in each sub-graph, passing soft information across. The turbo principle is message passing on a factor graph with two coupled sub-graphs; the coupling creates long cycles whose length grows with the interleaver size.
Historical Note: Tanner, Wiberg, and the Unification
1981-2009. Michael Tanner (1981) introduced bipartite graphs to describe low-density codes, giving the first geometric view of decoding. Niclas Wiberg's 1996 PhD thesis formalized factor graphs and derived message-passing algorithms for them in full generality. Kschischang, Frey, and Loeliger (2001) consolidated the language in a landmark tutorial, after which factor graphs became the lingua franca for coding, signal processing, and machine learning. The modern generalization to probabilistic graphical models (Koller-Friedman 2009) extended the framework to massive-scale inference.
Factor Graphs Across Disciplines
The same language is used in:
- Coding theory: Tanner graphs, LDPC/turbo decoders
- Signal processing: Kalman filters, HMMs, ISI equalizers
- Machine learning: Bayesian networks, Markov random fields, variational inference
- Statistics: Bayesian hierarchical models
- Physics: Statistical mechanics on lattices, spin glasses

The connections are not metaphorical; they are literal. The same algorithm, rewritten in domain-appropriate notation, solves problems in each of these fields.
Quick Check
What is the tree-width of an HMM factor graph with $S$ states and arbitrary emission alphabets?
1
The HMM graph is a chain plus leaves; it is still a tree, so its tree-width is 1.
Key Takeaway
HMMs, LDPC codes, ISI channels, MIMO detection, and Kalman filters all admit natural factor graph representations. The graph topology (chain, trellis, bipartite, loopy) determines whether inference is tractable exactly or requires approximations. Message passing on the graph unifies the algorithms.