Examples of Factor Graphs

One Language, Many Models

The power of factor graphs is their universality. In this section we draw the factor graphs of six canonical problems from communications and signal processing. The reader will notice a striking pattern: the algorithms we developed separately for each (Viterbi for ISI, iterative decoding for LDPC, Kalman filtering for state space, MMSE for MIMO) are all the same algorithm applied to different factor graphs.

This is the payoff of the abstraction. Once we understand message passing on a factor graph (Chapter 18), we have understood all of these classical algorithms.

Definition:

Markov Chain as a Factor Graph

A Markov chain $x_1 \to x_2 \to \cdots \to x_n$ with transition probabilities $p(x_{t+1} \mid x_t)$ and initial distribution $p(x_1)$ has joint distribution
$$p(x_1, \ldots, x_n) = p(x_1) \prod_{t=1}^{n-1} p(x_{t+1} \mid x_t).$$
Its factor graph is a chain: variable nodes $x_1, \ldots, x_n$ alternating with factor nodes

  • $f_0(x_1) = p(x_1)$ (prior),
  • $f_t(x_t, x_{t+1}) = p(x_{t+1} \mid x_t)$ for $t = 1, \ldots, n-1$.

A chain is a tree, so exact marginals can be computed in $O(n|\mathcal{X}|^2)$ time. The two-pass algorithm is the forward-backward algorithm: message passing from left to right, then right to left.
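To make the two passes concrete, here is a minimal sketch in Python (function and variable names are illustrative, not from the text): a homogeneous chain with transition matrix P, where the product of the forward and backward messages at each variable gives its marginal.

```python
import numpy as np

def chain_marginals(p0, P, n):
    """Exact marginals p(x_t) of a Markov chain by two-pass message passing.

    p0 : (K,) initial distribution; P : (K, K) with P[i, j] = p(x_{t+1}=j | x_t=i).
    On a bare chain the backward messages are all-ones (no evidence), so the
    forward pass alone determines the marginals; both passes are kept here to
    mirror the forward-backward structure.
    """
    K = len(p0)
    fwd = np.zeros((n, K))            # fwd[t] = message arriving at x_t from the left
    fwd[0] = p0
    for t in range(1, n):             # left-to-right pass through the factors f_t
        fwd[t] = fwd[t - 1] @ P
    bwd = np.ones((n, K))             # right-to-left pass
    for t in range(n - 2, -1, -1):
        bwd[t] = P @ bwd[t + 1]
    marg = fwd * bwd                  # product of incoming messages at each variable
    return marg / marg.sum(axis=1, keepdims=True)

# Two-state chain: marginals drift toward the stationary distribution.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
print(chain_marginals(np.array([1.0, 0.0]), P, n=5))
```

Each pass is $n$ matrix-vector products of size $K = |\mathcal{X}|$, which is exactly the $O(n|\mathcal{X}|^2)$ cost quoted above.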

Definition:

Hidden Markov Model as a Factor Graph

An HMM has latent states $x_1, \ldots, x_n$ (a Markov chain) and observations $y_1, \ldots, y_n$ generated via emission distributions $p(y_t \mid x_t)$. Joint:
$$p(\mathbf{x}, \mathbf{y}) = p(x_1) \prod_{t=1}^{n-1} p(x_{t+1} \mid x_t) \prod_{t=1}^{n} p(y_t \mid x_t).$$
Factor graph: a horizontal chain of state variables and transition factors, with an emission factor (and observation variable) hanging off each state.

Clamping $\mathbf{y}$ to its observed values turns the emission factors into unary factors on the states. The graph is still a tree; forward-backward computes $p(x_t \mid \mathbf{y})$ exactly.

Factor Graph of a Hidden Markov Model

HMM factor graph: horizontal chain of state variables $x_t$ connected by transition factors $f_t$, with emission factors $g_t$ hanging down to observations $y_t$.
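As a sketch of how clamping enters the computation, here is the same two-pass routine with the emission likelihoods folded in as unary factors. A discrete emission matrix E is assumed for concreteness; all names are illustrative.

```python
import numpy as np

def hmm_posteriors(p0, P, E, ys):
    """Posterior marginals p(x_t | y) by forward-backward on the HMM chain.

    Clamped observations enter as unary factors ev[t, i] = p(y_t | x_t = i).
    p0 : (K,) prior; P : (K, K) transitions; E : (K, Y) emissions;
    ys : observed symbols (integer indices into the emission alphabet).
    Messages are unscaled; rescale each step on long sequences to avoid underflow.
    """
    n, K = len(ys), len(p0)
    ev = E[:, ys].T                        # ev[t] = unary factor on x_t
    alpha = np.zeros((n, K))               # forward messages, evidence folded in
    alpha[0] = p0 * ev[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ P) * ev[t]
    beta = np.ones((n, K))                 # backward messages
    for t in range(n - 2, -1, -1):
        beta[t] = P @ (ev[t + 1] * beta[t + 1])
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)
```

The only change from the bare chain is the elementwise multiplication by ev[t]: the emission factor at each state, exactly as the clamping argument above predicts.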

Example: LDPC Code Tanner Graph

Construct the factor graph of a rate-1/2 LDPC code with parity-check matrix
$$\mathbf{H} = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}.$$
Interpret the role of the variable and factor nodes.
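A minimal sketch of how the Tanner graph can be read directly off $\mathbf{H}$: each row is a check (factor) node, each column a bit (variable) node, and an edge exists wherever $H_{ij} = 1$.

```python
import numpy as np

# Parity-check matrix of the example code (3 checks, 6 code bits).
H = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [1, 0, 0, 0, 1, 1]])

# Tanner graph as adjacency lists: edges run only between the two node types.
check_nbrs = [np.flatnonzero(row).tolist() for row in H]    # per check: its bits
var_nbrs = [np.flatnonzero(col).tolist() for col in H.T]    # per bit: its checks

for i, nbrs in enumerate(check_nbrs):
    print(f"check c{i}: bits {nbrs}")    # c0: [0, 1, 2], c1: [2, 3, 4], c2: [0, 4, 5]
```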

Tanner Graph of a Random $(d_v, d_c)$-Regular LDPC Code

Interactive visualization: the Tanner graph of a random regular LDPC code, with sliders controlling how the degree distribution shapes the graph topology (default parameters: $n = 12$, $d_v = 3$, $d_c = 6$).

Definition:

Convolutional Code Trellis as a Factor Graph

A rate-$1/n$ convolutional code with memory $m$ has state $s_t \in \{0, 1\}^m$. The joint distribution over state and output sequences factorizes as
$$p(\mathbf{s}, \mathbf{c}) \propto \prod_{t=1}^{T} \mathbb{1}\{(s_{t-1}, c_t, s_t) \text{ consistent with the encoder}\}.$$
Factor graph: a chain of state variables $s_0, s_1, \ldots, s_T$ and output variables $c_1, \ldots, c_T$, connected by trellis factors $f_t(s_{t-1}, c_t, s_t) = \mathbb{1}\{\text{trellis transition valid}\}$.

The Viterbi algorithm is max-product message passing on this factor graph. The BCJR algorithm is sum-product message passing on the same graph.
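A sketch of the max-product view, assuming the trellis has already been compiled into per-step log-factor tables (branch metrics); all names are illustrative. Replacing the max/argmax pair with a log-sum-exp would give the forward pass of BCJR.

```python
import numpy as np

def viterbi(log_f, S):
    """Max-product (log domain: max-sum) message passing on the trellis chain.

    log_f : (T, S, S) array; log_f[t, i, j] = log factor for the transition
            s_t = i -> s_{t+1} = j (the branch metric: log-likelihood of the
            received symbols given the outputs on that trellis edge, or -inf
            if the transition is invalid).
    Returns the MAP state sequence s_0, ..., s_T, assuming s_0 = 0.
    """
    T = log_f.shape[0]
    metric = np.full(S, -np.inf)
    metric[0] = 0.0                           # encoder starts in the all-zero state
    back = np.zeros((T, S), dtype=int)        # survivor pointers
    for t in range(T):                        # forward max-product pass
        cand = metric[:, None] + log_f[t]     # cand[i, j]: reach state j via i
        back[t] = cand.argmax(axis=0)
        metric = cand.max(axis=0)
    path = [int(metric.argmax())]             # backtrack the surviving path
    for t in range(T - 1, -1, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```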

Example: MIMO Detection as a Factor Graph

Construct the factor graph for the detection problem $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}$ with $x_n \in \{-1, +1\}$ (BPSK), $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$, and a uniform prior on $\mathbf{x}$. What is the structure when $\mathbf{H}$ is full (all entries nonzero)?
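Without working the exercise, one consequence of a full $\mathbf{H}$ is easy to make concrete: the Gaussian likelihood couples all of $\mathbf{x}$, so exact MAP detection must enumerate every hypothesis. A brute-force sketch (illustrative, feasible only for small $N$):

```python
import numpy as np
from itertools import product

def map_detect_bpsk(y, H):
    """Exact MAP detection for y = Hx + w, x in {-1,+1}^N, uniform prior.

    With white Gaussian noise and a uniform prior, MAP reduces to minimizing
    ||y - Hx||^2 over all 2^N BPSK vectors. The exponential sweep over
    hypotheses is why a full H forces approximate message passing at scale.
    """
    N = H.shape[1]
    best, best_cost = None, np.inf
    for bits in product((-1.0, 1.0), repeat=N):   # all 2^N hypotheses
        x = np.array(bits)
        cost = np.sum((y - H @ x) ** 2)
        if cost < best_cost:
            best, best_cost = x, cost
    return best
```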

Definition:

ISI Channel as a Factor Graph

An ISI channel $y_t = \sum_{\ell=0}^{L} h_\ell x_{t-\ell} + w_t$ corresponds to a factor graph in which each observation $y_t$ imposes a factor
$$f_t(x_t, x_{t-1}, \ldots, x_{t-L}) \propto \mathcal{N}\!\left(y_t; \sum_{\ell=0}^{L} h_\ell x_{t-\ell}, \sigma^2\right)$$
connecting $L+1$ consecutive transmitted symbols. Prior factors $g_t(x_t)$ enforce constellation constraints.

The tree-width of this graph is $L$. When $L$ is small (say, 2-5), exact BCJR detection is feasible ($|\mathcal{X}|^L$ states). For larger $L$, turbo equalization with soft approximations takes over.
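To see where the exponential cost comes from, here is a sketch that tabulates a single sliding-window factor for BPSK (names are illustrative): each observation contributes a table with $|\mathcal{X}|^{L+1}$ entries.

```python
import numpy as np
from itertools import product

def isi_factor_table(y_t, h, sigma2):
    """Tabulate one factor f_t(x_t, ..., x_{t-L}) for an ISI observation (BPSK).

    h : channel taps h_0, ..., h_L. The table has 2^(L+1) entries, one per
    window configuration -- the |X|^(L+1) growth that limits exact detection.
    """
    L = len(h) - 1
    table = {}
    for window in product((-1.0, 1.0), repeat=L + 1):   # (x_t, x_{t-1}, ..., x_{t-L})
        mean = sum(h_l * x for h_l, x in zip(h, window))
        table[window] = np.exp(-(y_t - mean) ** 2 / (2 * sigma2))   # unnormalized Gaussian
    return table

# Memory L = 2 channel: 8 window configurations per observation.
print(len(isi_factor_table(0.3, h=[1.0, 0.5, 0.2], sigma2=0.1)))   # -> 8
```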

Canonical Factor Graph Structures

| Model | Factor graph topology | Tree-width | Exact inference cost |
| --- | --- | --- | --- |
| Markov chain / HMM | Chain | 1 | $O(n \lvert\mathcal{X}\rvert^2)$ |
| Convolutional code | Trellis (chain of states) | $m$ (memory) | $O(n \cdot 2^m)$ |
| LDPC code | Sparse bipartite (Tanner graph) | Large (loopy) | Intractable; loopy BP |
| ISI channel, memory $L$ | Sliding window | $L$ | $O(n \lvert\mathcal{X}\rvert^L)$ |
| MIMO detection ($\mathbf{H}$ full) | Complete bipartite $K_{M,N}$ | $\min(M, N)$ | Intractable; approximations |
| Kalman filter (state space) | Chain (linear Gaussian) | 1 | $O(n \cdot d^3)$ ($d$ = state dim) |
| Compressed sensing (sparse $\mathbf{A}$) | Sparse bipartite | Depends on sparsity | AMP / loopy BP |

Why This Matters: Turbo Codes as Coupled Factor Graphs

The turbo principle from Book 1 Chapter 16 is a direct factor graph statement: two convolutional code graphs share variable nodes through an interleaver. Decoding iterates message passing in each sub-graph, passing soft information across. In other words, turbo decoding is message passing on a factor graph with two coupled sub-graphs; the coupling creates cycles of length equal to twice the interleaver size.
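A schematic of that coupled iteration, with the constituent soft-in/soft-out decoders left abstract (siso1 and siso2 are hypothetical placeholders, not a library API): each half-iteration sends extrinsic LLRs across the interleaver into the other sub-graph.

```python
import numpy as np

def turbo_decode(siso1, siso2, perm, chan_llr, n_iters=10):
    """Schematic turbo loop: message passing between two coupled sub-graphs.

    siso1, siso2 : hypothetical soft-in/soft-out decoders for the constituent
                   codes, mapping (channel LLRs, a-priori LLRs) -> extrinsic LLRs.
    perm         : the interleaver permutation that couples the two graphs.
    """
    inv = np.argsort(perm)                  # inverse permutation (deinterleaver)
    ext1 = ext2 = np.zeros(len(perm))       # no prior information at the start
    for _ in range(n_iters):
        # Sub-graph 1: decoder 2's extrinsic output becomes a-priori input.
        ext1 = siso1(chan_llr, ext2)
        # Cross the interleaver into sub-graph 2, then map the result back.
        ext2 = siso2(chan_llr[perm], ext1[perm])[inv]
    return chan_llr + ext1 + ext2           # combined a-posteriori LLRs
```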

Historical Note: Tanner, Wiberg, and the Unification

1981-2009

Michael Tanner (1981) introduced bipartite graphs to describe low-density codes, giving the first geometric view of decoding. Niclas Wiberg's 1996 PhD thesis formalized factor graphs and derived message-passing algorithms for them in full generality. Kschischang, Frey, and Loeliger (2001) consolidated the language in a landmark tutorial, after which factor graphs became the lingua franca for coding, signal processing, and machine learning. The modern generalization to probabilistic graphical models (Koller-Friedman 2009) extended the framework to massive-scale inference.

Factor Graphs Across Disciplines

The same language is used in:

  • Coding theory: Tanner graphs, LDPC/turbo decoders
  • Signal processing: Kalman filters, HMMs, ISI equalizers
  • Machine learning: Bayesian networks, Markov random fields, variational inference
  • Statistics: Bayesian hierarchical models
  • Physics: Statistical mechanics on lattices, spin glasses

The connections are not metaphorical; they are literal. The same algorithm, rewritten in domain-appropriate notation, solves problems in each of these fields.

Quick Check

What is the tree-width of an HMM factor graph with $n$ states and arbitrary emission alphabets?

  • 1
  • $|\mathcal{X}|$
  • $n$
  • $\log n$

Key Takeaway

HMMs, LDPC codes, ISI channels, MIMO detection, and Kalman filters all admit natural factor graph representations. The graph topology (chain, trellis, bipartite, loopy) determines whether inference is tractable exactly or requires approximations. Message passing on the graph unifies the algorithms.