Exercises
ch17-ex01
EasyDraw the factor graph of the joint distribution $p(x_1,\dots,x_4) \propto f_a(x_1)\,f_b(x_1,x_2)\,f_c(x_2,x_3)\,f_d(x_3,x_4)$. Is it a tree?
List the neighborhood of each factor.
Count edges vs. nodes.
Neighborhoods
$N(f_a) = \{x_1\}$, $N(f_b) = \{x_1, x_2\}$, $N(f_c) = \{x_2, x_3\}$, $N(f_d) = \{x_3, x_4\}$.
Tree check
4 variable nodes + 4 factor nodes = 8 nodes. Edges: $1 + 2 + 2 + 2 = 7 = 8 - 1$, so the graph is connected with $|E| = |V| - 1$. This is a tree (in fact a chain with a leaf factor at node 1).
ch17-ex02
EasyThe joint distribution $p(x_1, x_2, x_3) = f(x_1, x_2, x_3)$ (a single factor touching all three variables) has what factor graph structure?
One factor, three variables.
Draw
A single factor node connected to all three variable nodes. This is a star with the factor at the center. It is a tree (no cycles).
Cost
Inference cost is $O(|\mathcal{X}|^3)$, the size of the factor's own table. Factor graphs do not make dense factors cheap; they just make the structure explicit.
ch17-ex03
MediumConsider a Markov chain with binary states and all transition probabilities equal to $1/2$. Run sum-product message passing to compute $p(x_2 \mid x_0, x_4)$. Initial distribution at state 0 (clamped).
Clamp the observations at $x_0$ and $x_4$.
Forward messages from $x_0$, backward messages from $x_4$.
Transition kernel
$T(x' \mid x) = 1/2$ for states $x, x' \in \{0, 1\}$ (a $2 \times 2$ stochastic matrix with all entries equal must have entries $1/2$).
Forward message at $x_2$
Starting from $x_0 = 0$: $\mu_{\to}(x_1) = (1/2, 1/2)$, hence $\mu_{\to}(x_2) = (1/2, 1/2)$.
Backward message at $x_2$
Backward from the clamped $x_4$: compute $\mu_{\leftarrow}(x_3) = (1/2, 1/2)$ (unnormalized). Then $\mu_{\leftarrow}(x_2) = (1/2, 1/2)$.
Combine and normalize
$\mu_{\to}(x_2)\,\mu_{\leftarrow}(x_2) = (1/4, 1/4)$, so $Z = 1/2$. $p(x_2 = 0 \mid x_0, x_4) = p(x_2 = 1 \mid x_0, x_4) = 1/2$: with a uniform kernel the observations carry no information about $x_2$.
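The forward-backward computation above is easy to check in a few lines; a minimal sketch, assuming the uniform kernel $T(x' \mid x) = 1/2$ with $x_0$ clamped to state 0:

```python
# Sum-product on the binary chain x0 - x1 - ... - x4 with the
# all-equal (uniform) transition kernel, x0 clamped to state 0 and
# x4 clamped to an arbitrary observed value.
T = [[0.5, 0.5],
     [0.5, 0.5]]          # T[x][xp] = p(x' = xp | x)

def forward(msg):
    # mu_out(x') = sum_x mu_in(x) * T(x -> x')
    return [sum(msg[x] * T[x][xp] for x in range(2)) for xp in range(2)]

def backward(msg):
    # mu_out(x) = sum_x' T(x -> x') * mu_in(x')
    return [sum(T[x][xp] * msg[xp] for xp in range(2)) for x in range(2)]

fwd = [1.0, 0.0]          # x0 clamped to state 0
for _ in range(2):        # two hops: x0 -> x1 -> x2
    fwd = forward(fwd)

bwd = [1.0, 0.0]          # x4 clamped (its value does not matter here)
for _ in range(2):        # two hops back: x4 -> x3 -> x2
    bwd = backward(bwd)

unnorm = [f * b for f, b in zip(fwd, bwd)]
Z = sum(unnorm)
posterior = [u / Z for u in unnorm]
print(posterior)          # [0.5, 0.5]
```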
ch17-ex04
MediumShow that a 2D grid factor graph ($n \times n$ variables with pairwise neighbor factors) has tree-width at least $n$. Why does this make exact inference intractable?
Tree-width equals the minimum over elimination orderings of the largest clique created, minus one.
Any elimination leaves a 'frontier' of at least $n$ variables.
Frontier argument
Regardless of elimination order, at some point a horizontal or vertical line of $n$ variables must be active (it connects the upper and lower halves of the grid). Eliminating any variable on that line creates a clique among the others.
Exact cost
Exact inference cost is $O(n^2 \cdot 2^n)$ for binary variables. For $n = 10$: $\approx 10^5$. For $n = 20$: $\approx 4 \times 10^8$. For $n = 30$: $\approx 10^{12}$. Exact inference on a $30 \times 30$ Ising grid is already at the edge of feasibility.
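These numbers are quick to reproduce; a sketch assuming the $O(n^2 \cdot 2^n)$ bound with binary variables (the function name is ours):

```python
# Cost bound for exact inference on an n x n binary grid: about n^2
# variables to eliminate, each elimination touching a frontier of
# roughly n binary variables (a table with 2^n entries).
def exact_cost(n: int) -> int:
    return n * n * 2 ** n

for n in (10, 20, 30):
    print(n, f"{exact_cost(n):.2e}")
```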
ch17-ex05
MediumFor the Tanner graph in Example ch17-ex-ldpc-tanner, check for 4-cycles. A 4-cycle in a Tanner graph is a pair of variable nodes sharing two common check nodes. Identify any.
Two rows of $H$ whose 1's share two common columns form a 4-cycle.
List the 1s per row
Row 1: columns . Row 2: . Row 3: .
Find shared pairs
Rows 1&2: share column 3 only. Rows 1&3: share column 1 only. Rows 2&3: share column 5 only. No two rows share 2 or more columns.
Conclusion
No 4-cycles. Girth $\geq 6$. This matrix would be considered acceptable for loopy BP decoding (absent longer cycles that degrade performance).
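The row-pair check is easy to automate. Since the matrix from Example ch17-ex-ldpc-tanner is not reproduced here, the sketch below uses a hypothetical $H$ with the same sharing pattern (rows 1&2 share only column 3, rows 1&3 only column 1, rows 2&3 only column 5):

```python
from itertools import combinations

# Hypothetical parity-check matrix with the same pairwise-sharing
# pattern as in the solution above (not the actual example matrix).
H = [
    [1, 0, 1, 0, 0, 0],   # check 1: variables 1, 3
    [0, 0, 1, 0, 1, 0],   # check 2: variables 3, 5
    [1, 0, 0, 0, 1, 0],   # check 3: variables 1, 5
]

def has_4cycle(H):
    """A 4-cycle exists iff two rows share 1's in >= 2 columns."""
    supports = [{j for j, v in enumerate(row) if v} for row in H]
    return any(len(a & b) >= 2 for a, b in combinations(supports, 2))

print(has_4cycle(H))   # False: no 4-cycles, girth >= 6
```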
ch17-ex06
MediumDraw the factor graph of the ISI channel $y_n = h_0 x_n + h_1 x_{n-1} + w_n$, $n = 1, \dots, N$, with clamped observations $y_n$. Compute the tree-width.
Each factor $f_n$ touches $(x_{n-1}, x_n)$.
Factor graph
Variable nodes $x_1, \dots, x_N$ (treating $x_0$ as a boundary). Factors: $f_n(x_{n-1}, x_n) = p(y_n \mid x_{n-1}, x_n)$ with $y_n$ clamped. Prior factors on each variable.
Topology and tree-width
Chain structure, tree-width 1 (pairwise factors on a chain). BCJR runs in $O(N\,|\mathcal{X}|^2)$.
ch17-ex07
HardProve that the sum-product algorithm on a tree computes the exact marginal up to normalization.
Induct on the number of nodes.
Use the factorization-by-subtrees structure.
Induction hypothesis
For any tree with fewer than $n$ nodes, sum-product computes exact marginals. Base case: a single variable node is trivial.
Root at $i$
Rooting the tree at variable node $i$, each neighboring factor $a \in N(i)$ is the root of a subtree $T_a$ containing a variable set $V_a$; distinct subtrees share only $x_i$.
Factorization
$p(x_i) \propto \sum_{x \setminus x_i} \prod_a f_a(x_a)$. Since the subtrees share only $x_i$, the sum factorizes: $p(x_i) \propto \prod_{a \in N(i)} \big[ \sum_{x_{V_a \setminus \{i\}}} \prod_{b \in T_a} f_b(x_b) \big] = \prod_{a \in N(i)} \mu_{a \to i}(x_i)$, by definition of the message. Each inner sum is exact by the induction hypothesis applied to the smaller subtree $T_a$.
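The factorization-by-subtrees argument can be sanity-checked numerically; a sketch on a tiny chain-shaped tree with made-up factor tables, comparing the message product at $x_2$ against brute-force marginalization:

```python
from itertools import product

# Tiny tree factor graph: x1 - f_a - x2 - f_b - x3, binary variables,
# arbitrary made-up factor tables.
f_a = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 0.3}
f_b = {(0, 0): 0.7, (0, 1): 1.2, (1, 0): 0.4, (1, 1): 2.5}

# Messages into x2: each neighboring factor sums out its own subtree.
mu_a = [sum(f_a[(x1, x2)] for x1 in (0, 1)) for x2 in (0, 1)]
mu_b = [sum(f_b[(x2, x3)] for x3 in (0, 1)) for x2 in (0, 1)]
bp = [a * b for a, b in zip(mu_a, mu_b)]
bp = [v / sum(bp) for v in bp]

# Brute force: sum the full joint over x1 and x3.
brute = [sum(f_a[(x1, x2)] * f_b[(x2, x3)]
             for x1, x3 in product((0, 1), repeat=2)) for x2 in (0, 1)]
brute = [v / sum(brute) for v in brute]

assert all(abs(p - q) < 1e-12 for p, q in zip(bp, brute))
print(bp)
```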
ch17-ex08
EasyA factor graph has 12 variable nodes, 8 factor nodes, and 24 edges. Is it a tree? If not, how many independent cycles does it have?
Tree: $|E| = |V| - 1$.
Check
$|V| = 12 + 8 = 20$. $|E| = 24 \neq 19$. Not a tree.
Cyclomatic number
$|E| - |V| + 1 = 24 - 20 + 1 = 5$. The graph has 5 independent cycles (the dimension of its cycle space).
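As a one-line check (the function name is ours):

```python
def independent_cycles(num_vars: int, num_factors: int, num_edges: int) -> int:
    """Cyclomatic number |E| - |V| + 1 of a connected factor graph."""
    return num_edges - (num_vars + num_factors) + 1

print(independent_cycles(12, 8, 24))   # 5
```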
ch17-ex09
MediumConvert the Bayesian network to a factor graph.
Each conditional becomes a factor.
Identify factors
.
Neighborhoods
.
Cycle?
The path closes a cycle. Not a tree. This is the classical "V-structure" (common child) causing a loop in the factor graph.
ch17-ex10
MediumShow that loopy BP on a tree is equivalent to one forward pass plus one backward pass β iterating further does nothing.
After both passes, the messages are at their fixed-point values.
Pass structure
Leaves have trivial incoming messages; their outgoing messages depend only on themselves. After the first pass (outward from leaves to root), all messages from leaves toward root are set.
Backward pass
After root-to-leaves pass, all messages from root toward leaves are set. Now every directed edge has a message value.
Stability
The next iteration recomputes each message from inputs that are unchanged, so iterating further changes nothing. A fixed point is reached in two passes.
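A small demonstration of the two-pass fixed point; this sketch runs flooding updates on the chain $x_1 - x_2 - x_3$ with made-up pairwise potentials and checks that a third sweep changes nothing:

```python
from math import prod

# On a tree, BP reaches a fixed point after one outward and one inward
# sweep. Chain x1 - x2 - x3, binary variables, made-up potentials;
# two flooding sweeps cover the diameter.
psi = {(1, 2): [[1.0, 0.2], [0.4, 2.0]],
       (2, 3): [[0.5, 1.5], [2.0, 0.1]]}
edges = [(1, 2), (2, 1), (2, 3), (3, 2)]
nbrs = {1: [2], 2: [1, 3], 3: [2]}

def pot(i, j, xi, xj):
    return psi[(i, j)][xi][xj] if (i, j) in psi else psi[(j, i)][xj][xi]

def sweep(m):
    """One flooding iteration: recompute every directed message."""
    new = {}
    for i, j in edges:
        raw = [sum(pot(i, j, xi, xj)
                   * prod(m[(k, i)][xi] for k in nbrs[i] if k != j)
                   for xi in (0, 1))
               for xj in (0, 1)]
        z = sum(raw)
        new[(i, j)] = [v / z for v in raw]
    return new

m = {e: [0.5, 0.5] for e in edges}
m2 = sweep(sweep(m))        # two passes: forward + backward
assert sweep(m2) == m2      # a third pass changes nothing
print("fixed point after two sweeps")
```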
ch17-ex11
MediumArgue why loopy BP on a graph with girth $g$ is equivalent to exact inference on a tree for the first $\sim g/2$ iterations.
Information travels one hop per iteration.
A cycle has length at least $g$, so influence spreading one hop per iteration wraps around a cycle only after about $g/2$ iterations.
Depth of influence
After $t$ iterations, the message into node $i$ depends only on observations within a $t$-hop neighborhood of $i$. If $t < g/2$, this neighborhood is a tree (no cycle fits inside it).
Tree interpretation
The computation is identical to sum-product on the $t$-hop unrolled (computation) tree around $i$. For $t < g/2$ iterations, loopy BP = exact BP on a finite tree.
Consequence
Loopy BP makes its first mistake only after $\sim g/2$ iterations. Large girth delays the onset of approximation errors.
ch17-ex12
HardFor a $(d_v, d_c)$-regular Tanner graph with girth $g$, prove that the depth-$2\ell$ neighborhood of any variable node (with $4\ell < g$) is a tree with $d_v (d_c - 1) \big[ (d_v - 1)(d_c - 1) \big]^{\ell - 1}$ variable nodes at depth $2\ell$.
Count children at each level of a branching process.
Level 0: origin
1 variable node at depth 0.
Level 1: neighbors
$d_v$ check nodes are adjacent to the origin. Each check has $d_c - 1$ other variable neighbors: $d_v (d_c - 1)$ variables at depth 2.
General level
Each variable at depth $2\ell$ has $d_v - 1$ new check neighbors (one check is its parent), each leading to $d_c - 1$ new variable nodes. So each level multiplies the count by $(d_v - 1)(d_c - 1)$, and by induction the number of variables at depth $2\ell$ is $d_v (d_c - 1) \big[ (d_v - 1)(d_c - 1) \big]^{\ell - 1}$.
Tree property
Girth $g > 4\ell$ ensures no two distinct paths of length $\leq 2\ell$ from the origin meet (they would close a cycle of length $\leq 4\ell$), so the enumeration is exact (no double-counting).
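The level counts can be tabulated directly; a sketch (the function name is ours), e.g. for a $(3, 6)$-regular graph:

```python
# Number of variable nodes at depth 2*l in the computation tree of a
# (dv, dc)-regular Tanner graph (valid while 4*l < girth).
def vars_at_depth(dv: int, dc: int, l: int) -> int:
    if l == 0:
        return 1
    return dv * (dc - 1) * ((dv - 1) * (dc - 1)) ** (l - 1)

# A (3,6)-regular graph: counts grow by a factor (dv-1)(dc-1) = 10.
print([vars_at_depth(3, 6, l) for l in range(4)])   # [1, 15, 150, 1500]
```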
ch17-ex13
MediumGive an example of a factor graph where loopy BP has multiple fixed points. What does this imply about convergence behavior?
Strong coupling can create bistability.
Consider a binary Ising ring with negative coupling.
Construction
A ring of $n$ binary variables with pairwise factors $\psi(x_i, x_{i+1}) = e^{J x_i x_{i+1}}$, $x_i \in \{-1, +1\}$ (antiferromagnetic, $J < 0$). For $n$ odd and $|J|$ large, the ring is frustrated and BP has multiple fixed points.
Implication
With multiple fixed points, different initializations lead to different outputs. Convergence is not guaranteed to the "best" fixed point; damping and random restarts may be needed.
Diagnostic
In practice, check BP by running from multiple initializations and comparing outputs. Divergence between runs is a red flag.
ch17-ex14
EasyThe Bethe free energy equals the exact Gibbs free energy on which class of factor graphs?
On trees.
Answer
On trees. The Bethe approximation treats factor marginals as if they were independent except for the consistency constraints. On a tree, the global distribution literally factorizes in this way. On loopy graphs, there is residual dependence not captured.
ch17-ex15
HardA code with parity-check matrix $H$ has 4 variable nodes and 2 check nodes, each check of degree 3 (so $H$ is $2 \times 4$ with three 1s per row). What is the maximum possible girth? Construct an $H$ achieving it.
Short cycles: girth 4 requires two variables in both checks.
Girth 6 forbids such sharing.
Girth 4 requires
Two rows sharing $\geq 2$ column indices. Each row's support is a 3-subset of the 4 columns, and any two 3-subsets of a 4-set intersect in at least $3 + 3 - 4 = 2$ elements. Hence any two rows always share $\geq 2$ columns, and the girth is exactly 4 for any such code.
Conclusion
Maximum girth is 4 for this size; any $H$ with three 1s per row achieves it (e.g. rows $1110$ and $0111$). To avoid 4-cycles we would need either more variable nodes ($n > 4$) or smaller check degree.
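The "always" in the argument can be verified exhaustively over row supports:

```python
from itertools import combinations

# Every 2 x 4 parity-check matrix with three 1s per row has two rows
# sharing >= 2 columns, so a 4-cycle is unavoidable.
supports = list(combinations(range(4), 3))   # all 3-subsets of 4 columns
worst = min(len(set(a) & set(b)) for a in supports for b in supports)
print(worst)   # 2 -> every pair of rows shares at least 2 columns
```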
ch17-ex16
MediumConvert the posterior of the Kalman filter problem to a factor graph, and identify the message-passing cost per time step.
State transition and observation factors.
Factor graph
Variable nodes: $x_0, \dots, x_T$ (continuous). Factors:
- transitions $f_t(x_{t-1}, x_t) = p(x_t \mid x_{t-1})$.
- observations $g_t(x_t) = p(y_t \mid x_t)$, with $y_t$ clamped.
Structure
Chain of state variables with transition factors and unary observation factors. Tree-width 1.
Per-step cost
Each message is a Gaussian (mean and covariance of dimension $d$). A message update involves matrix multiplies and inversions: $O(d^3)$ per step. Total $O(T d^3)$, the Kalman filter complexity.
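A scalar ($d = 1$) sketch of one forward Gaussian message, predict plus update; the parameters A, Q, C, R and the observations are made up for illustration:

```python
# One forward (predict + update) Gaussian message on the chain, scalar
# case d = 1. Model parameters are illustrative only.
A, Q = 0.9, 0.5     # transition x_t = A x_{t-1} + N(0, Q)
C, R = 1.0, 1.0     # observation y_t = C x_t + N(0, R)

def kalman_step(mean, var, y):
    # Predict: push the Gaussian message through the transition factor.
    m_pred, v_pred = A * mean, A * A * var + Q
    # Update: multiply in the clamped observation factor, renormalize.
    k = v_pred * C / (C * C * v_pred + R)       # Kalman gain
    mean_new = m_pred + k * (y - C * m_pred)
    var_new = (1 - k * C) * v_pred
    return mean_new, var_new

m, v = 0.0, 1.0                  # prior message on x_0
for y in [0.8, 1.1, 0.9]:
    m, v = kalman_step(m, v, y)
print(round(m, 3), round(v, 3))
```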
ch17-ex17
MediumWhat is the relation between a factor graph with only pairwise factors and a Markov random field?
Both represent pairwise interactions.
Direct correspondence
A pairwise factor graph converts to a Markov random field with one edge per pairwise factor. Conversely, every pairwise MRF has a unique factor graph (one factor per edge). The graphs are isomorphic after collapsing factor nodes.
Why factor graphs are still useful
Pairwise MRFs are a restricted family. Real-world models with ternary or higher-arity factors (e.g., parity checks) are more naturally expressed as factor graphs.
ch17-ex18
HardShow that flooding loopy BP and serial loopy BP have the same fixed points, but may have different convergence rates and stability.
Fixed points are characterized algebraically, not by update order.
Fixed points
A fixed point satisfies $m^* = F(m^*)$ for the message update operator $F$. This condition is independent of update order.
Dynamics differ
Flooding: all messages update simultaneously, $m^{(t+1)} = F(m^{(t)})$. Serial: messages update one at a time in some order, each update seeing the freshest values. The Jacobian of the effective update map differs: serial can converge when flooding oscillates, because it uses freshly updated messages.
Practical takeaway
Serial scheduling typically converges faster and more reliably. The standard LDPC decoder uses a layered (block serial) schedule.
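The shared-fixed-point claim can be illustrated numerically; a sketch on a binary 3-cycle with weak coupling (the spin encoding, coupling J, and fields h are made up), running both schedules to convergence and comparing messages:

```python
from math import exp, prod

# Flooding vs. serial BP on a binary 3-cycle (spins +-1): with weak
# coupling both schedules converge to the same fixed point.
J, h = 0.3, [0.2, -0.1, 0.05]
nodes = [0, 1, 2]
edges = [(i, j) for i in nodes for j in nodes if i != j]
spin = [-1.0, 1.0]

def new_msg(m, i, j):
    # m_{i->j}(x_j) = sum_{x_i} phi_i psi_ij * prod of other messages.
    raw = [sum(exp(h[i] * spin[a] + J * spin[a] * spin[b])
               * prod(m[(k, i)][a] for k in nodes if k not in (i, j))
               for a in range(2))
           for b in range(2)]
    z = sum(raw)
    return [v / z for v in raw]

def run(serial, iters=500):
    m = {e: [0.5, 0.5] for e in edges}
    for _ in range(iters):
        if serial:
            for e in edges:                            # in-place, fresh
                m[e] = new_msg(m, *e)
        else:
            m = {e: new_msg(m, *e) for e in edges}     # all at once
    return m

mf, ms = run(False), run(True)
gap = max(abs(a - b) for e in edges for a, b in zip(mf[e], ms[e]))
print(gap)
```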
ch17-ex19
EasyName three problem domains where factor graphs are the standard framework for inference.
Coding, signal processing, statistics/ML.
Examples
(1) Channel decoding (LDPC, turbo): Tanner graph + loopy BP. (2) Signal processing (Kalman filter, HMM): chain graphs + forward-backward. (3) Machine learning (Bayesian networks, MRFs): message passing, variational inference, sampling. Also: robotics (SLAM), statistical physics (Ising models), compressed sensing.
ch17-ex20
ChallengeConsider a cluster graph obtained by grouping variables in a factor graph into overlapping clusters. Explain how generalized belief propagation (Yedidia et al.) uses a region graph to improve over loopy BP.
Higher-order corrections via inclusion-exclusion.
Cluster = super-node; region = subgraph.
Region graph
Choose overlapping regions (clusters) that cover each factor. Form a region graph: larger regions at top, smaller (intersections) below.
Generalized BP
Pass messages between regions. The resulting fixed points minimize a Kikuchi free energy, a generalization of Bethe that accounts for multi-node correlations.
Tradeoff
Larger regions give a better approximation, but at exponential cost in region size. Generalized BP bridges loopy BP (regions = factors) and exact inference (regions = whole graph).