Multi-Source Multi-Sink Networks

Beyond Single-Source Multicast

The network coding theorem gives a clean, single-letter characterization of the multicast capacity for a single source. But real networks have multiple sources sending different information to different destinations. What is the capacity region of a general multi-source multi-sink network?

The short answer: we do not know, in general. The multi-source problem is dramatically harder than the single-source case, and the characterization of the rate region remains one of the major open problems in network information theory. In this section, we survey what is known and what remains open.

Definition:

Multi-Source Multi-Sink Network Coding Problem

A multi-source multi-sink network coding problem on a directed acyclic graph G = (V, E) consists of:

  • Multiple source nodes s_1, …, s_L, each generating an independent message W_l at rate R_l
  • Multiple sink nodes t_1, …, t_K, where sink t_k demands a subset D_k ⊆ {W_1, …, W_L} of the messages

The rate region is the set of rate tuples (R_1, …, R_L) for which all sinks can reliably recover their demanded messages using network coding at intermediate nodes.

Special cases:

  • Multiple unicast: each source has a dedicated sink (L = K, D_k = {W_k})
  • Single-source multicast: L = 1, D_k = {W_1} for all k (solved by ACLY)
  • Multiple multicast: each source multicasts to all sinks

Theorem: Cut-Set Bound for Multi-Source Networks

For a multi-source multi-sink network, the rate region is contained in the set of rate tuples (R_1, …, R_L) satisfying: for every cut (S, S^c) separating a subset of sources from the sinks that demand their messages,

∑_{l : s_l ∈ S, ∃k : t_k ∈ S^c, W_l ∈ D_k} R_l ≤ cap(S, S^c).

This is the network generalization of the max-flow min-cut bound.

The cut-set bound says that the total rate of information that needs to cross any cut cannot exceed the cut capacity. For single-source multicast, this bound is tight (the ACLY theorem achieves it). For multi-source problems, the cut-set bound is generally not tight β€” there are examples where the achievable rate region is strictly smaller than the cut-set bound.

Example: Two-Unicast Network and the Cut-Set Bound

Consider a network with two sources s_1, s_2 and two sinks t_1, t_2, where t_k wants W_k only. The network is:

s_1 → a (cap 1), s_2 → a (cap 1), a → t_1 (cap 1), a → t_2 (cap 1).

The cut-set bound allows R_1 + R_2 ≤ 2 (cut at the sources) and R_1 ≤ 1, R_2 ≤ 1 (individual cuts). Can we achieve R_1 = R_2 = 1? In this network, yes: node a receives both messages and simply routes W_1 to t_1 and W_2 to t_2, so the cut-set bound is tight here. Networks where the achievable region falls strictly inside the cut-set bound do exist, but they are considerably more intricate than this toy example.
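The cut analysis above can be checked by brute force on this four-edge network: enumerate all vertex subsets S, and minimize the crossing capacity over the cuts relevant to each bound. A minimal sketch (node and variable names are illustrative, chosen to match the example):

```python
from itertools import combinations

# Toy two-unicast network from the example: unit-capacity edges.
edges = {("s1", "a"): 1, ("s2", "a"): 1, ("a", "t1"): 1, ("a", "t2"): 1}
nodes = ["s1", "s2", "a", "t1", "t2"]

def cut_capacity(S):
    """Total capacity of edges leaving vertex set S."""
    return sum(c for (u, v), c in edges.items() if u in S and v not in S)

all_subsets = [set(S) for r in range(1, len(nodes))
               for S in combinations(nodes, r)]

# Bound on R1: minimize over cuts separating s1 from t1.
r1_bound = min(cut_capacity(S) for S in all_subsets
               if "s1" in S and "t1" not in S)

# Bound on R1 + R2: cuts separating both sources from both sinks.
sum_bound = min(cut_capacity(S) for S in all_subsets
                if {"s1", "s2"} <= S and not {"t1", "t2"} & S)

print(r1_bound, sum_bound)  # 1 2
```

Every cut separating s1 from t1 must cross either s1→a or a→t1, giving the individual bound of 1; the cut at the sources (or at node a's outgoing edges) gives the sum bound of 2.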

Open Problems in Network Coding

Several fundamental questions remain unresolved:

  1. Multiple unicast capacity: The capacity region of the two-unicast problem (two source-sink pairs) over a general network is unknown. This is closely related to the interference channel capacity problem (Chapter 17).

  2. Insufficiency of linear codes: For multiple unicast, linear codes over any finite field may not achieve the capacity. The matroid-based construction of Dougherty, Freiling, and Zeger (2005) gave the first example where nonlinear codes strictly outperform linear codes.

  3. Non-Shannon inequalities: The cut-set bound uses only Shannon-type information inequalities (nonnegativity of entropy and mutual information, submodularity). There exist non-Shannon inequalities (Zhang–Yeung, 1998) that provide tighter outer bounds for some networks. Whether finitely many such inequalities suffice to characterize network capacity in general remains an active research question.

  4. Index coding: The index coding problem (a special case of network coding where there is a single bottleneck link and each receiver has side information) is equivalent to the general network coding problem. Despite its seeming simplicity, the index coding capacity is unknown in general.

Historical Note: The Evolution of Network Coding (2000–2020)


The field of network coding evolved rapidly after the ACLY paper of 2000:

  • 2003: Li, Yeung, and Cai proved that linear codes suffice for single-source multicast, reducing the problem to linear algebra.

  • 2003: Koetter and MΓ©dard developed the algebraic framework that unified linear network coding with classical coding theory.

  • 2005: Dougherty, Freiling, and Zeger showed that linear codes are insufficient for general multi-source problems — a surprising negative result.

  • 2006: Ho, Médard, Koetter, et al. showed that random linear network coding works with high probability, eliminating the need for centralized code design.

  • 2008: Katti et al. (COPE) demonstrated practical throughput gains from network coding in wireless mesh networks, spurring broad interest in wireless network coding, including the physical-layer network coding direction.

  • 2010s: The connection between network coding and coded caching (Maddah-Ali and Niesen, 2014) opened a new chapter, with coded multicast as the bridge between the two fields.

Today, network coding is a mature theory with practical impact in content distribution, wireless relaying, and distributed storage.

Definition:

Index Coding

The index coding problem is a special case of network coding with a single broadcast link: a server has n messages W_1, …, W_n and broadcasts to n receivers over a shared noiseless link. Receiver k wants W_k and has a subset A_k ⊂ {W_1, …, W_n} \ {W_k} as side information.

The optimal broadcast rate β is the minimum number of transmissions (symbols from F_q) needed to satisfy all receivers. Finding β is NP-hard in general.

Index coding is equivalent to network coding: any network coding problem can be reduced to an index coding problem and vice versa (El Rouayheb and Sprintson, 2010). This equivalence means that understanding index coding is both necessary and sufficient for understanding general network coding.

The side information graph — where there is a directed edge from receiver i to receiver j if receiver j has W_i as side information — encodes the structure of the problem. The clique cover number of this graph gives a simple (but generally loose) upper bound on β.
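The clique-cover bound can be computed directly for tiny instances: if every pair of receivers in a group knows each other's message, one XOR of the group's messages serves them all. A brute-force sketch on a hypothetical 3-receiver instance (the `has` map and function names are illustrative):

```python
from itertools import combinations

# Hypothetical instance: receiver k wants W_k; has[k] = indices it already knows.
# Receivers 1 and 2 know each other's message; receiver 3 knows nothing.
has = {1: {2}, 2: {1}, 3: set()}
n = len(has)

def is_clique(group):
    """A group is a clique iff every pair mutually holds the other's message,
    so one transmission (XOR of the group's messages) serves the whole group."""
    return all(j in has[i] and i in has[j] for i, j in combinations(group, 2))

def min_clique_cover(remaining):
    """Fewest cliques covering all receivers (exponential, fine for tiny n)."""
    if not remaining:
        return 0
    first = min(remaining)
    others = [v for v in remaining if v != first]
    best = n
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            group = (first,) + extra
            if is_clique(group):
                best = min(best, 1 + min_clique_cover(remaining - set(group)))
    return best

beta_upper = min_clique_cover(set(has))
print(beta_upper)  # 2: send W_1 XOR W_2, then W_3 — instead of 3 uncoded sends
```

Here the clique {1, 2} is served by broadcasting W_1 ⊕ W_2, and receiver 3 needs its message sent uncoded, so two transmissions suffice where plain routing needs three.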

Index coding

A network coding problem where a single server broadcasts coded messages to multiple receivers, each with different side information. The goal is to minimize the number of transmissions. Equivalent to the general network coding problem.

Related: Network code

Multicast capacity

The maximum rate at which a single source can deliver the same information to all sinks in a network, equal to min_k mincut(s, t_k) and achievable by linear network coding.

Butterfly Network: Routing vs. Network Coding

Visualize the butterfly network and compare routing-only solutions with network coding. Toggle between different routing strategies and the XOR coding strategy to see how network coding achieves the multicast capacity where routing falls short.

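The butterfly XOR code the visualization demonstrates can be written out in a few lines. A minimal sketch (function and variable names are illustrative): bits b1 and b2 travel the two side paths, the bottleneck edge carries their XOR, and each sink XORs what it hears to recover the bit it did not receive directly.

```python
def butterfly_multicast(b1, b2):
    """Classic butterfly network code: multicast two bits to two sinks
    through a single unit-capacity bottleneck edge."""
    left, right = b1, b2            # source's two outgoing edges
    bottleneck = left ^ right       # coding node XORs its inputs
    # Sink 1 hears (left, bottleneck); sink 2 hears (right, bottleneck).
    sink1 = (left, left ^ bottleneck)     # recovers (b1, b2)
    sink2 = (right ^ bottleneck, right)   # recovers (b1, b2)
    return sink1, sink2

for b1 in (0, 1):
    for b2 in (0, 1):
        s1, s2 = butterfly_multicast(b1, b2)
        assert s1 == (b1, b2) and s2 == (b1, b2)
print("both sinks decode both bits in all four cases")
```

Routing alone cannot deliver both bits to both sinks per time slot, since the bottleneck can forward only one of them; the XOR lets a single symbol serve both sinks at once.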

Network Coding Throughput Gain vs. Network Size

Compare the multicast throughput of routing vs. network coding for random networks of varying size. The gain of network coding grows with network density and the number of sinks.


Quick Check

For the two-unicast problem (two sources, two sinks, each wanting a different message), does network coding always achieve the cut-set bound?

No β€” the cut-set bound is not always tight for multiple unicast, and network coding may not close the gap

Yes β€” the ACLY theorem applies to any source-sink configuration

Yes, but only over sufficiently large finite fields

πŸ”§Engineering Note

Computational Complexity of Random Linear Network Coding

The main computational bottleneck of RLNC is decoding at the sinks, which requires Gaussian elimination over F_q on an h × h matrix:

  • Encoding at each node: O(h) operations per outgoing symbol (one linear combination of h incoming symbols over F_q)
  • Decoding at each sink: O(h^2) operations per source symbol (Gaussian elimination on an h × h matrix, amortized over h symbols)
  • Header overhead: h⌈log_2 q⌉ bits per packet

For practical parameters (h = 10, q = 256), encoding adds ~10 XOR-like operations per packet, decoding requires a 10 × 10 GF(256) matrix inversion, and the header is 10 bytes. These costs are acceptable for most applications.

For real-time streaming, the main concern is decoding delay: the sink must collect hh linearly independent coded packets before decoding any source symbol. This adds a delay of hh packet intervals. Sliding-window RLNC reduces this delay by allowing partial decoding as packets arrive.
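The encode-and-eliminate pipeline can be sketched end to end. The following uses GF(2) rather than the GF(256) of the worked numbers above, so that field arithmetic reduces to XOR while the structure — random coefficient headers, rank collection, Gaussian elimination — is unchanged; all names and parameters here are illustrative:

```python
import random

# Random linear network coding over GF(2): h source packets of plen bits,
# each packet represented as an integer bit-vector.
random.seed(7)
h, plen = 4, 16
source = [random.randrange(1 << plen) for _ in range(h)]

def coded_packet():
    """One coded packet: (random nonzero coefficient vector, coded payload)."""
    coeffs = random.randrange(1, 1 << h)
    payload = 0
    for i in range(h):
        if (coeffs >> i) & 1:
            payload ^= source[i]
    return coeffs, payload

# Sink: keep a basis keyed by leading coefficient bit, discard dependent
# packets, stop once h linearly independent packets have arrived.
basis = {}          # leading bit -> (coeffs, payload)
received = 0
while len(basis) < h:
    c, p = coded_packet()
    received += 1
    for bit in reversed(range(h)):      # reduce by leading bit
        if not (c >> bit) & 1:
            continue
        if bit not in basis:
            basis[bit] = (c, p)
            break
        bc, bp = basis[bit]
        c, p = c ^ bc, p ^ bp           # may reduce to 0 -> dependent, dropped

# Back-substitution (Gauss-Jordan): clear bit b from every other row,
# in ascending order, leaving the identity in the coefficient part.
for b in sorted(basis):
    c, p = basis[b]
    for b2 in list(basis):
        if b2 != b and (basis[b2][0] >> b) & 1:
            basis[b2] = (basis[b2][0] ^ c, basis[b2][1] ^ p)

decoded = [basis[i][1] for i in range(h)]
assert decoded == source
print(f"decoded {h} packets from {received} received")
```

Note the decoding-delay point from the text is visible here: nothing is output until h independent packets have arrived, which is exactly what sliding-window variants relax.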

Practical Constraints

  • Decoding delay: h packet intervals for block RLNC

  • GF(256) arithmetic: supported by hardware AES-NI instructions on modern CPUs

  • Memory: sink buffers O(h^2) field elements for Gaussian elimination

Common Mistake: Network Coding Helps Unicast

Mistake:

Claiming that network coding can increase the throughput of unicast (single source, single sink) over a noiseless network.

Correction:

For unicast over noiseless networks, routing achieves the max-flow (Ford–Fulkerson theorem). Network coding does not increase the unicast throughput. Its advantage appears only for multicast and multi-session problems. Over noisy networks (Chapters 22–23), network coding can help even for unicast, because coding at relays can provide coding gain against noise.

Why This Matters: Physical-Layer Network Coding in Wireless Networks

In wireless networks, simultaneous transmissions naturally superimpose at receivers β€” a property traditionally viewed as interference. Physical-layer network coding (PLNC) turns this "bug" into a "feature": instead of avoiding simultaneous transmissions, PLNC lets them interfere constructively.

In the simplest two-way relay scenario, nodes A and B exchange messages through a relay R. With traditional routing, this requires 4 time slots (A→R, R→B, B→R, R→A). With network coding at R (XOR and broadcast), it takes 3 slots. With PLNC (A and B transmit simultaneously, R decodes the XOR from the superimposed signals), it takes only 2 slots — a 2x improvement over routing.
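The slot saving rests entirely on one identity: each node can cancel its own message from the relay's XOR. A toy bit-level sketch (variable names are illustrative; with PLNC the relay would obtain the XOR directly from the superimposed signals, which this model skips):

```python
# Two-way relay exchange via network coding at the relay.
msg_a = 0b10110010   # A's message
msg_b = 0b01101100   # B's message

# Uplink (two slots, or one slot with PLNC): relay ends up with the XOR.
relay_broadcast = msg_a ^ msg_b

# Downlink (one broadcast slot): each side cancels its own message.
decoded_at_a = relay_broadcast ^ msg_a   # A recovers B's message
decoded_at_b = relay_broadcast ^ msg_b   # B recovers A's message

assert decoded_at_a == msg_b
assert decoded_at_b == msg_a
print("exchange complete: 3 slots with XOR relaying, 2 with PLNC, vs 4 routed")
```

The broadcast slot serves both directions at once, which is where the gain over the four-slot routing schedule comes from.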

This principle underlies the compute-and-forward framework (Chapter 22) and is one of the bridges between network coding and physical-layer design.

Key Takeaway

The multi-source multi-sink network coding problem remains largely open. The cut-set bound provides an outer bound but is not always tight. Linear codes are insufficient for some multi-source problems, and the capacity region of even the two-unicast problem is unknown. The connection to index coding provides a bridge to combinatorial optimization, while the link to interference channels connects to the deep open problems of multiuser information theory.