Multi-Source Multi-Sink Networks
Beyond Single-Source Multicast
The network coding theorem gives a clean, single-letter characterization of the multicast capacity for a single source. But real networks have multiple sources sending different information to different destinations. What is the capacity region of a general multi-source multi-sink network?
The short answer: we do not know, in general. The multi-source problem is dramatically harder than the single-source case, and the characterization of the rate region remains one of the major open problems in network information theory. In this section, we survey what is known and what remains open.
Definition: Multi-Source Multi-Sink Network Coding Problem
A multi-source multi-sink network coding problem on a directed acyclic graph consists of:
- Multiple source nodes $s_1, \dots, s_k$, each generating an independent message $W_i$ at rate $R_i$
- Multiple sink nodes $t_1, \dots, t_m$, where sink $t_j$ demands a subset $\mathcal{D}_j \subseteq \{W_1, \dots, W_k\}$ of the messages
The rate region is the set of rate tuples $(R_1, \dots, R_k)$ for which all sinks can reliably recover their demanded messages using network coding at intermediate nodes.
Special cases:
- Multiple unicast: each source has a dedicated sink ($m = k$, sink $t_i$ demands only $W_i$)
- Single-source multicast: $k = 1$, $\mathcal{D}_j = \{W_1\}$ for all $j$ (solved by ACLY: Ahlswede, Cai, Li, and Yeung)
- Multiple multicast: each source multicasts its message to all sinks
Theorem: Cut-Set Bound for Multi-Source Networks
For a multi-source multi-sink network, the rate region is contained in the set of rate tuples $(R_1, \dots, R_k)$ satisfying, for every cut $(S, S^c)$ separating a subset $A$ of sources from the sinks that demand their messages,
$$\sum_{i \in A} R_i \le \sum_{e \in \mathrm{cut}(S, S^c)} c(e).$$
This is the network generalization of the max-flow min-cut bound.
The cut-set bound says that the total rate of information that needs to cross any cut cannot exceed the cut capacity. For single-source multicast, this bound is tight (the ACLY theorem achieves it). For multi-source problems, the cut-set bound is generally not tight: there are examples where the achievable rate region is strictly smaller than the cut-set bound.
Proof sketch
Consider any cut $(S, S^c)$. The messages from sources in $S$ that are demanded by sinks in $S^c$ must traverse the cut. The total rate of these messages is bounded by the information that can cross the cut, which is at most $\sum_{e \in \mathrm{cut}(S, S^c)} c(e)$ symbols per time unit.
Formally, this follows from the independence of the source messages and the capacity constraint on each edge.
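The cut capacities in this bound can be computed with any max-flow routine. As an illustrative sketch (the graph, node names, and unit capacities below are chosen for the example), the following computes the single-source multicast capacity of the classic butterfly network as the minimum over sinks of the source-sink max-flow:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow: BFS for shortest augmenting paths."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}   # residual capacities
    for u in list(res):
        for v in list(res[u]):
            res.setdefault(v, {}).setdefault(u, 0)     # add reverse edges
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:               # BFS for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                                # no augmenting path left
        aug, v = float("inf"), t                       # bottleneck along the path
        while parent[v] is not None:
            aug = min(aug, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:                   # push flow, update residuals
            res[parent[v]][v] -= aug
            res[v][parent[v]] += aug
            v = parent[v]
        flow += aug

# Butterfly network: all edges have capacity 1
edges = [("s", "a"), ("s", "b"), ("a", "t1"), ("b", "t2"),
         ("a", "c"), ("b", "c"), ("c", "d"), ("d", "t1"), ("d", "t2")]
cap = {}
for u, v in edges:
    cap.setdefault(u, {})[v] = 1

# Multicast capacity = min over sinks of the max-flow (ACLY theorem)
print(min(max_flow(cap, "s", t) for t in ("t1", "t2")))  # 2
```

Routing alone cannot deliver rate 2 to both sinks through the shared edge $c \to d$, which is exactly where the XOR coding of the butterfly example comes in.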
Example: Two-Unicast Networks: When the Cut-Set Bound Is Not Tight
Consider a network with two sources $s_1, s_2$ and two sinks $t_1, t_2$, where $t_i$ wants $W_i$ only. The network routes both sources through a single intermediate node $m$:
$s_1 \to m$ (cap 1), $s_2 \to m$ (cap 1), $m \to t_1$ (cap 1), $m \to t_2$ (cap 1).
The cut-set bound allows $R_1 + R_2 \le 2$ (cut at the sources) and $R_1 \le 1$, $R_2 \le 1$ (individual cuts). Can we achieve $(R_1, R_2) = (1, 1)$?
Analysis
Node $m$ receives both $W_1$ and $W_2$ and has two outgoing edges, each of capacity 1. To deliver $W_1$ to $t_1$ and $W_2$ to $t_2$, node $m$ must send $W_1$ on the edge to $t_1$ and $W_2$ on the edge to $t_2$.
This works: $(1, 1)$ is achievable by simple routing.
But now consider a different topology where the paths to $t_1$ and $t_2$ share a bottleneck edge of capacity 1. Then node $m$ can only send one symbol per time unit on the shared link, and it must choose between $W_1$ and $W_2$ (or a coded combination). If $t_1$ wants only $W_1$ and $t_2$ wants only $W_2$, then network coding (sending $W_1 \oplus W_2$) does not help: $t_1$ cannot recover $W_1$ from $W_1 \oplus W_2$ without knowing $W_2$.
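This decoding obstruction can be verified exhaustively. The sketch below uses a deliberately tiny model (one shared 1-bit bottleneck symbol, sinks with no side information, both assumptions for illustration): it enumerates all 16 functions the bottleneck node could apply to $(W_1, W_2)$ and confirms that none lets both sinks decode their own message.

```python
from itertools import product

def decodable(f, which):
    """Can a sink recover message number `which` from the symbol f(w1, w2) alone?"""
    seen = {}
    for w1, w2 in product((0, 1), repeat=2):
        sym = f(w1, w2)
        want = (w1, w2)[which]
        if seen.setdefault(sym, want) != want:
            return False   # same channel symbol maps to two message values
    return True

# All 16 possible 1-bit functions the bottleneck node could send
funcs = [lambda w1, w2, t=t: t[2 * w1 + w2] for t in product((0, 1), repeat=4)]

# Count functions that let t1 decode W1 AND t2 decode W2 (no side information)
ok = sum(1 for f in funcs if decodable(f, 0) and decodable(f, 1))
print(ok)  # 0: no single bottleneck symbol serves both unicast sessions
```

Decodability for $t_1$ forces the symbol to depend only on $W_1$, and for $t_2$ only on $W_2$; the two requirements are incompatible, which is why the count is zero.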
Key insight
Network coding helps for multicast (same data to all sinks) but does not necessarily help for multiple unicast (different data to different sinks). For the two-unicast problem, the capacity region depends on the specific topology, and the cut-set bound is not always achievable.
This is fundamentally different from the single-source case and is one reason why the general multi-source problem remains open.
Open Problems in Network Coding
Several fundamental questions remain unresolved:
- Multiple unicast capacity: The capacity region of the two-unicast problem (two source-sink pairs) over a general network is unknown. This is closely related to the interference channel capacity problem (Chapter 17).
- Insufficiency of linear codes: For general multi-source problems, linear codes over any finite field may not achieve the capacity. A matroid-based construction of Dougherty, Freiling, and Zeger (2005) gave the first example where nonlinear codes strictly outperform linear codes.
- Non-Shannon inequalities: The cut-set bound uses only Shannon-type information inequalities (nonnegativity, monotonicity, and submodularity of entropy). There exist non-Shannon inequalities (Zhang-Yeung, 1998) that provide tighter outer bounds for some networks. Whether these are ever needed to characterize the capacity is an active research question.
- Index coding: The index coding problem (a special case of network coding where there is a single bottleneck link and each receiver has side information) is equivalent to the general network coding problem. Despite its seeming simplicity, the index coding capacity is unknown in general.
Historical Note: The Evolution of Network Coding (2000-2020)
The field of network coding evolved rapidly after the ACLY paper of 2000:
- 2003: Li, Yeung, and Cai proved that linear codes suffice for single-source multicast, reducing the problem to linear algebra.
- 2003: Koetter and Médard developed the algebraic framework that unified linear network coding with classical coding theory.
- 2005: Dougherty, Freiling, and Zeger showed that linear codes are insufficient for general multi-source problems, a surprising negative result.
- 2006: Ho, Médard, Koetter, et al. showed that random linear network coding works with high probability, eliminating the need for centralized code design.
- 2008: Katti et al. (COPE) demonstrated practical gains from network coding in wireless mesh networks, inspiring the "physical-layer network coding" direction.
- 2010s: The connection between network coding and coded caching (Maddah-Ali and Niesen, 2014) opened a new chapter, with coded multicast as the bridge between the two fields.
Today, network coding is a mature theory with practical impact in content distribution, wireless relaying, and distributed storage.
Definition: Index Coding
The index coding problem is a special case of network coding with a single broadcast link: a server has messages $W_1, \dots, W_n$ and broadcasts to $n$ receivers over a shared noiseless link. Receiver $i$ wants $W_i$ and has a subset $A_i \subseteq \{W_j : j \ne i\}$ as side information.
The optimal broadcast rate $\beta$ is the minimum number of transmissions (normalized by the message length) needed to satisfy all receivers. Finding $\beta$ is NP-hard in general.
Index coding is equivalent to network coding: any network coding problem can be reduced to an index coding problem and vice versa (El Rouayheb and Sprintson, 2010). This equivalence means that understanding index coding is both necessary and sufficient for understanding general network coding.
The side information graph, which has a directed edge from receiver $i$ to receiver $j$ if receiver $i$ has $W_j$ as side information, encodes the structure of the problem. The clique cover number of this graph gives a simple (but generally loose) upper bound on $\beta$.
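As a small worked instance of the clique cover bound (the byte-valued messages and the value of $n$ are illustrative): when every receiver knows all messages except its own, the side information graph is a complete clique, the clique cover number is 1, and a single XOR broadcast serves all $n$ receivers at once.

```python
from functools import reduce
from operator import xor
import random

n = 4
rng = random.Random(0)
msgs = [rng.randrange(256) for _ in range(n)]   # messages W_1..W_n as bytes

# Complete side information: receiver i already knows every W_j with j != i.
# The side information graph is a complete clique, so one transmission suffices.
broadcast = reduce(xor, msgs)                   # server sends W_1 ^ ... ^ W_n

for i in range(n):
    known = (msgs[j] for j in range(n) if j != i)
    decoded = reduce(xor, known, broadcast)     # strip the known messages
    assert decoded == msgs[i]                   # receiver i recovers its message

print("1 transmission instead of", n)
```

Uncoded routing would need $n$ separate transmissions here, so the coding gain of this instance is a factor of $n$.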
Index coding
A network coding problem where a single server broadcasts coded messages to multiple receivers, each with different side information. The goal is to minimize the number of transmissions. Equivalent to the general network coding problem.
Related: Network code
Multicast capacity
The maximum rate at which a single source can deliver the same information to all sinks in a network, equal to $\min_t \mathrm{maxflow}(s, t)$, the minimum over all sinks $t$ of the max-flow from the source $s$, and achievable by linear network coding.
Butterfly Network: Routing vs. Network Coding
Visualize the butterfly network and compare routing-only solutions with network coding. Toggle between different routing strategies and the XOR coding strategy to see how network coding achieves the multicast capacity where routing falls short.
Parameters
Routing strategy or network coding
Network Coding Throughput Gain vs. Network Size
Compare the multicast throughput of routing vs. network coding for random networks of varying size. The gain of network coding grows with network density and the number of sinks.
Parameters
Probability of each edge existing in the random graph
Quick Check
For the two-unicast problem (two sources, two sinks, each wanting a different message), does network coding always achieve the cut-set bound?
No: the cut-set bound is not always tight for multiple unicast, and network coding may not close the gap
Yes: the ACLY theorem applies to any source-sink configuration
Yes, but only over sufficiently large finite fields
Unlike single-source multicast, the cut-set bound for multiple unicast is not always achievable. Network coding (even nonlinear) cannot always achieve the cut-set bound for this problem. The gap between the achievable region and the cut-set bound depends on the network topology and is related to the open problem of interference channel capacity.
Computational Complexity of Random Linear Network Coding
The main computational bottleneck of RLNC is decoding at the sinks, which requires Gaussian elimination over $\mathbb{F}_q$ on an $n \times n$ matrix:
- Encoding at each node: $O(n)$ operations per outgoing symbol (one linear combination of $n$ incoming symbols over $\mathbb{F}_q$)
- Decoding at each sink: $O(n^2)$ operations per source symbol (Gaussian elimination on an $n \times n$ matrix costs $O(n^3)$, amortized over the $n$ source symbols)
- Header overhead: $n \log_2 q$ bits per packet for the coefficient vector
For practical parameters (generation size $n = 10$, field size $q = 256$), encoding adds ~10 multiply-and-XOR operations per symbol, decoding requires a $10 \times 10$ GF(256) matrix inversion, and the header is 10 bytes. These costs are acceptable for most applications.
For real-time streaming, the main concern is decoding delay: the sink must collect $n$ linearly independent coded packets before decoding any source symbol. This adds a delay of up to $n$ packet intervals. Sliding-window RLNC reduces this delay by allowing partial decoding as packets arrive.
- Decoding delay: $n$ packet intervals for block RLNC
- GF(256) arithmetic: efficiently implemented with log/antilog tables, SIMD byte shuffles, or dedicated Galois-field instructions on modern CPUs
- Memory: the sink buffers $n$ coded packets of $n + L$ field elements each (coefficient vector plus payload of length $L$) for Gaussian elimination
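The encode and decode steps above can be sketched end to end. This is a minimal, unoptimized illustration (table-free GF(256) arithmetic with the AES polynomial, a small generation of $n = 4$ packets of 8 bytes; real implementations use lookup tables or SIMD):

```python
import random

# GF(256) with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B)
def gmul(a, b):
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def ginv(a):
    # Brute-force inverse; a 256-entry table would be used in practice
    return next(x for x in range(1, 256) if gmul(a, x) == 1)

def encode(packets, rng):
    """One coded packet: a random linear combination of the generation."""
    coeffs = [rng.randrange(256) for _ in packets]
    body = [0] * len(packets[0])
    for c, pkt in zip(coeffs, packets):
        for k, sym in enumerate(pkt):
            body[k] ^= gmul(c, sym)
    return coeffs, body

def decode(coded, n):
    """Gauss-Jordan elimination on [coefficients | payload] over GF(256)."""
    rows = [list(c) + list(b) for c, b in coded]
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col])  # raises if singular
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = ginv(rows[col][col])
        rows[col] = [gmul(inv, x) for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [x ^ gmul(f, y) for x, y in zip(rows[r], rows[col])]
    return [bytes(row[n:]) for row in rows]

rng = random.Random(42)
n = 4
src = [bytes(rng.randrange(256) for _ in range(8)) for _ in range(n)]

# Random coefficient matrices are invertible with high probability; redraw
# on the (rare) singular event rather than assuming success.
while True:
    coded = [encode(src, rng) for _ in range(n)]
    try:
        recovered = decode(coded, n)
        break
    except StopIteration:
        pass

assert recovered == src
```

The redraw loop mirrors the "with high probability" guarantee of RLNC: in practice a sink simply keeps collecting coded packets until the coefficient matrix has full rank.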
Common Mistake: Network Coding Helps Unicast
Mistake:
Claiming that network coding can increase the throughput of unicast (single source, single sink) over a noiseless network.
Correction:
For unicast over noiseless networks, routing achieves the max-flow (Ford-Fulkerson theorem). Network coding does not increase the unicast throughput. Its advantage appears only for multicast and multi-session problems. Over noisy networks (Chapters 22-23), network coding can help even for unicast, because coding at relays can provide coding gain against noise.
Why This Matters: Physical-Layer Network Coding in Wireless Networks
In wireless networks, simultaneous transmissions naturally superimpose at receivers, a property traditionally viewed as interference. Physical-layer network coding (PLNC) turns this "bug" into a "feature": instead of avoiding simultaneous transmissions, PLNC lets them interfere constructively.
In the simplest two-way relay scenario, nodes A and B exchange messages through a relay R. With traditional routing, this requires 4 time slots (A → R, R → B, B → R, R → A). With network coding at R (XOR and broadcast), it takes 3 slots. With PLNC (A and B transmit simultaneously, R decodes the XOR from the superimposed signals), it takes only 2 slots: a 2x improvement over routing.
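At the message level, the 3-slot network-coded exchange reduces to a pair of XORs. A minimal sketch with illustrative 4-bit messages (the physical-layer details of PLNC are abstracted away):

```python
# Two-way relay exchange with network coding at the relay R.
# Slot 1: A -> R sends w_a.   Slot 2: B -> R sends w_b.
# Slot 3: R broadcasts w_a XOR w_b; each endpoint strips its own message.
# (With PLNC, slots 1 and 2 merge into one simultaneous transmission.)
w_a, w_b = 0b1011, 0b0110
relay_broadcast = w_a ^ w_b                # coding at the relay
b_recovered_at_a = relay_broadcast ^ w_a   # A knows w_a, recovers w_b
a_recovered_at_b = relay_broadcast ^ w_b   # B knows w_b, recovers w_a
assert (b_recovered_at_a, a_recovered_at_b) == (w_b, w_a)
print("3 slots with network coding vs. 4 with routing")
```

Each endpoint's own message acts as side information, exactly as in the index coding view of the problem.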
This principle underlies the compute-and-forward framework (Chapter 22) and is one of the bridges between network coding and physical-layer design.
Key Takeaway
The multi-source multi-sink network coding problem remains largely open. The cut-set bound provides an outer bound but is not always tight. Linear codes are insufficient for some multi-source problems, and the capacity region of even the two-unicast problem is unknown. The connection to index coding provides a bridge to combinatorial optimization, while the link to interference channels connects to the deep open problems of multiuser information theory.