NP-Hardness and Practical Heuristics

Why Index Coding Is Hard

Having framed index coding as a graph problem, we inherit the graph problem's computational complexity. Computing the chromatic number is NP-hard; computing the linear broadcast rate (minrank) is also NP-hard (Peeters 1996). For general instances, there is no polynomial-time algorithm that finds the optimal index code β€” even non-optimal schemes that come close require heuristics.

This is why coded caching is not "just" applied index coding: the MAN scheme's secret sauce is that the combinatorial placement makes the resulting graph structured, and for structured graphs the bounds coincide and the optimal code is explicit. Without this structure, we would be stuck with heuristics.

Theorem: NP-Hardness of Linear Index Coding

Given an index coding instance I\mathcal{I} and a target rate Rβˆ—R^*, deciding whether Ξ»(I)≀Rβˆ—\lambda(\mathcal{I}) \leq R^* (the linear broadcast rate is at most Rβˆ—R^*) is NP-complete.

Minrank of a graph is a matrix-theoretic invariant that generalizes chromatic number. Chromatic number is NP-hard (Garey & Johnson '79), and minrank is harder still in general fields. Reduction from graph 3-coloring yields NP-completeness for minrank decision, hence for index coding.

,

Greedy Coloring Heuristic for Index Coding

Complexity: O(∣Vβˆ£β‹…Ξ”(G))O(|V| \cdot \Delta(G)) where Ξ”(G)\Delta(G) is the maximum degree.
Input: Conflict graph G=(V,E)G = (V, E), ordering Οƒ\sigma of vertices.
Output: A coloring c:Vβ†’{0,1,…}c: V \to \{0, 1, \ldots\}, which defines a valid
index code of rate ∣c(V)∣|c(V)| (number of colors used).
1. Initialize c(v)←undefinedc(v) \leftarrow \text{undefined} for all v∈Vv \in V.
2. for each vertex vv in ordering Οƒ\sigma do
3. \quad Compute used(v)←{c(u):u∈N(v),c(u)β‰ undefined}\text{used}(v) \leftarrow \{c(u) : u \in N(v), c(u) \neq \text{undefined}\}
4. c(v)←min⁑{iβ‰₯0:iβˆ‰used(v)}\quad c(v) \leftarrow \min \{i \geq 0 : i \notin \text{used}(v)\}
5. end for
6. return cc.

Greedy coloring is a classical heuristic. It does not achieve the chromatic number in general β€” worst case, it uses Ξ”(G)+1\Delta(G) + 1 colors. For MAN conflict graphs, a specific vertex ordering yields the optimal coloring with (Kβˆ’t)/(t+1)(K-t)/(t+1) colors.

Greedy Coloring on a MAN Instance

Apply greedy coloring to the MAN conflict graph for small K,tK, t. Each color corresponds to one coded multicast message in the delivery phase. The number of colors equals the delivery rate in file-size units of 1/(Kt)1/\binom{K}{t}.

Parameters
4
1

Definition:

Clique Cover Bound

A clique cover of GG is a partition of VV into cliques. The minimum clique cover number χˉ(G)=χ(Gˉ)\bar{\chi}(G) = \chi(\bar{G}) equals the chromatic number of the complement Gˉ\bar{G}. For index coding, a clique cover gives a valid scheme: each clique becomes one coded transmission (all pairs in a clique are compatible for XOR).

The fractional clique cover number χˉf(G)\bar{\chi}_f(G) is the LP relaxation, and χˉf=χf(Gˉ)\bar{\chi}_f = \chi_f(\bar{G}).

For MAN instances, the conflict graph is vertex-transitive, and fractional clique cover equals the rate. This structural regularity is what makes coded caching tractable.

Theorem: LP Relaxation Bound

For any index coding instance with conflict graph GG, Ξ±(G)β€…β€Šβ‰€β€…β€ŠΞ²(G)β€…β€Šβ‰€β€…β€ŠΟ‡f(G)β€…β€Š=β€…β€ŠLP-OPT(G),\alpha(G) \;\leq\; \beta(G) \;\leq\; \chi_f(G) \;=\; \text{LP-OPT}(G), where LP-OPT is the optimum of the fractional chromatic number LP.

The LP bound is the strongest polynomial-time computable upper bound on the broadcast rate via XOR coding. For some families it is tight, for others it has an Ω(log⁑n)\Omega(\log n) gap to β\beta (Alon et al. 2008 via Kneser graphs).

MAN Rate vs LP Bound vs Uncoded

Compare, as a function of KK, the MAN integer-tt envelope rate with the LP lower bound K(1βˆ’ΞΌ)/(1+KΞΌ)K(1-\mu)/(1+K\mu) (the continuous optimum, matching the memory-sharing lower envelope) and the uncoded upper bound K(1βˆ’ΞΌ)K(1-\mu). The MAN rate is at most one more color than the LP bound β€” the integer-tt rounding gap.

Parameters
0.2
40
⚠️Engineering Note

What Deployed Systems Use

Real systems that exploit coded multicasting use one of three strategies:

  1. MAN-style placement + closed-form delivery. The approach of this book: design the placement so the delivery is pre-solved. Deployable; limited by subpacketization (Ch 14).
  2. Greedy coloring at runtime. Heuristic; near-optimal in practice for small instances; does not scale to hundreds of users.
  3. PDA / linear-code-based schemes (Ch 14). Polynomial subpacketization, approximate LP bound. Deployable at moderate KK.

Few production systems attempt to solve index coding online. The value of the index coding framework is conceptual: it places coded caching within a larger family and lets us import bounds and intuition from a well-studied literature.

Practical Constraints
  • β€’

    Optimal index coding is NP-hard; runtime solution infeasible for K > 30

  • β€’

    Greedy coloring gives near-optimal rates for random instances

  • β€’

    MAN schemes pre-solve the index coding problem via combinatorial placement

Common Mistake: Greedy Coloring Is Not Always Optimal

Mistake:

Expecting greedy coloring to achieve the chromatic number.

Correction:

Greedy coloring's output depends on the vertex ordering, and in the worst case it uses up to Ξ”(G)+1\Delta(G) + 1 colors β€” which can be far from Ο‡(G)\chi(G). For specific orderings (Welsh–Powell, degree of saturation), greedy performs well, but the problem is still NP-hard. For MAN conflict graphs the natural ordering (by subset lexicographic order) gives the optimal (Kβˆ’t)/(t+1)(K-t)/(t+1) colors β€” but this is a happy accident of the combinatorial structure, not a general property of greedy.