Ferkans — Interactive Telecom Tutor

Why Index Coding Is Hard

Having framed index coding as a graph problem, we inherit the graph problem's computational complexity. Computing the chromatic number is NP-hard; computing the linear broadcast rate (minrank) is also NP-hard (Peeters 1996). For general instances, no polynomial-time algorithm finds the optimal index code unless P = NP, and even schemes that only come close to optimal rely on heuristics.

This is why coded caching is not "just" applied index coding: the MAN scheme's secret sauce is that the combinatorial placement makes the resulting graph structured, and for structured graphs the bounds coincide and the optimal code is explicit. Without this structure, we would be stuck with heuristics.

Theorem: NP-Hardness of Linear Index Coding

Given an index coding instance $\mathcal{I}$ and a target rate $R^*$, deciding whether $\lambda(\mathcal{I}) \leq R^*$ (the linear broadcast rate is at most $R^*$) is NP-complete.

The minrank of a graph is a matrix-theoretic invariant that is bounded above by the chromatic number. Computing the chromatic number is NP-hard (Garey & Johnson '79), and the minrank decision problem is no easier. A reduction from graph 3-coloring yields NP-completeness of the minrank decision problem, and hence of linear index coding.

Proof

Reduce chromatic number to minrank

Peeters (1996) showed that for a graph $G$, $\text{minrk}(G) \leq \chi(G)$, and the reduction preserves enough structure that deciding whether $\text{minrk}(G) \leq k$ is NP-hard.

Connect to index coding

Bar-Yossef et al. (2011) showed that the linear index coding rate equals minrank. Hence deciding $\lambda(\mathcal{I}) \leq R^*$ inherits NP-hardness.
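To make the minrank connection concrete, the following is a brute-force sketch over GF(2) for very small instances. It assumes the side-information convention (receiver $i$ wants message $i$ and already knows the messages listed in `side_info[i]`). The names `gf2_rank` and `minrank_gf2` are illustrative and not taken from any library. The search is exponential in the number of side-information pairs, which is consistent with the hardness result rather than a way around it.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row that has it.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def minrank_gf2(n, side_info):
    """Smallest GF(2) rank of a matrix that 'fits' the instance.

    A fitting matrix has ones on the diagonal, arbitrary entries at
    positions (i, j) with j in side_info[i], and zeros elsewhere.
    Assumption: side_info[i] is the set of messages receiver i knows.
    """
    free = [(i, j) for i in range(n) for j in sorted(side_info[i])]
    best = n  # the identity matrix always fits
    for bits in product((0, 1), repeat=len(free)):
        rows = [1 << i for i in range(n)]  # start from the diagonal
        for (i, j), b in zip(free, bits):
            if b:
                rows[i] |= 1 << j
        best = min(best, gf2_rank(rows))
    return best


# 5-cycle: each receiver knows its two neighbours' messages.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(minrank_gf2(5, cycle5))
```

For the 5-cycle the search visits $2^{10}$ matrices and returns 3, so no scalar linear code over GF(2) beats three transmissions on that instance.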

Non-linear rate

The non-linear (general) broadcast rate $\beta(\mathcal{I})$ is also NP-hard to compute. In fact, it is not even known to be computable by a finite algorithm — the existence of an algorithm that computes $\beta$ in finite time is open (as of 2026).

Consequence

For general index coding we are stuck with heuristics and bounds. Structured instances — the MAN family — admit polynomial-time optimal solutions. $\blacksquare$


Greedy Coloring Heuristic for Index Coding

Complexity: $O(|V| \cdot \Delta(G))$, where $\Delta(G)$ is the maximum degree.

Input: Conflict graph $G = (V, E)$, ordering $\sigma$ of vertices.

Output: A coloring $c: V \to \{0, 1, \ldots\}$, which defines a valid index code of rate $|c(V)|$ (number of colors used).

1. Initialize $c(v) \leftarrow \text{undefined}$ for all $v \in V$.
2. for each vertex $v$ in ordering $\sigma$ do
3. $\quad$ Compute $\text{used}(v) \leftarrow \{c(u) : u \in N(v), c(u) \neq \text{undefined}\}$
4. $\quad c(v) \leftarrow \min \{i \geq 0 : i \notin \text{used}(v)\}$
5. end for
6. return $c$.
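The greedy procedure above can be sketched in Python as follows. This is a minimal illustration, assuming the conflict graph is given as an adjacency dictionary; the name `greedy_coloring` and the example graph are chosen here for illustration only.

```python
def greedy_coloring(order, adj):
    """Greedy coloring of a graph.

    order: iterable of vertices (the ordering sigma).
    adj:   dict mapping each vertex to the set of its neighbours.
    Returns a dict vertex -> color index (0, 1, 2, ...).
    """
    color = {}
    for v in order:
        # Colors already taken by colored neighbours of v.
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # smallest color not in used(v)
            c += 1
        color[v] = c
    return color


# Example: a 5-cycle, processed in the order 0, 1, 2, 3, 4.
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = greedy_coloring(range(5), adj)
rate = len(set(coloring.values()))  # number of colors = transmissions
print(coloring, rate)
```

On the 5-cycle the last vertex sees both colors 0 and 1 among its neighbours, so three colors are used, which matches the chromatic number of an odd cycle.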

Greedy coloring is a classical heuristic. It does not achieve the chromatic number in general: in the worst case it uses $\Delta(G) + 1$ colors. For MAN conflict graphs, a specific vertex ordering yields the optimal coloring with $\binom{K}{t+1}$ colors, which corresponds to the delivery rate $\binom{K}{t+1}/\binom{K}{t} = (K-t)/(t+1)$.

Greedy Coloring on a MAN Instance

Apply greedy coloring to the MAN conflict graph for small $K, t$. Each color corresponds to one coded multicast message in the delivery phase. The number of colors equals the delivery rate in file-size units of $1/\binom{K}{t}$.

Parameters: number of users $K = 4$, gain parameter $t = 1$.
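The experiment can be reproduced offline with a short sketch. The graph construction below is an assumption made for illustration: a vertex $(k, T)$ stands for the subfile requested by user $k$ and cached by the users in $T$ (with $|T| = t$, $k \notin T$), and two vertices conflict unless each user caches the other's subfile. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def man_conflict_graph(K, t):
    """Vertices (k, T) in lexicographic order, plus conflict adjacency."""
    verts = [(k, T) for k in range(K)
             for T in combinations(range(K), t) if k not in T]
    adj = {v: set() for v in verts}
    for a, b in combinations(verts, 2):
        (k1, T1), (k2, T2) = a, b
        # XOR-compatible only if each user already caches the other's subfile.
        compatible = (k1 in T2) and (k2 in T1)
        if not compatible:
            adj[a].add(b)
            adj[b].add(a)
    return verts, adj


def greedy_coloring(order, adj):
    """Assign each vertex the smallest color unused by its colored neighbours."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


K, t = 4, 1
verts, adj = man_conflict_graph(K, t)
coloring = greedy_coloring(verts, adj)
num_colors = len(set(coloring.values()))
rate = Fraction(num_colors, comb(K, t))  # in units of one file
print(num_colors, rate)
```

For $K = 4$, $t = 1$ this prints 6 colors and rate $3/2$, in agreement with $\binom{4}{2} = 6$ and $(K-t)/(t+1) = 3/2$. For larger $t$ the number of colors found by this sketch depends on the vertex ordering, so it should be checked against $\binom{K}{t+1}$ rather than assumed.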

Definition: Clique Cover Bound

A clique cover of $G$ is a partition of $V$ into cliques. The minimum clique cover number $\bar{\chi}(G) = \chi(\bar{G})$ equals the chromatic number of the complement $\bar{G}$. For index coding, a clique cover gives a valid scheme: each clique becomes one coded transmission (all pairs in a clique are compatible for XOR).

The fractional clique cover number $\bar{\chi}_f(G)$ is the LP relaxation, and $\bar{\chi}_f = \chi_f(\bar{G})$ .

For MAN instances, the conflict graph is vertex-transitive, and fractional clique cover equals the rate. This structural regularity is what makes coded caching tractable.

Theorem: LP Relaxation Bound

For any index coding instance with conflict graph $G$ , $\alpha(G) \;\leq\; \beta(G) \;\leq\; \chi_f(G) \;=\; \text{LP-OPT}(G),$ where LP-OPT is the optimum of the fractional chromatic number LP.

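The LP bound is the strongest upper bound on the broadcast rate obtainable by time-sharing among XOR (coloring) schemes. The LP has one variable per independent set, so in general it is exponentially large and computing $\chi_f$ is itself NP-hard; it is tractable mainly for structured graphs. For some families the bound is tight; for others it has an $\Omega(\log n)$ gap to $\beta$ (Alon et al. 2008 via Kneser graphs).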

Proof

LP formulation

$\chi_f(G) = \min \sum_I x_I$ subject to $\sum_{I \ni v} x_I \geq 1$ for all $v$ and $x_I \geq 0$, where $I$ ranges over independent sets.
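As a worked instance of this LP, the sketch below certifies $\chi_f(C_5) = 5/2$ for the 5-cycle without an LP solver. It checks a hand-picked primal solution (weight $1/2$ on each of the five independent sets $\{i, i+2\}$) and a hand-picked dual solution (weight $1/2$ on each vertex). Both solutions are assumptions chosen for this example; equal primal and dual values prove optimality by LP duality.

```python
from fractions import Fraction
from itertools import combinations


def independent_sets(n, edges):
    """All non-empty independent sets of a graph on vertices 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            if all(frozenset(p) not in edge_set for p in combinations(S, 2)):
                result.append(frozenset(S))
    return result


n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
ind_sets = independent_sets(n, edges)

# Primal: x_I = 1/2 on the five sets {i, i+2 mod 5}.
x = {frozenset({i, (i + 2) % n}): Fraction(1, 2) for i in range(n)}
primal_ok = all(I in ind_sets for I in x) and all(
    sum(w for I, w in x.items() if v in I) >= 1 for v in range(n)
)
primal_value = sum(x.values())

# Dual: y_v = 1/2 on every vertex; each independent set must weigh <= 1.
y = [Fraction(1, 2)] * n
dual_ok = all(sum(y[v] for v in I) <= 1 for I in ind_sets)
dual_value = sum(y)

print(primal_ok, dual_ok, primal_value, dual_value)
```

Both checks pass with value $5/2$, strictly below the integer chromatic number 3 of the 5-cycle, which is the kind of gain the fractional relaxation captures.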

Fractional coloring gives valid index code

Any fractional coloring with total weight $W$ can be "rounded" to a scheme of rate $W + o(1)$ by repeating the fractional scheme long enough. This gives $\beta \leq \chi_f$.

Independence lower bound

By LP duality, the dual of the fractional chromatic number LP is the fractional clique number LP, and the two optima coincide. For perfect graphs the integral and fractional quantities agree as well. In general, $\alpha \leq \beta \leq \chi_f$, with equality throughout only under special conditions. $\blacksquare$

MAN Rate vs LP Bound vs Uncoded

Compare, as a function of $K$, the MAN integer-$t$ envelope rate with the LP lower bound $K(1-\mu)/(1+K\mu)$ (the continuous optimum, matching the memory-sharing lower envelope) and the uncoded upper bound $K(1-\mu)$. The MAN rate is at most one more color than the LP bound: the integer-$t$ rounding gap.

Parameters: memory ratio $M/N = 0.2$, max $K = 40$.
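The three curves can be tabulated with a few lines of exact arithmetic. This sketch assumes $\mu = M/N$, $t = K\mu$, and that non-integer $t$ is handled by memory sharing (linear interpolation) between the two neighbouring integer points; the function names are illustrative.

```python
from fractions import Fraction
from math import floor


def man_rate_integer_t(K, t):
    """MAN delivery rate (K - t) / (t + 1) at an integer t."""
    return Fraction(K - t, t + 1)


def man_envelope_rate(K, mu):
    """Memory-sharing envelope: interpolate between floor(t) and floor(t) + 1."""
    t = K * mu
    lo = floor(t)
    if t == lo:
        return man_rate_integer_t(K, lo)
    w = t - lo  # fraction of the file handled at t = lo + 1
    return (1 - w) * man_rate_integer_t(K, lo) + w * man_rate_integer_t(K, lo + 1)


def continuous_bound(K, mu):
    """The expression K(1 - mu) / (1 + K mu) from the text."""
    return K * (1 - mu) / (1 + K * mu)


def uncoded_rate(K, mu):
    """Uncoded delivery: each user fetches its missing fraction separately."""
    return K * (1 - mu)


mu = Fraction(1, 5)  # memory ratio M/N = 0.2
for K in (5, 7, 40):
    print(K, continuous_bound(K, mu), man_envelope_rate(K, mu), uncoded_rate(K, mu))
```

At $K = 5$ (so $t = 1$) the envelope and the continuous expression both equal 2, while at $K = 7$ ($t = 1.4$) the envelope is $37/15$ against $7/3$. The difference arises only where $t$ is not an integer, because $(K-t)/(t+1)$ is convex in $t$.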

⚠️Engineering Note

What Deployed Systems Use

Real systems that exploit coded multicasting use one of three strategies:

• MAN-style placement + closed-form delivery. The approach of this book: design the placement so the delivery is pre-solved. Deployable; limited by subpacketization (Ch 14).
• Greedy coloring at runtime. Heuristic; near-optimal in practice for small instances; does not scale to hundreds of users.
• PDA / linear-code-based schemes (Ch 14). Polynomial subpacketization, approximate LP bound. Deployable at moderate $K$.

Few production systems attempt to solve index coding online. The value of the index coding framework is conceptual: it places coded caching within a larger family and lets us import bounds and intuition from a well-studied literature.

Practical Constraints

• Optimal index coding is NP-hard; runtime solution infeasible for K > 30
• Greedy coloring gives near-optimal rates for random instances
• MAN schemes pre-solve the index coding problem via combinatorial placement

Common Mistake: Greedy Coloring Is Not Always Optimal

Mistake:

Expecting greedy coloring to achieve the chromatic number.

Correction:

Greedy coloring's output depends on the vertex ordering, and in the worst case it uses up to $\Delta(G) + 1$ colors, which can be far from $\chi(G)$. For specific orderings (Welsh–Powell, degree of saturation), greedy often performs well, but computing $\chi(G)$ remains NP-hard. For MAN conflict graphs the natural ordering (by subset lexicographic order) gives the optimal $\binom{K}{t+1}$ colors, i.e. rate $(K-t)/(t+1)$, but this is a happy accident of the combinatorial structure, not a general property of greedy.
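A standard textbook example of this ordering sensitivity is the crown graph: a complete bipartite graph on $\{a_i\}$ and $\{b_i\}$ with the matching edges $a_i b_i$ removed. It is 2-colorable, yet an interleaved ordering forces greedy to use $n$ colors. The sketch below is illustrative, with hypothetical function names, and is not tied to any MAN instance.

```python
def greedy_coloring(order, adj):
    """Assign each vertex the smallest color unused by its colored neighbours."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def crown_graph(n):
    """Bipartite graph where a_i is adjacent to b_j exactly when i != j."""
    adj = {("a", i): set() for i in range(n)}
    adj.update({("b", i): set() for i in range(n)})
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[("a", i)].add(("b", j))
                adj[("b", j)].add(("a", i))
    return adj


def num_colors(order, adj):
    return len(set(greedy_coloring(order, adj).values()))


n = 4
adj = crown_graph(n)
# Bad ordering: a_0, b_0, a_1, b_1, ... pairs up non-adjacent vertices.
bad_order = [(side, i) for i in range(n) for side in ("a", "b")]
# Good ordering: all a-vertices first, then all b-vertices.
good_order = [("a", i) for i in range(n)] + [("b", i) for i in range(n)]
print(num_colors(bad_order, adj), num_colors(good_order, adj))
```

With $n = 4$ the interleaved ordering uses 4 colors and the side-by-side ordering uses 2, on the same graph.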
