Arithmetic Coding
Beyond One Bit Per Symbol
Huffman coding has a fundamental limitation: it assigns an integer number of bits to each symbol. When the entropy is, say, 0.1 bits per symbol, Huffman must still use at least 1 bit per symbol, a 10x overhead. Arithmetic coding sidesteps this by encoding an entire sequence as a single number in $[0, 1)$, using a number of bits that can be a fraction per symbol. The idea is beautifully simple: represent the probability interval of the sequence, and the code length equals $\log_2(1/P(x^n))$ plus at most 2 bits of overhead for the entire sequence.
Definition: Arithmetic Code
An arithmetic code for a source with distribution $P$ on alphabet $\mathcal{X}$ represents a sequence $x^n = (x_1, \dots, x_n)$ by a binary fraction in the sub-interval of $[0, 1)$ corresponding to $x^n$. The encoding proceeds iteratively:
- Start with the interval $[0, 1)$.
- For each symbol $x_i$, subdivide the current interval into sub-intervals proportional to the symbol probabilities $P(a)$, $a \in \mathcal{X}$.
- Select the sub-interval corresponding to $x_i$ and continue.
- The final interval has width $P(x^n) = \prod_{i=1}^{n} P(x_i)$.
- Output any binary fraction in this interval, using $\lceil \log_2(1/P(x^n)) \rceil + 1$ bits.
The genius of arithmetic coding is that it amortizes the rounding loss over the entire sequence. While Huffman loses up to 1 bit per symbol, arithmetic coding loses at most 2 bits per sequence, regardless of sequence length.
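The subdivision procedure is easy to express directly. Below is a minimal Python sketch of it; the function names, the dictionary-based probability table, and the sorted subdivision order are illustrative choices, not part of the definition:

```python
from math import ceil, log2

def arithmetic_interval(sequence, probs):
    """Compute the final interval [low, low + width) for a sequence.

    probs maps each symbol to its probability; within each subdivision the
    symbols are laid out in a fixed (here: sorted) order.
    """
    low, width = 0.0, 1.0
    order = sorted(probs)                       # fixed symbol order, shared with the decoder
    for symbol in sequence:
        cum = sum(probs[s] for s in order if s < symbol)   # probability mass below `symbol`
        low += width * cum                      # move to the chosen sub-interval
        width *= probs[symbol]                  # shrink by P(symbol)
    return low, width

def code_length(width):
    """ceil(log2(1/width)) + 1 bits suffice to name a point in the interval."""
    return ceil(log2(1.0 / width)) + 1
```

The subdivision order itself is arbitrary; all that matters is that encoder and decoder agree on it.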
Example: Arithmetic Coding of a Binary Sequence
Encode the sequence $x^3 = (a, b, a)$ using arithmetic coding with $P(a) = 0.7$, $P(b) = 0.3$.
Initialize
Start with the interval $[0, 1)$.
Encode $x_1 = a$
Subdivide $[0, 1)$: the subinterval for $a$ is $[0, 0.7)$, for $b$ is $[0.7, 1)$. Select $[0, 0.7)$.
Encode $x_2 = b$
Subdivide $[0, 0.7)$: the subinterval for $a$ is $[0, 0.49)$, for $b$ is $[0.49, 0.7)$. Select $[0.49, 0.7)$.
Encode $x_3 = a$
Subdivide $[0.49, 0.7)$: width $0.21$. The subinterval for $a$ is $[0.49, 0.637)$, for $b$ is $[0.637, 0.7)$. Select $[0.49, 0.637)$.
Output
Final interval: $[0.49, 0.637)$, width $P(x^3) = 0.7 \times 0.3 \times 0.7 = 0.147$. We need $\lceil \log_2(1/0.147) \rceil + 1 = 3 + 1 = 4$ bits. Binary fraction: $0.1000_2 = 0.5$ is in $[0.49, 0.637)$. So the codeword is $1000$.
Total: 4 bits for 3 symbols $\approx 1.33$ bits/symbol. For comparison: $H(X) \approx 0.88$ bits/symbol. The overhead is $\approx 0.45$ bits/symbol, large for $n = 3$, but the overhead per symbol shrinks as $2/n$ for longer sequences.
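The hand computation can be checked numerically; this short sketch assumes the same probabilities $P(a) = 0.7$, $P(b) = 0.3$ used in the example above:

```python
from math import ceil, log2

# Re-derive the worked example: sequence "aba" with P(a) = 0.7, P(b) = 0.3.
low, width = 0.0, 1.0
for symbol in "aba":
    if symbol == "b":                          # 'a' occupies the lower part of each interval
        low += width * 0.7
    width *= 0.7 if symbol == "a" else 0.3

print(round(low, 6), round(low + width, 6))    # -> 0.49 0.637  (final interval)

bits = ceil(log2(1.0 / width)) + 1             # -> 4
m = ceil(low * 2**bits)                        # smallest m with m / 2^bits >= low
assert m / 2**bits < low + width               # the chosen point lies inside the interval
print(bits, format(m, f"0{bits}b"))            # -> 4 1000  (the codeword)
```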
Theorem: Rate of Arithmetic Coding
For a DMS with distribution $P$, the arithmetic code for a length-$n$ sequence $x^n$ uses at most $\lceil \log_2(1/P(x^n)) \rceil + 1 \leq \log_2(1/P(x^n)) + 2$ bits. The expected rate satisfies $$H(X) \;\leq\; \frac{1}{n}\,\mathbb{E}[\ell(X^n)] \;\leq\; H(X) + \frac{2}{n}.$$ As $n \to \infty$, the rate approaches $H(X)$.
Arithmetic coding achieves essentially the ideal code length $\log_2(1/P(x^n))$ for each sequence, with only a constant (2-bit) overhead independent of $n$. This is why arithmetic coding has essentially replaced Huffman in modern compression systems: it approaches entropy without block coding, and it adapts naturally to non-stationary and context-dependent sources.
Code length bound
The output is a binary fraction identifying a point in the interval of width $P(x^n)$. A binary fraction with $\ell$ bits has precision $2^{-\ell}$. We need $2^{-\ell} \leq P(x^n)$, i.e., $\ell \geq \log_2(1/P(x^n))$. Adding 1 bit to specify which half of the interval ensures the point lies strictly inside the interval: total $\lceil \log_2(1/P(x^n)) \rceil + 1$ bits.
Expected rate
The code length satisfies $\ell(x^n) \leq \log_2(1/P(x^n)) + 2$, so taking expectations, $\mathbb{E}[\ell(X^n)] \leq \mathbb{E}[\log_2(1/P(X^n))] + 2 = nH(X) + 2$, i.e., $\frac{1}{n}\mathbb{E}[\ell(X^n)] \leq H(X) + 2/n$. The lower bound $\frac{1}{n}\mathbb{E}[\ell(X^n)] \geq H(X)$ follows from the source coding converse.
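Spelling out the middle equality, which is where the i.i.d. assumption enters (notation as above):

$$
\mathbb{E}\!\left[\log_2 \frac{1}{P(X^n)}\right]
= \mathbb{E}\!\left[\sum_{i=1}^{n} \log_2 \frac{1}{P(X_i)}\right]
= \sum_{i=1}^{n} \mathbb{E}\!\left[\log_2 \frac{1}{P(X_i)}\right]
= n\,H(X).
$$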
Adaptive Arithmetic Coding
In practice, arithmetic coding is used with adaptive probability models: the distribution $P(x_i \mid x_1, \dots, x_{i-1})$ is updated after each symbol based on the context. This allows the coder to exploit dependencies in the source without knowing the true distribution in advance. The resulting code length is approximately $\sum_{i=1}^{n} \log_2 \frac{1}{P(x_i \mid x_1, \dots, x_{i-1})}$, which approaches the entropy rate for any stationary ergodic source if the model is sufficiently expressive. Modern compression systems use variants of this idea: the range coder in LZMA, CABAC in H.264/HEVC, and the closely related ANS entropy coders in Zstandard.
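A minimal sketch of an adaptive binary model in Python; the Laplace (add-one) update rule, the class name, and the example sequence are illustrative assumptions, since the section does not prescribe a particular estimator:

```python
from math import log2

class AdaptiveBinaryModel:
    """Adaptive probability estimate for a binary source (Laplace / add-one rule)."""

    def __init__(self):
        self.counts = {"0": 1, "1": 1}         # add-one prior: initially P(0) = P(1) = 1/2

    def prob(self, symbol):
        total = self.counts["0"] + self.counts["1"]
        return self.counts[symbol] / total

    def update(self, symbol):
        self.counts[symbol] += 1               # learn from the symbol just coded

def adaptive_code_length(sequence):
    """Ideal adaptive code length: sum of -log2 P(x_i | x_1 ... x_{i-1})."""
    model = AdaptiveBinaryModel()
    bits = 0.0
    for symbol in sequence:
        bits += -log2(model.prob(symbol))      # charge the current estimate
        model.update(symbol)                   # then update it
    return bits

# A heavily skewed source is learned quickly without knowing P in advance:
# adaptive_code_length("0" * 99 + "1")  ->  about 13.3 bits for 100 symbols
```

Feeding these adaptive probabilities into the interval subdivision sketched earlier gives an adaptive arithmetic coder.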
Huffman vs. Arithmetic Coding
| Property | Huffman | Arithmetic |
|---|---|---|
| Optimality | Optimal among symbol-by-symbol prefix codes | Approaches entropy for any blocklength |
| Overhead | Up to 1 bit/symbol | Up to 2 bits/sequence ($2/n$ per symbol) |
| Low-entropy sources | Poor (minimum 1 bit/symbol) | Excellent (fractional bits/symbol) |
| Adaptivity | Requires rebuilding tree | Naturally adaptive with context models |
| Complexity | $O(|\mathcal{X}| \log |\mathcal{X}|)$ construction | $O(n)$ encoding with finite precision |
| Patent status | Unencumbered | Historical patents (now expired) |
Finite-Precision Arithmetic Coding
The theoretical description of arithmetic coding uses arbitrary-precision real numbers, which is impractical. Real implementations use fixed-precision integers (typically 32 or 64 bits) and maintain the interval using integer arithmetic. The key technique is renormalization: when the interval becomes narrow enough that the leading bits are determined, those bits are emitted and the interval is rescaled. This allows streaming encoding and decoding with constant memory, regardless of sequence length. The finite-precision version adds at most 1 extra bit to the ideal code length.
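The renormalization loop is easiest to see in code. The following Python sketch uses 16-bit integer intervals with a static binary model; the constants, the treatment of `'0'` as the lower sub-interval, and the termination rule are illustrative assumptions, and production coders such as CABAC differ in many details:

```python
PRECISION = 16
TOP = 1 << PRECISION              # the interval [low, high] lives inside [0, TOP)
HALF = TOP >> 1
QUARTER = TOP >> 2

def encode(bits_in, p0):
    """Encode a string of '0'/'1' symbols with static P(0) = p0; return the code bits."""
    low, high = 0, TOP - 1
    pending = 0                   # underflow bits whose value is not yet known
    out = []

    def emit(bit):
        nonlocal pending
        out.append(bit)
        out.extend(1 - bit for _ in range(pending))   # flush deferred opposite bits
        pending = 0

    for symbol in bits_in:
        span = high - low + 1
        split = low + int(span * p0) - 1              # '0' gets [low, split], '1' the rest
        if symbol == "0":
            high = split
        else:
            low = split + 1
        # Renormalize: emit leading bits as soon as they are determined.
        while True:
            if high < HALF:                           # interval in lower half -> bit 0
                emit(0)
            elif low >= HALF:                         # interval in upper half -> bit 1
                emit(1)
                low -= HALF
                high -= HALF
            elif low >= QUARTER and high < 3 * QUARTER:
                pending += 1                          # straddles the midpoint: defer a bit
                low -= QUARTER
                high -= QUARTER
            else:
                break
            low = 2 * low                             # rescale back to full precision
            high = 2 * high + 1

    pending += 1                                      # terminate: pin down a point inside
    emit(0 if low < QUARTER else 1)
    return out
```

The `pending` counter is the standard underflow device: when the interval straddles the midpoint, the next determined bit is emitted followed by a run of opposite bits, so the encoder never needs more than the fixed integer precision.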
Quick Check
For a source with $H(X) = 0.01$ bits/symbol (very low entropy), encoding $n = 1000$ symbols, approximately how many bits does arithmetic coding use?
1000 bits (at least 1 bit per symbol)
12 bits
100 bits
0.01 bits
Arithmetic coding uses approximately $nH(X) + 2 = 1000 \times 0.01 + 2 = 12$ bits. This is an 83x improvement over Huffman's minimum of 1000 bits.
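A quick numeric check, using the figures from the question:

```python
n, H = 1000, 0.01
print(round(n * H + 2))   # -> 12 bits; symbol-by-symbol Huffman needs at least 1000
```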
Why This Matters: Arithmetic Coding in Wireless Video
Context-Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding engine in H.264/AVC, HEVC, and VVC, the standards used for video transmission over wireless channels. CABAC adapts its probability models to the local context of transform coefficients, achieving 5-15% better compression than Huffman-based alternatives (CAVLC). In 5G NR, HEVC with CABAC is the baseline for video services. The arithmetic coding framework from this section is exactly what CABAC implements, with context-dependent probability updates. See Book telecom, Ch. 11 for information-theoretic foundations of source coding in communications.
Key Takeaway
Arithmetic coding encodes an entire sequence as a single number in $[0, 1)$, using approximately $\log_2(1/P(x^n))$ bits, the information-theoretic ideal. The per-symbol overhead is $2/n$, vanishing with blocklength. Unlike Huffman, arithmetic coding handles low-entropy sources efficiently and adapts naturally to context-dependent models. It is the foundation of most modern practical compression systems.