References & Further Reading

References

  1. T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, 2nd ed., 2006

    Chapters 5 (data compression) and 13 (universal source coding) cover the material in this chapter. Our notation follows Cover & Thomas closely.

  2. D. A. Huffman, 'A Method for the Construction of Minimum-Redundancy Codes,' Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, Sep. 1952

    The original paper describing the Huffman algorithm. Written as a term paper for Robert Fano's information theory course at MIT: Fano had posed the problem of finding optimal codes, and Huffman solved it as a student.
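The greedy merge at the heart of the algorithm fits in a few lines. A minimal sketch (not from the paper; the symbols and probabilities are made up) that repeatedly combines the two least-probable subtrees and tracks each symbol's codeword length:

```python
import heapq
import itertools

def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a dict of symbol probabilities."""
    tie = itertools.count()                 # tie-breaker: heapq must never compare dicts
    heap = [(p, next(tie), {s: 0}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)     # two least-probable subtrees
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

lengths = huffman_code_lengths({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
assert lengths == {"a": 1, "b": 2, "c": 3, "d": 3}
assert sum(2 ** -l for l in lengths.values()) == 1.0   # Kraft holds with equality
```

For dyadic probabilities, as here, the resulting lengths meet the entropy bound exactly.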

  3. J. Rissanen, 'Generalized Kraft Inequality and Arithmetic Coding,' IBM Journal of Research and Development, vol. 20, no. 3, pp. 198–203, May 1976

    The foundational paper on arithmetic coding as a practical compression method. Builds on earlier ideas by Elias (1975, unpublished) and Shannon.

  4. J. Ziv and A. Lempel, 'A Universal Algorithm for Sequential Data Compression,' IEEE Trans. Inform. Theory, vol. 23, no. 3, pp. 337–343, May 1977

    The LZ77 (sliding-window) algorithm. The basis for DEFLATE (used by gzip) and most practical compression tools in use today.
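A toy version of the sliding-window idea (the token format and window size are simplifications for illustration, not DEFLATE's): each token copies `length` characters from `offset` positions back, then appends one literal character.

```python
def lz77_compress(data, window=16):
    """Parse `data` into (offset, length, next_char) tokens."""
    i, tokens = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        for off in range(1, min(i, window) + 1):      # candidate back-references
            length = 0
            while (i + length < len(data) - 1
                   and data[i - off + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        tokens.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens

def lz77_decompress(tokens):
    out = []
    for off, length, ch in tokens:
        for _ in range(length):
            out.append(out[-off])                     # sequential copy handles overlap
        out.append(ch)
    return "".join(out)

text = "abracadabra abracadabra"
assert lz77_decompress(lz77_compress(text)) == text
```

Note that a match may run past its own starting point (length greater than offset); the decoder's character-by-character copy reproduces such overlapping repeats correctly.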

  5. J. Ziv and A. Lempel, 'Compression of Individual Sequences via Variable-Rate Coding,' IEEE Trans. Inform. Theory, vol. 24, no. 5, pp. 530–536, Sep. 1978

    The LZ78 (dictionary-based) algorithm. Proves universality for individual sequences under the finite-state compressibility framework.
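The incremental-dictionary parse can be sketched as follows (simplified token format, not production code). The input is split into phrases, each a previously seen phrase plus one new character; since the decoder rebuilds the same dictionary, only (phrase index, character) pairs need be sent.

```python
def lz78_compress(data):
    dictionary = {"": 0}           # phrase -> index; index 0 is the empty phrase
    tokens, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch           # keep extending the current phrase
        else:
            tokens.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:                     # flush a trailing phrase with no new character
        tokens.append((dictionary[phrase], None))
    return tokens

def lz78_decompress(tokens):
    phrases, out = [""], []
    for idx, ch in tokens:
        p = phrases[idx] + (ch or "")
        out.append(p)
        phrases.append(p)          # decoder grows the same dictionary
    return "".join(out)

text = "aaabbabaabaaabab"
assert lz78_decompress(lz78_compress(text)) == text
```

The number of distinct phrases in this parse is exactly the quantity that Ziv and Lempel's finite-state compressibility bound is stated in terms of.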

  6. F. M. J. Willems, Y. M. Shtarkov, and T. J. Tjalkens, 'The Context-Tree Weighting Method: Basic Properties,' IEEE Trans. Inform. Theory, vol. 41, no. 3, pp. 653–664, May 1995

    Introduces context-tree weighting (CTW), which achieves near-optimal redundancy for tree sources. It converges faster than LZ methods on Markov-like sources.
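The predictor at each node of the context tree is the Krichevsky–Trofimov (KT) estimator; a sketch of that building block alone (not of the full tree mixing): after seeing a zeros and b ones, the next bit is predicted to be 1 with probability (b + 1/2) / (a + b + 1).

```python
from math import log2

def kt_code_length(bits):
    """Ideal code length in bits of a binary string under the KT estimator."""
    a = b = 0                      # counts of zeros and ones seen so far
    total = 0.0
    for x in bits:
        p_one = (b + 0.5) / (a + b + 1)
        total += -log2(p_one if x else 1 - p_one)
        if x:
            b += 1
        else:
            a += 1
    return total

# A constant source costs only ~ (1/2) log2(n) bits of redundancy,
# while an alternating source costs close to 1 bit per symbol.
assert kt_code_length([0] * 100) < 10
assert kt_code_length([0, 1] * 50) > 90
```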

Further Reading

For readers who want to go deeper into source coding theory and practice.

  • Arithmetic coding implementation details

    I. H. Witten, R. M. Neal, and J. G. Cleary, 'Arithmetic Coding for Data Compression,' Communications of the ACM, vol. 30, no. 6, pp. 520–540, Jun. 1987

    The definitive practical guide to implementing arithmetic coding with finite precision. Explains the renormalization trick that makes the algorithm work in constant memory.
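The interval recursion itself fits in a few lines if one uses exact rationals and defers the finite-precision renormalization the paper addresses (model probabilities here are made up):

```python
from fractions import Fraction

MODEL = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

def cumulative(model):
    """Map each symbol to its half-open subinterval of [0, 1)."""
    lo, cum = Fraction(0), {}
    for s, p in model.items():
        cum[s] = (lo, lo + p)
        lo += p
    return cum

def encode(symbols, model=MODEL):
    """Narrow [low, low + width) once per symbol."""
    cum = cumulative(model)
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        c_lo, c_hi = cum[s]
        low, width = low + width * c_lo, width * (c_hi - c_lo)
    return low, width              # transmit any number inside the final interval

def decode(point, n, model=MODEL):
    cum = cumulative(model)
    out = []
    for _ in range(n):
        for s, (c_lo, c_hi) in cum.items():
            if c_lo <= point < c_hi:
                out.append(s)
                point = (point - c_lo) / (c_hi - c_lo)   # rescale and recurse
                break
    return "".join(out)

low, width = encode("abcab")
assert width == Fraction(1, 256)   # width = P(message); code length ~ -log2(width)
assert decode(low, 5) == "abcab"
```

A practical coder cannot carry exact fractions, which is precisely why the paper's incremental output and renormalization machinery matters.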

  • Modern compression: ANS and Zstandard

    J. Duda, 'Asymmetric Numeral Systems: Entropy Coding Combining Speed of Huffman Coding with Compression Rate of Arithmetic Coding,' arXiv:1311.2540, 2013

    Asymmetric numeral systems (ANS) achieve arithmetic-coding compression rates with Huffman-like speed. Used in Zstandard (Facebook), Apple LZFSE, and JPEG XL.
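The core rANS state transition is short enough to sketch with Python's unbounded integers, omitting the streaming renormalization of practical coders (the frequencies here are made up). Encoding pushes symbols onto a single integer state; decoding pops them back in reverse order.

```python
FREQ = {"a": 2, "b": 1, "c": 1}            # quantized frequencies, total M = 4
M = sum(FREQ.values())
CUM, _lo = {}, 0
for _s, _f in FREQ.items():                # cumulative frequency of each symbol
    CUM[_s] = _lo
    _lo += _f

def rans_encode(symbols, state=1):
    for s in symbols:
        f = FREQ[s]
        state = (state // f) * M + CUM[s] + state % f
    return state

def rans_decode(state, n):
    out = []
    for _ in range(n):
        slot = state % M                   # low bits identify the symbol
        s = next(sym for sym, c in CUM.items() if c <= slot < c + FREQ[sym])
        state = FREQ[s] * (state // M) + slot - CUM[s]
        out.append(s)
    return "".join(reversed(out)), state   # rANS is last-in, first-out

text = "abacab"
decoded, final = rans_decode(rans_encode(text), len(text))
assert decoded == text and final == 1
```

Each step grows the state by roughly -log2(f/M) bits, which is how ANS matches arithmetic-coding rates while using only integer multiply, divide, and modulo.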

  • Redundancy of universal codes

    N. Merhav and M. Feder, 'Universal Prediction,' IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2124–2147, Oct. 1998

    Detailed analysis of the redundancy (excess rate above entropy) of universal codes including minimax and Bayesian approaches. Connects to MDL and statistical learning.

  • Source coding with side information

    D. Slepian and J. K. Wolf, 'Noiseless Coding of Correlated Information Sources,' IEEE Trans. Inform. Theory, vol. 19, no. 4, pp. 471–480, Jul. 1973

    The foundational paper on distributed lossless compression, covered in Chapter 7 of this book. Shows that two correlated sources can be compressed separately at a total rate approaching their joint entropy.
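A quick numeric check of why separate encoding can help (the joint distribution is made up): whenever the sources are correlated, the achievable sum rate H(X, Y) is strictly below the naive H(X) + H(Y).

```python
from math import log2

def H(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Correlated bits: X uniform, Y = X with probability 0.9
joint = {(0, 0): 0.45, (1, 1): 0.45, (0, 1): 0.05, (1, 0): 0.05}
h_x = H([0.5, 0.5])                 # marginal entropy of X: 1 bit
h_y = H([0.5, 0.5])                 # marginal entropy of Y: 1 bit
h_xy = H(list(joint.values()))      # joint entropy, about 1.47 bits

# Slepian-Wolf: separate encoders need a total rate of only H(X, Y)
assert h_xy < h_x + h_y
```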