References & Further Reading

References

  1. A. D. Wyner, 'The common information of two dependent random variables,' IEEE Transactions on Information Theory, 1975

    Introduces common information as the minimum rate to render two variables conditionally independent. A fundamental measure of shared structure beyond mutual information.
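
    In symbols, writing X - W - Y for a Markov chain, Wyner's quantity is

        C(X;Y) = \min_{P_{W|XY} \,:\, X \perp Y \mid W} I(X,Y;W),

    which is never smaller than the mutual information I(X;Y).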

  2. R. Ahlswede and J. Körner, 'Source coding with side information and a converse for degraded broadcast channels,' IEEE Transactions on Information Theory, 1975

    Establishes the rate region for source coding with a helper and connects it to broadcast channel capacity.
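
    In standard notation, with X to be reconstructed losslessly and a helper describing Y at rate R_Y, the achievable region is

        R_X \ge H(X \mid U), \qquad R_Y \ge I(Y;U),

    over auxiliary variables U satisfying the Markov chain U - Y - X.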

  3. S. S. Pradhan and K. Ramchandran, 'Distributed source coding using syndromes (DISCUS): Design and construction,' IEEE Transactions on Information Theory, 2003

    The practical breakthrough for Slepian-Wolf coding: the encoder transmits only the syndrome of the source word under a linear channel code, and the decoder recovers the word by combining that syndrome with correlated side information.
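
    The binning idea fits in a few lines. Below is a minimal numpy sketch of the three-bit toy version; the repetition code and the at-most-one-bit-flip correlation model are illustrative, not the paper's trellis construction.

        import numpy as np

        # Parity-check matrix of the (3,1) repetition code {000, 111}.
        # Its four cosets partition {0,1}^3, and the two words in each
        # coset are Hamming distance 3 apart.
        H = np.array([[1, 1, 0],
                      [0, 1, 1]])

        def encode(x):
            # Transmit only the 2-bit syndrome: 2 bits instead of 3.
            return H @ x % 2

        def decode(s, y):
            # Pick the word in the coset labelled by s that is closest to
            # the side information y (assumed to differ from x in <= 1 bit).
            words = (np.arange(8)[:, None] >> np.arange(2, -1, -1)) & 1
            coset = [w for w in words if np.array_equal(H @ w % 2, s)]
            return min(coset, key=lambda w: int(np.sum(w != y)))

        x = np.array([1, 0, 1])   # source word
        y = np.array([1, 1, 1])   # side information, one bit flipped
        assert np.array_equal(decode(encode(x), y), x)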

  4. R. Puri, A. Majumdar, and K. Ramchandran, 'PRISM: A new robust video coding architecture based on distributed compression principles,' 2002

    Practical distributed video coding architecture based on Wyner-Ziv and Slepian-Wolf principles.

  5. A. Aaron and B. Girod, 'Compression with side information using turbo codes,' Proc. IEEE Data Compression Conference, 2002

    Turbo-code-based distributed video coding, complementing the syndrome-based DISCUS approach.

  6. N. Tishby, F. C. Pereira, and W. Bialek, 'The information bottleneck method,' Proc. Allerton Conference on Communication, Control, and Computing, 1999

    Introduces the information bottleneck as a rate-distortion framework for extracting relevant information. A foundational paper connecting information theory to representation learning.
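
    The trade-off is a Lagrangian over stochastic encoders p(t|x):

        \min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y),

    where T is the compressed representation and \beta sets how much relevance to the target Y is kept per bit of compression.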

  7. G. Chechik, A. Globerson, N. Tishby, and Y. Weiss, 'Information bottleneck for Gaussian variables,' Journal of Machine Learning Research, 2005

    Analytical solution of the information bottleneck for jointly Gaussian variables.
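
    In brief, the paper shows the optimal encoder is a noisy linear projection T = AX + \xi, whose rows switch on one at a time, at critical values of \beta, along eigenvectors of \Sigma_{x|y} \Sigma_x^{-1}.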

  8. D. P. Kingma and M. Welling, 'Auto-encoding variational Bayes,' ICLR 2014

    Introduces the variational autoencoder (VAE), whose loss function is a rate-distortion objective with log-loss distortion.
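
    To make the rate-distortion reading concrete, here is a minimal numpy sketch of the two terms of the negative ELBO for a diagonal-Gaussian encoder and a Bernoulli decoder; the function and argument names are illustrative.

        import numpy as np

        def negative_elbo_terms(mu, logvar, x, logits):
            # Rate: KL(q(z|x) || N(0, I)), closed form for a diagonal
            # Gaussian posterior with mean mu and log-variance logvar.
            rate = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
            # Distortion: Bernoulli log-loss of the data x under the
            # decoder's reconstruction logits.
            distortion = np.sum(np.logaddexp(0.0, logits) - x * logits, axis=-1)
            # Negative ELBO = rate + distortion; reweighting the rate by a
            # factor beta traces out a rate-distortion curve (next entry).
            return rate, distortion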

  9. A. A. Alemi, B. Poole, I. Fischer, J. V. Dillon, R. A. Saurous, and K. Murphy, 'Fixing a broken ELBO,' ICML 2018

    Formalizes the rate-distortion interpretation of VAEs and the connection to the information bottleneck.
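
    In their notation, with distortion D = -\mathbb{E}[\log p(x \mid z)] and rate R = \mathbb{E}[\mathrm{KL}(q(z \mid x) \,\|\, p(z))], the negative ELBO is exactly D + R, and the feasible region is bounded by D + R \ge H, the entropy of the data; minimizing D + \beta R for different \beta targets different points on this boundary.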

  10. R. Shwartz-Ziv and N. Tishby, 'Opening the black box of deep neural networks via information,' arXiv preprint, 2017

    Proposes that deep networks undergo a compression phase during training, visible in the information plane. Influential, though subsequent work found the effect depends on the choice of activation function and on how mutual information is estimated.

Further Reading

The following works offer deeper exploration of the connections between source coding, distributed coding, and machine learning.

  • Gács-Körner common information

    P. Gács and J. Körner, 'Common information is far less than mutual information,' Problems of Control and Information Theory, 1973.

    The other notion of common information: it measures the shared randomness that can be extracted separately from X and from Y, whereas Wyner's notion measures the rate needed to generate the pair jointly.
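
    Formally,

        K(X;Y) = \max_{f, g \,:\, f(X) = g(Y) \text{ a.s.}} H(f(X)),

    and the two notions sandwich mutual information: K(X;Y) \le I(X;Y) \le C(X;Y).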

  • Neural image compression

    J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, 'Variational image compression with a scale hyperprior,' ICLR 2018.

    State-of-the-art learned image compression using VAE-like architectures that directly optimize rate-distortion.
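
    Training minimizes a Lagrangian rate-distortion loss of the form

        L = \mathbb{E}\big[-\log_2 p_{\hat{y}}(\hat{y})\big] + \lambda \, \mathbb{E}\big[d(x, \hat{x})\big],

    where the learned hyperprior supplies the entropy model p_{\hat{y}} used to code the latents.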

  • Deep learning and information theory

    A. Zaidi, I. Estella-Aguerri, and S. Shamai, 'On the information plane of autoencoders,' IEEE ISIT 2020.

    Rigorous analysis of the information-theoretic properties of autoencoders and their connection to rate-distortion theory.