References & Further Reading

References

  1. P. Billingsley, Probability and Measure, Wiley, 3rd ed., 1995

    The standard measure-theoretic probability reference. Chapter 4 covers conditional probability and independence with full rigor.

  2. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I, Wiley, 3rd ed., 1968

    The classical two-volume reference. Chapter V covers conditional probability and Bayes' theorem with extensive examples. Chapter VI derives the binomial, geometric, and negative binomial distributions.

  3. J. Pitman, Probability, Springer, 1993

    Excellent undergraduate text with clear geometric intuition for conditional probability and independence. Chapter 4 is directly relevant.

  4. T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, 2nd ed., 2006

    Chapter 2 introduces conditional entropy and mutual information, building directly on the conditional probability framework of this chapter.

  5. T. Bayes, An Essay towards Solving a Problem in the Doctrine of Chances, Philosophical Transactions of the Royal Society of London, 1763

    The original posthumous paper by Thomas Bayes, edited for publication by Richard Price. It poses the inverse probability problem and proves the theorem that now bears his name.

  6. J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988

    The foundational text on Bayesian networks and belief propagation. Chapters 1–2 introduce conditional independence and graphical models. Essential background for Book FSI's treatment of factor graphs.

  7. F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001

    The paper that unified belief propagation, the Viterbi algorithm, and the BCJR algorithm under the factor graph framework. The connection to conditional independence in Section 2.4 is made explicit here.

  8. J. G. Proakis and M. Salehi, Digital Communications, McGraw-Hill, 5th ed., 2008

    Chapter 8 applies the binomial model to block code error probabilities. Chapter 5 covers MAP and ML detection in AWGN — Bayes' theorem applied.

  9. A. A. Markov, Extension of the Law of Large Numbers to Dependent Variables, 1906

    The paper that introduced Markov chains, constructed as a counterexample to the claim that the law of large numbers requires independence.

  10. J. O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer, 2nd ed., 1985

    Comprehensive treatment of Bayesian decision theory. Chapters 1–4 develop the prior-posterior framework rigorously. Connects directly to the MAP detection theory in Book FSI.

  11. A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, Bayesian Data Analysis, CRC Press, 3rd ed., 2013

    The standard reference for applied Bayesian inference. Chapter 1 gives a thorough discussion of the prior-to-posterior update and the role of Bayes' theorem in practical inference.
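The binomial error model mentioned under Proakis and Salehi (reference 8) is easy to make concrete. As a minimal sketch (the function name and the Golay-code parameters below are illustrative, not taken from that text): a code that corrects up to $t$ bit errors in a block of $n$ bits fails exactly when more than $t$ errors occur, which is a binomial tail probability.

```python
from math import comb

def block_error_prob(n: int, t: int, p: float) -> float:
    """P(block error) for a length-n code correcting up to t errors
    on a binary symmetric channel with bit-error probability p:
    the binomial probability of more than t errors among n bits."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(t + 1, n + 1))

# Illustrative: a length-23, t = 3 code (Golay parameters) at p = 0.01
print(block_error_prob(23, 3, 0.01))
```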

Further Reading

Resources for deeper exploration of the topics in Chapter 2.

  • Conditional probability given a zero-probability event

    P. Billingsley, *Probability and Measure*, 3rd ed., Wiley 1995, §33 (regular conditional distributions).

    The definition $\mathbb{P}(A \mid B) = \mathbb{P}(A \cap B)/\mathbb{P}(B)$ breaks down when $\mathbb{P}(B) = 0$. Regular conditional distributions handle this correctly and are essential for Chapter 12.

  • Bayesian networks and d-separation

    J. Pearl, *Probabilistic Reasoning in Intelligent Systems*, Morgan Kaufmann 1988, Ch. 3.

    Formalizes conditional independence through graph structure. The d-separation criterion provides an efficient algorithm for reading off all conditional independence relationships from a Bayesian network.

  • The Monty Hall problem — deep analysis

    J. P. Morgan, N. R. Chaganty, R. C. Dahiya, and M. J. Doviak, 'Let's make a deal: the player's dilemma,' *The American Statistician*, 1991.

    Shows that the standard Monty Hall answer depends critically on the assumption that the host chooses uniformly at random. A careful Bayesian analysis reveals the correct answer under general host strategies.

  • Second Borel-Cantelli lemma and independence

    W. Feller, *An Introduction to Probability Theory*, Vol. I, 3rd ed., Wiley 1968, §IV.6.

    The second Borel-Cantelli lemma, unlike the first, requires independence; Feller's treatment places it in the context of the theory developed in Chapters 1 and 2.

  • Binomial distribution: normal and Poisson approximations

    W. Feller, *An Introduction to Probability Theory*, Vol. I, 3rd ed., Wiley 1968, §VII.3-VII.4.

    Precise asymptotic results: the de Moivre-Laplace theorem (binomial to Gaussian) and the Poisson approximation theorem, both with explicit error bounds.
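To make the zero-probability issue in the first bullet concrete: for jointly continuous $(X, Y)$ the event $\{X = x\}$ has probability zero, so the ratio definition is vacuous, yet conditioning on the thickened event $\{x \le X \le x + \varepsilon\}$ and letting $\varepsilon \downarrow 0$ recovers the familiar conditional density (a sketch of the elementary limit argument, not Billingsley's measure-theoretic construction):

```latex
\mathbb{P}(Y \in A \mid x \le X \le x + \varepsilon)
  = \frac{\int_A \int_x^{x+\varepsilon} f_{X,Y}(u, y)\, \mathrm{d}u\, \mathrm{d}y}
         {\int_x^{x+\varepsilon} f_X(u)\, \mathrm{d}u}
  \;\xrightarrow[\varepsilon \downarrow 0]{}\;
  \int_A \frac{f_{X,Y}(x, y)}{f_X(x)}\, \mathrm{d}y
  \qquad (f_X(x) > 0).
```

Regular conditional distributions make this limit construction rigorous and consistent across all conditioning events simultaneously.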
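The host-strategy sensitivity highlighted by Morgan et al. is easy to check by simulation. A minimal sketch under the standard assumptions (prize placed uniformly, host opens a non-prize, non-chosen door uniformly at random); the function name and trial count are illustrative:

```python
import random

def monty_hall(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the player's win probability under the standard
    uniform-host assumption, with or without switching."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = 0  # by symmetry, the player's initial pick is fixed
        # Host opens a door that holds no prize and was not picked.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(monty_hall(switch=True))   # close to 2/3
print(monty_hall(switch=False))  # close to 1/3
```

Under a non-uniform host strategy (e.g. a host who always opens the lowest-numbered available door), the conditional probability given *which* door was opened changes, which is exactly the paper's point.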
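The two approximation regimes in the last bullet can be checked numerically. A small sketch (helper names are illustrative): the Poisson approximation applies when $n$ is large and $p$ small with $np$ moderate, while de Moivre-Laplace compares the binomial pmf near its mean to a Gaussian density for fixed $p$:

```python
from math import comb, exp, factorial, sqrt, pi

def binom_pmf(n: int, p: float, k: int) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(lam: float, k: int) -> float:
    return exp(-lam) * lam**k / factorial(k)

def normal_density(mu: float, var: float, x: float) -> float:
    return exp(-(x - mu)**2 / (2 * var)) / sqrt(2 * pi * var)

# Poisson regime: n large, p small, np = 5
n, p, k = 1000, 0.005, 5
print(binom_pmf(n, p, k), poisson_pmf(n * p, k))

# de Moivre-Laplace regime: n large, p fixed; compare at the mean
n, p, k = 1000, 0.4, 400
print(binom_pmf(n, p, k), normal_density(n * p, n * p * (1 - p), k))
```

In each regime the two printed values agree to a couple of decimal places; Feller's §VII gives the error bounds that quantify this.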