Types, Type Classes, and Their Properties
Why the Method of Types?
In Chapter 3 we used typicality to prove Shannon's source and channel coding theorems. The proofs were elegant, but the bounds we obtained were asymptotic — they told us what happens as the blocklength $n \to \infty$, but said nothing about how fast things converge. The method of types is a more refined combinatorial tool that gives us exponential control over probabilities. The key insight is deceptively simple: instead of asking "is this sequence typical?", we ask "what is the exact empirical distribution of this sequence?" This single shift in perspective unlocks error exponents — the rate at which error probabilities decay to zero.
Definition: Type (Empirical Distribution)
Let $x^n = (x_1, x_2, \ldots, x_n)$ be a sequence of length $n$ over a finite alphabet $\mathcal{X}$. The type (or empirical distribution) of $x^n$ is the probability distribution $P_{x^n}$ on $\mathcal{X}$ defined by $$P_{x^n}(a) = \frac{1}{n} \left| \{ i : x_i = a \} \right|, \qquad a \in \mathcal{X}.$$ We write $P_{x^n}(a) = N(a \mid x^n)/n$, where $N(a \mid x^n)$ is the number of occurrences of symbol $a$ in $x^n$.
The type captures everything about a sequence that matters for i.i.d. probability calculations: if two sequences have the same type, they have the same probability under any i.i.d. source.
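To make this concrete, here is a minimal Python sketch (the helper names `empirical_type` and `iid_prob` are my own, not from the text) that computes a type and checks that two sequences with the same type receive the same i.i.d. probability:

```python
from collections import Counter
from math import isclose

def empirical_type(seq, alphabet):
    """Return the type (empirical distribution) of seq as a dict over alphabet."""
    counts = Counter(seq)
    n = len(seq)
    return {a: counts.get(a, 0) / n for a in alphabet}

def iid_prob(seq, P):
    """Probability of seq under an i.i.d. source with per-symbol distribution P."""
    prob = 1.0
    for s in seq:
        prob *= P[s]
    return prob

alphabet = ['a', 'b', 'c']
P = {'a': 0.5, 'b': 0.3, 'c': 0.2}       # an arbitrary i.i.d. source
x = ['a', 'b', 'a', 'c', 'a', 'b']
y = ['b', 'a', 'c', 'a', 'b', 'a']       # a permutation of x: same type

assert empirical_type(x, alphabet) == empirical_type(y, alphabet)
assert isclose(iid_prob(x, P), iid_prob(y, P))   # same type, same probability
print(empirical_type(x, alphabet))               # {'a': 0.5, 'b': 0.333..., 'c': 0.166...}
```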
Definition: Type Class
The type class of a distribution $P$ is the set of all sequences of length $n$ having type $P$: $$T(P) = \{ x^n \in \mathcal{X}^n : P_{x^n} = P \}.$$ The type class is also called the composition class or shell of $P$.
Each sequence in $\mathcal{X}^n$ belongs to exactly one type class, so the type classes partition $\mathcal{X}^n$.
Definition: Set of Types
The set of all possible types with denominator $n$ on alphabet $\mathcal{X}$ is $$\mathcal{P}_n = \{ P_{x^n} : x^n \in \mathcal{X}^n \}.$$ This is a finite set — the number of types is polynomial in $n$.
Theorem: Polynomial Bound on the Number of Types
The number of types with denominator $n$ on an alphabet of size $|\mathcal{X}|$ satisfies $$|\mathcal{P}_n| = \binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1} \leq (n + 1)^{|\mathcal{X}|}.$$
Each type is determined by choosing $|\mathcal{X}|$ non-negative integer counts that sum to $n$. The number of such compositions is $\binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1}$, which is polynomial in $n$ for fixed $|\mathcal{X}|$. The point is that the number of types grows only polynomially in $n$, while the number of sequences $|\mathcal{X}|^n$ grows exponentially. This polynomial-vs-exponential gap is what makes the method of types so powerful.
Stars and bars counting
Each type is determined by the counts $\big( N(a \mid x^n) \big)_{a \in \mathcal{X}}$, where each count is a non-negative integer and the counts sum to $n$. The number of such tuples is $$\binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1}.$$
Upper bound
Since each of the $|\mathcal{X}|$ counts takes one of at most $n + 1$ values in $\{0, 1, \ldots, n\}$, we obtain $|\mathcal{P}_n| \leq (n + 1)^{|\mathcal{X}|}$.
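The counting argument is easy to check by brute force. Below is a small sketch of my own (the helper `enumerate_types` is an assumed name) that lists all types for a small $n$ and compares the count with the stars-and-bars formula and the $(n+1)^{|\mathcal{X}|}$ bound:

```python
from itertools import product
from math import comb

def enumerate_types(n, alphabet_size):
    """All count vectors (n_1, ..., n_k) of non-negative integers summing to n."""
    return [c for c in product(range(n + 1), repeat=alphabet_size) if sum(c) == n]

n, k = 6, 3                      # ternary alphabet, blocklength 6
types = enumerate_types(n, k)
print(len(types))                # 28 distinct types
print(comb(n + k - 1, k - 1))    # stars and bars: C(8, 2) = 28
print((n + 1) ** k)              # polynomial upper bound: 7^3 = 343
print(k ** n)                    # number of sequences: 3^6 = 729, exponential in n
```

Note how 28 and 343 are tiny compared with the $3^6 = 729$ sequences of that length; this is the polynomial-vs-exponential gap in miniature.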
Theorem: Size of the Type Class
For any type $P \in \mathcal{P}_n$: $$\frac{1}{(n + 1)^{|\mathcal{X}|}} \, 2^{nH(P)} \leq |T(P)| \leq 2^{nH(P)},$$ where $H(P)$ is the entropy of the distribution $P$.
The type class has approximately $2^{nH(P)}$ elements — this is a precise version of the AEP. The entropy of the empirical distribution determines the exponential growth rate of the type class, up to a polynomial factor. Intuitively, sequences with "more uniform" types (higher entropy) come from larger type classes.
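A quick numerical sanity check of these bounds for one particular type (a sketch of my own; the helper names are assumptions, not from the text):

```python
from math import factorial, log2, prod

def type_class_size(counts):
    """|T(P)| for a type given by integer symbol counts summing to n (a multinomial coefficient)."""
    n = sum(counts)
    return factorial(n) // prod(factorial(c) for c in counts)

def entropy(P):
    """Shannon entropy in bits of a probability vector P."""
    return -sum(p * log2(p) for p in P if p > 0)

counts = (6, 3, 1)                    # a type with denominator n = 10 on a ternary alphabet
n, k = sum(counts), len(counts)
P = [c / n for c in counts]

size = type_class_size(counts)        # exact count: 10! / (6! 3! 1!) = 840
upper = 2 ** (n * entropy(P))         # 2^{nH(P)} ~ 7937
lower = upper / (n + 1) ** k          # 2^{nH(P)} / (n+1)^{|X|} ~ 5.96
assert lower <= size <= upper
print(size, round(lower, 2), round(upper, 2))
```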
Multinomial counting for the upper bound
The size of $T(P)$ is the multinomial coefficient: $$|T(P)| = \binom{n}{nP(a_1), nP(a_2), \ldots, nP(a_{|\mathcal{X}|})} = \frac{n!}{\prod_{a \in \mathcal{X}} \big( nP(a) \big)!}.$$ By Stirling's approximation, $\log n! = n \log n - n \log e + O(\log n)$, we get $$\log |T(P)| = -n \sum_{a \in \mathcal{X}} P(a) \log P(a) + O(\log n) = nH(P) + O(\log n).$$ Therefore $|T(P)| \doteq 2^{nH(P)}$.
Lower bound from partitioning
Since the type classes partition $\mathcal{X}^n$ and there are at most $(n + 1)^{|\mathcal{X}|}$ types: $$1 = \sum_{Q \in \mathcal{P}_n} P^n(T(Q)) \leq (n + 1)^{|\mathcal{X}|} \max_{Q \in \mathcal{P}_n} P^n(T(Q)).$$ For the specific type $P$, we use that $P^n(x^n) = 2^{-nH(P)}$ is the same for all $x^n \in T(P)$ (under the i.i.d. distribution $P$), and since probabilities sum to at most 1: $$|T(Q)| \, 2^{-n(H(Q) + D(Q \| P))} = P^n(T(Q)) \leq 1$$ for any $Q \in \mathcal{P}_n$. Setting $Q = P$ gives $|T(P)| \leq 2^{nH(P)}$. For the lower bound, one checks that $T(P)$ is the most probable type class under $P^n$; then the fact that the type classes cover $\mathcal{X}^n$ with at most $(n + 1)^{|\mathcal{X}|}$ classes yields $$P^n(T(P)) = |T(P)| \, 2^{-nH(P)} \geq \frac{1}{(n + 1)^{|\mathcal{X}|}}, \qquad \text{i.e.} \qquad |T(P)| \geq \frac{2^{nH(P)}}{(n + 1)^{|\mathcal{X}|}}.$$
Theorem: Probability of a Type Class under an i.i.d. Source
If $X_1, \ldots, X_n$ are drawn i.i.d. according to $P$ on $\mathcal{X}$, then for any type $Q \in \mathcal{P}_n$: $$P^n(T(Q)) \doteq 2^{-nD(Q \| P)}.$$ More precisely: $$\frac{1}{(n + 1)^{|\mathcal{X}|}} \, 2^{-nD(Q \| P)} \leq P^n(T(Q)) \leq 2^{-nD(Q \| P)}.$$
The probability of seeing a type $Q$ when the true distribution is $P$ decays exponentially at rate $D(Q \| P)$. The farther $Q$ is from $P$ in KL divergence, the less likely we are to see that type. When $Q = P$, the KL divergence is zero and the probability is essentially 1 to first order in the exponent ($T(P)$ is the most likely type class). This is the method of types' sharpened version of the law of large numbers.
Probability of a single sequence
For any $x^n \in T(Q)$: $$P^n(x^n) = \prod_{i=1}^{n} P(x_i) = \prod_{a \in \mathcal{X}} P(a)^{nQ(a)} = 2^{-n\left( H(Q) + D(Q \| P) \right)}.$$ This is the same for every sequence in $T(Q)$ — this is the key simplification.
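Since this identity carries the whole proof, a short numerical check may help (a sketch of my own; the distribution and sequence are arbitrary choices):

```python
from math import log2, isclose

def entropy(Q):
    return -sum(q * log2(q) for q in Q if q > 0)

def kl_divergence(Q, P):
    return sum(q * log2(q / p) for q, p in zip(Q, P) if q > 0)

P = [0.5, 0.3, 0.2]                      # i.i.d. source on alphabet {0, 1, 2}
x = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]       # a sequence of type Q = (0.4, 0.4, 0.2)
n = len(x)
Q = [x.count(a) / n for a in range(3)]

direct = 1.0
for symbol in x:                         # P^n(x^n) computed symbol by symbol
    direct *= P[symbol]

via_type = 2 ** (-n * (entropy(Q) + kl_divergence(Q, P)))
assert isclose(direct, via_type, rel_tol=1e-9)
print(direct, via_type)                  # both approximately 2.025e-05
```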
Multiply by type class size
Using the bounds on $|T(Q)|$: $$P^n(T(Q)) = |T(Q)| \cdot 2^{-n(H(Q) + D(Q \| P))}.$$ Substituting $\frac{1}{(n + 1)^{|\mathcal{X}|}} 2^{nH(Q)} \leq |T(Q)| \leq 2^{nH(Q)}$ gives the result. The entropy terms cancel, leaving only $2^{-nD(Q \| P)}$ up to the polynomial factor.
Example: Types of Binary Sequences
Let $\mathcal{X} = \{0, 1\}$ and $n = 4$. Enumerate all types, their type classes, and compute $P^4(T(Q))$ for each type $Q$ when $P = \text{Bernoulli}(1/3)$.
Enumerate types
There are $n + 1 = 5$ types on a binary alphabet: $$\mathcal{P}_4 = \left\{ (1, 0), \left(\tfrac{3}{4}, \tfrac{1}{4}\right), \left(\tfrac{1}{2}, \tfrac{1}{2}\right), \left(\tfrac{1}{4}, \tfrac{3}{4}\right), (0, 1) \right\},$$ where each pair is $(Q(0), Q(1))$.
Type classes
- $Q = (1, 0)$: $T(Q) = \{0000\}$, $|T(Q)| = \binom{4}{0} = 1$
- $Q = (3/4, 1/4)$: $T(Q) = \{0001, 0010, 0100, 1000\}$, $|T(Q)| = \binom{4}{1} = 4$
- $Q = (1/2, 1/2)$: $T(Q) = \{0011, 0101, 0110, 1001, 1010, 1100\}$, $|T(Q)| = \binom{4}{2} = 6$
- $Q = (1/4, 3/4)$: $T(Q) = \{0111, 1011, 1101, 1110\}$, $|T(Q)| = \binom{4}{3} = 4$
- $Q = (0, 1)$: $T(Q) = \{1111\}$, $|T(Q)| = \binom{4}{4} = 1$
Notice $|T(Q)| \doteq 2^{4H(Q)}$ up to polynomial factors: for the uniform type $(1/2, 1/2)$, $2^{4H(Q)} = 16$ while $|T(Q)| = 6$, so this is the loosest bound; for small $n$ the polynomial factor matters.
Probabilities under $P = \text{Bernoulli}(1/3)$
With $P(0) = 2/3$, $P(1) = 1/3$, each type class has probability $P^4(T(Q)) = |T(Q)| \, (2/3)^{4Q(0)} (1/3)^{4Q(1)}$:
- $Q = (1, 0)$: $1 \cdot (2/3)^4 = 16/81 \approx 0.198$
- $Q = (3/4, 1/4)$: $4 \cdot (2/3)^3 (1/3) = 32/81 \approx 0.395$
- $Q = (1/2, 1/2)$: $6 \cdot (2/3)^2 (1/3)^2 = 24/81 \approx 0.296$
- $Q = (1/4, 3/4)$: $4 \cdot (2/3) (1/3)^3 = 8/81 \approx 0.099$
- $Q = (0, 1)$: $1 \cdot (1/3)^4 = 1/81 \approx 0.012$
The most probable type is $(3/4, 1/4)$, which is the closest type to $P = (2/3, 1/3)$. As $n$ grows, the most probable type converges to $P$.
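The whole example can be reproduced in a few lines (a sketch of my own, using Python's `math.comb`), which also confirms that the type-class probabilities sum to 1 and identifies the most probable type:

```python
from math import comb

n, p1 = 4, 1 / 3                     # P = Bernoulli(1/3): P(1) = 1/3, P(0) = 2/3
probs = {}
for k in range(n + 1):               # k = number of ones; the type is (1 - k/n, k/n)
    size = comb(n, k)                # |T(Q)| = C(4, k)
    probs[(1 - k / n, k / n)] = size * (1 - p1) ** (n - k) * p1 ** k

for Q, pr in probs.items():
    print(Q, round(pr, 3))           # 0.198, 0.395, 0.296, 0.099, 0.012
print(sum(probs.values()))           # approximately 1: the type classes partition {0,1}^4
print(max(probs, key=probs.get))     # (0.75, 0.25), the closest type to (2/3, 1/3)
```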
Definition: Joint Type and Conditional Type
For a pair of sequences $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$, the joint type is the empirical joint distribution: $$P_{x^n, y^n}(a, b) = \frac{1}{n} \, N(a, b \mid x^n, y^n), \qquad (a, b) \in \mathcal{X} \times \mathcal{Y},$$ where $N(a, b \mid x^n, y^n)$ is the number of indices $i$ with $(x_i, y_i) = (a, b)$. Given $x^n$ with type $P$, a conditional type is a stochastic matrix $V : \mathcal{X} \to \mathcal{Y}$ such that $P_{x^n, y^n}(a, b) = P(a) V(b \mid a)$. The conditional type class is $$T_V(x^n) = \{ y^n \in \mathcal{Y}^n : P_{x^n, y^n}(a, b) = P(a) V(b \mid a) \text{ for all } (a, b) \}.$$
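As a concrete illustration (a sketch of my own; the helper names and the sample pair of sequences are invented for this purpose), the joint type and conditional type are simple frequency counts:

```python
from collections import Counter

def joint_type(x, y, X_alpha, Y_alpha):
    """Empirical joint distribution of the paired sequence ((x_1, y_1), ..., (x_n, y_n))."""
    n = len(x)
    counts = Counter(zip(x, y))
    return {(a, b): counts.get((a, b), 0) / n for a in X_alpha for b in Y_alpha}

def conditional_type(x, y, X_alpha, Y_alpha):
    """Stochastic matrix V(b|a) = N(a, b | x, y) / N(a | x), for symbols a occurring in x."""
    counts_x = Counter(x)
    counts_xy = Counter(zip(x, y))
    return {a: {b: counts_xy.get((a, b), 0) / counts_x[a] for b in Y_alpha}
            for a in X_alpha if counts_x[a] > 0}

x = [0, 0, 1, 0, 1, 1, 0, 1]
y = ['u', 'v', 'u', 'u', 'v', 'v', 'u', 'u']
Pxy = joint_type(x, y, [0, 1], ['u', 'v'])
V = conditional_type(x, y, [0, 1], ['u', 'v'])
print(Pxy)   # e.g. P_{x,y}(0, 'u') = 3/8
print(V)     # e.g. V('u' | 0) = 3/4, V('u' | 1) = 1/2
# Consistency check: P_{x,y}(a, b) = P_x(a) * V(b | a) for every pair (a, b).
```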
Theorem: Size of the Conditional Type Class
For any $x^n$ of type $P$ and any conditional type $V$: $$\frac{1}{(n + 1)^{|\mathcal{X}||\mathcal{Y}|}} \, 2^{nH(V \mid P)} \leq |T_V(x^n)| \leq 2^{nH(V \mid P)},$$ where $H(V \mid P) = \sum_{a \in \mathcal{X}} P(a) H\big( V(\cdot \mid a) \big)$ is the conditional entropy of $Y$ given $X$ under the joint distribution $P(a) V(b \mid a)$.
Reduction to marginal type classes
For each value $a \in \mathcal{X}$, the subsequence of $y^n$ at the positions $i$ where $x_i = a$ must have type $V(\cdot \mid a)$ over $\mathcal{Y}$. The number of such subsequences is the product over $a$ of the type class sizes for the conditional distributions $V(\cdot \mid a)$ (each of length $nP(a)$), which by the previous theorem gives the exponential rate $\sum_{a} nP(a) H\big( V(\cdot \mid a) \big) = nH(V \mid P)$.
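The product structure in this argument translates directly into a computation. Here is a sketch of my own (assuming a conditional type whose entries have denominators dividing the per-symbol counts) that compares the exact count with the $2^{nH(V \mid P)}$ bound:

```python
from math import comb, log2
from collections import Counter

def cond_type_class_size(x, V, Y_alpha):
    """|T_V(x^n)|: for each symbol a, distribute the n_a positions with x_i = a among the
    output symbols in proportions V(.|a); multiply the resulting multinomial counts."""
    total = 1
    for a, n_a in Counter(x).items():
        remaining, multinomial = n_a, 1
        for b in Y_alpha:
            k_b = round(n_a * V[a][b])        # integer count of output symbol b
            multinomial *= comb(remaining, k_b)
            remaining -= k_b
        total *= multinomial
    return total

x = [0, 0, 0, 0, 1, 1, 1, 1]                  # type P = (1/2, 1/2), n = 8
V = {0: {'u': 0.75, 'v': 0.25},               # a valid conditional type for these counts
     1: {'u': 0.5, 'v': 0.5}}
n = len(x)
H_cond = sum(cnt / n * -sum(p * log2(p) for p in V[a].values() if p > 0)
             for a, cnt in Counter(x).items())

print(cond_type_class_size(x, V, ['u', 'v']))  # C(4,3) * C(4,2) = 24
print(2 ** (n * H_cond))                       # upper bound 2^{nH(V|P)}, about 151.7
```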
Quick Check
For a ternary alphabet ($|\mathcal{X}| = 3$) and $n = 6$, the number of distinct types is at most:
18
28
343
729
The exact count is $\binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1} = \binom{8}{2} = 28$. The looser bound $(n + 1)^{|\mathcal{X}|} = 7^3 = 343$ gives a valid but weaker answer.
Type (empirical distribution)
The relative frequency of each symbol in a sequence. For $x^n \in \mathcal{X}^n$, $P_{x^n}(a) = N(a \mid x^n)/n$. Two sequences with the same type have the same probability under any i.i.d. source.
Related: Type Class
Type class
The set of all sequences in $\mathcal{X}^n$ whose empirical distribution equals $P$. Its size is approximately $2^{nH(P)}$.
Related: Type (Empirical Distribution)
Exponential equality ($\doteq$)
Notation for equality to first order in the exponent: $a_n \doteq b_n$ means $\lim_{n \to \infty} \frac{1}{n} \log \frac{a_n}{b_n} = 0$. Equivalently, the normalized exponents $\frac{1}{n} \log a_n$ and $\frac{1}{n} \log b_n$ converge to the same limit. This ignores polynomial prefactors.
Common Mistake: Types vs. Typical Sequences
Mistake:
Confusing the type class $T(P)$ with the typical set $A_\epsilon^{(n)}$. A student might think "the typical set is just the type class of $P$."
Correction:
The typical set is the union of all type classes $T(Q)$ for which $Q$ is close to $P$. Specifically, the weakly typical set collects the types $Q$ with $|H(Q) + D(Q \| P) - H(P)| \leq \epsilon$ (since $-\frac{1}{n} \log P^n(x^n) = H(Q) + D(Q \| P)$ for every $x^n \in T(Q)$), while the strongly typical set collects the types with $\max_a |Q(a) - P(a)| \leq \epsilon$. The typical set is a coarse-grained object; individual type classes are the fine-grained building blocks.
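To see the correction in action, here is a small sketch of my own (using the strong-typicality criterion $\max_a |Q(a) - P(a)| \le \epsilon$ and brute-force enumeration) that builds the typical set as a union of whole type classes:

```python
from itertools import product

def type_of(seq, alphabet):
    n = len(seq)
    return tuple(seq.count(a) / n for a in alphabet)

alphabet, n = (0, 1), 8
P, eps = (2 / 3, 1 / 3), 0.15

# Group every sequence of length n into its type class.
type_classes = {}
for seq in product(alphabet, repeat=n):
    type_classes.setdefault(type_of(seq, alphabet), []).append(seq)

# Keep exactly those type classes whose type is (strongly) eps-close to P.
typical = [seq for Q, cls in type_classes.items()
           if max(abs(q - p) for q, p in zip(Q, P)) <= eps
           for seq in cls]

print(len(type_classes), len(typical))   # 9 type classes; 84 typical sequences (2 or 3 ones)
```

By construction the typical set here contains whole type classes only, never part of one, which is the point of the correction above.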
Common Mistake: Ignoring Polynomial Factors Too Early
Mistake:
Using the $\doteq$ notation to conclude that, say, $|T(P)| = 2^{nH(P)}$ exactly, then performing non-exponential operations (like summing over polynomially many types) without checking that the polynomial factors cancel.
Correction:
The $\doteq$ notation absorbs polynomial factors, which is fine for exponential-rate arguments. But when you sum over types (a polynomial number), the polynomial factors contribute at most another polynomial, which is still absorbed. The danger arises when you try to extract exact constants (not just exponents) — there, the method of types gives only the exponential rate, and you need refined asymptotics for the prefactor.
Historical Note: Csiszár and Körner: The Hungarian School
1970s–1980s. The method of types was systematically developed by Imre Csiszár and János Körner in their landmark 1981 monograph Information Theory: Coding Theorems for Discrete Memoryless Systems. While the basic counting arguments were known earlier (types appear implicitly in Shannon's 1948 paper), Csiszár and Körner elevated the method into a complete framework capable of proving essentially all known coding theorems for discrete memoryless sources and channels — often with tighter results than the probabilistic (typicality-based) approach. The Hungarian school's contribution was to recognize that the combinatorial structure of types, rather than the probabilistic structure of typical sets, is the more fundamental object. Their approach yields error exponents naturally, whereas typicality only gives achievability at the right rate.
Type Classes: Partitioning the Sequence Space
Key Takeaway
The method of types replaces probabilistic typicality arguments with exact combinatorial counting. The three fundamental facts are: (1) the number of types is polynomial in $n$, (2) each type class has $\doteq 2^{nH(P)}$ elements, and (3) the probability of a type class $T(Q)$ under an i.i.d. source $P$ decays as $\doteq 2^{-nD(Q \| P)}$. Together, these yield exponentially tight bounds on all quantities of interest.