The AWGN Channel
Why the Gaussian Channel?
Every wireless, wired, and optical communication system ultimately faces additive noise. The central limit theorem tells us that the aggregate of many small independent disturbances converges to a Gaussian distribution, and this is precisely what we observe in practice: thermal noise in receivers, shot noise in photodetectors, and aggregate interference in dense networks are all well-modeled as Gaussian.
The Gaussian channel is therefore not just a mathematical convenience; it is the canonical model for communication under noise. Its capacity formula, $C = \frac{1}{2}\log_2(1 + \mathrm{SNR})$, is arguably the single most important equation in all of communication theory. It tells every engineer, before writing a single line of code, the absolute limit of what is achievable.
Definition: The Scalar AWGN Channel
The additive white Gaussian noise (AWGN) channel is defined by
$$Y_i = X_i + Z_i, \qquad i = 1, \ldots, n,$$
where:
- $X_i$ is the channel input at time $i$,
- $Z_i \sim \mathcal{N}(0, N)$ are i.i.d. Gaussian noise samples, independent of the input,
- $Y_i$ is the channel output.
The encoder maps a message $W \in \{1, \ldots, 2^{nR}\}$ to a codeword $x^n(W)$ subject to the average power constraint:
$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 \le P.$$
AWGN channel
Additive white Gaussian noise channel: $Y = X + Z$ with $Z \sim \mathcal{N}(0, N)$ i.i.d. and an average power constraint $\mathbb{E}[X^2] \le P$ on the input. The most fundamental continuous-alphabet channel model in information theory.
Related: Signal-to-noise ratio (SNR), Channel capacity
Signal-to-noise ratio (SNR)
The ratio of average signal power to noise power: $\mathrm{SNR} = P/N$. In decibels, $\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10}(P/N)$.
Related: AWGN channel
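To make the definition concrete, here is a minimal Python sketch (the values of $n$, $P$, and $N$ are illustrative assumptions, not from the text) that pushes a Gaussian codeword through the channel and checks the power constraint and the empirical SNR:

```python
import numpy as np

# Minimal sketch of the scalar AWGN channel Y_i = X_i + Z_i.
# The values n, P, N are illustrative assumptions, not from the text.
rng = np.random.default_rng(0)
n, P, N = 100_000, 4.0, 1.0

x = rng.normal(0.0, np.sqrt(P), size=n)   # Gaussian codeword with E[X^2] = P
z = rng.normal(0.0, np.sqrt(N), size=n)   # i.i.d. noise, independent of x
y = x + z                                 # channel output

print(f"average power used: {np.mean(x**2):.3f}  (constraint P = {P})")
print(f"empirical SNR     : {np.mean(x**2) / np.mean(z**2):.3f}  (P/N = {P/N})")
```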
Why an Average Power Constraint?
In practice, transmitters have a finite energy budget (battery, power amplifier limits). The average power constraint $\frac{1}{n}\sum_{i} x_i^2 \le P$ caps the average energy per symbol. We could also impose a peak power constraint $x_i^2 \le P_{\text{peak}}$ for all $i$, but the average constraint is more natural for information-theoretic analysis and yields cleaner results. The peak constraint leads to a harder optimization: the capacity-achieving input distribution becomes discrete (Smith, 1971) rather than Gaussian.
Theorem: Capacity of the Scalar AWGN Channel
The capacity of the AWGN channel with $Z_i \sim \mathcal{N}(0, N)$ and average power constraint $P$ is
$$C = \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right) \ \text{bits per channel use},$$
where $\mathrm{SNR} = P/N$. The capacity-achieving input distribution is $X \sim \mathcal{N}(0, P)$.
The formula says that, at high SNR, each doubling of the SNR buys you roughly half an extra bit per real channel use (one extra bit per 6 dB). Intuitively, the Gaussian input spreads energy as "evenly" as possible across the signal space: any other distribution with the same power produces less entropy at the output, hence less mutual information.
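A quick numerical check of this rule of thumb, as a sketch with arbitrarily chosen SNR values:

```python
import math

def awgn_capacity_real(snr: float) -> float:
    """Capacity of the real scalar AWGN channel, in bits per channel use."""
    return 0.5 * math.log2(1 + snr)

for snr in [0.1, 1.0, 10.0, 100.0, 1000.0]:
    gain = awgn_capacity_real(2 * snr) - awgn_capacity_real(snr)
    print(f"SNR = {snr:7.1f}: doubling the SNR adds {gain:.3f} bits")
# The added capacity approaches 0.5 bits per doubling as the SNR grows.
```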
Achievability: upper bound on $h(Y)$
We compute $I(X; Y) = h(Y) - h(Y \mid X)$ and maximize over all input distributions with $\mathbb{E}[X^2] \le P$.
Since $Z$ is independent of $X$, we have $h(Y \mid X) = h(X + Z \mid X) = h(Z) = \frac{1}{2}\log_2(2\pi e N)$.
For the output entropy, note that $\mathbb{E}[Y^2] = \mathbb{E}[X^2] + N \le P + N$. By the Gaussian entropy maximizer (Theorem 2.X), we have $h(Y) \le \frac{1}{2}\log_2\big(2\pi e (P + N)\big)$, with equality if and only if $Y$ is Gaussian, which happens when $X \sim \mathcal{N}(0, P)$.
Achievability: capacity expression
Combining the two terms:
$$I(X; Y) = h(Y) - h(Z) \le \frac{1}{2}\log_2\big(2\pi e (P + N)\big) - \frac{1}{2}\log_2(2\pi e N) = \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right).$$
The maximum is achieved by $X \sim \mathcal{N}(0, P)$, confirming that the Gaussian input is optimal.
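The optimality of the Gaussian input can also be checked numerically. The following sketch (with $P = N = 1$, the quadrature grid, and the uniform comparison input all assumed choices of mine) computes $I(X;Y) = h(Y) - h(Z)$ for a Gaussian input and for a uniform input of the same power:

```python
import numpy as np
from scipy.stats import norm

# Sketch: compare I(X;Y) = h(Y) - h(Z) for a Gaussian input and a uniform
# input of the same power. P = N = 1 and the grid are assumed choices.
P, N = 1.0, 1.0
y = np.linspace(-12.0, 12.0, 20001)
dy = y[1] - y[0]

def h_bits(p):
    """Differential entropy in bits, by quadrature on the grid."""
    p = np.clip(p, 1e-300, None)          # avoid log(0); tail terms vanish
    return float(-(p * np.log2(p)).sum() * dy)

h_Z = 0.5 * np.log2(2 * np.pi * np.e * N)

# Gaussian input X ~ N(0, P): the output is exactly N(0, P + N).
I_gauss = h_bits(norm.pdf(y, scale=np.sqrt(P + N))) - h_Z

# Uniform input on [-a, a] with variance P (a = sqrt(3P)): the output
# density is the uniform density convolved with the noise density.
a = np.sqrt(3 * P)
p_unif = (norm.cdf((y + a) / np.sqrt(N)) - norm.cdf((y - a) / np.sqrt(N))) / (2 * a)
I_unif = h_bits(p_unif) - h_Z

print(f"capacity           : {0.5 * np.log2(1 + P / N):.4f} bits")
print(f"I(X;Y), Gaussian X : {I_gauss:.4f} bits")
print(f"I(X;Y), uniform X  : {I_unif:.4f} bits")
```

The uniform input falls measurably short of capacity, as the entropy-maximization argument predicts.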
Converse: Fano's inequality approach
For any code with vanishing error probability, Fano's inequality gives
$$nR \le I(X^n; Y^n) + n\epsilon_n,$$
where $\epsilon_n \to 0$ as $n \to \infty$. Since the channel is memoryless:
$$I(X^n; Y^n) = h(Y^n) - h(Y^n \mid X^n) = h(Y^n) - h(Z^n) = h(Y^n) - \frac{n}{2}\log_2(2\pi e N).$$
Converse: bounding $h(Y^n)$
By the independence bound and the Gaussian maximizer:
$$h(Y^n) \le \sum_{i=1}^{n} h(Y_i) \le \sum_{i=1}^{n} \frac{1}{2}\log_2\big(2\pi e (P_i + N)\big), \qquad P_i := \mathbb{E}[X_i^2].$$
By the concavity of $\log$ and the power constraint $\frac{1}{n}\sum_i P_i \le P$, Jensen's inequality yields
$$\frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\log_2\!\left(1 + \frac{P_i}{N}\right) \le \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right).$$
Dividing by $n$ and letting $n \to \infty$: $R \le \frac{1}{2}\log_2(1 + P/N)$.
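The Jensen step can be illustrated numerically; in this sketch the per-symbol powers $P_i$ are drawn from an arbitrary assumed distribution:

```python
import numpy as np

# Sketch of the Jensen step: for nonnegative per-symbol powers P_i with
# average P, the mean of (1/2)log2(1 + P_i/N) cannot exceed the rate at
# the average power. The power draw below is an arbitrary assumption.
rng = np.random.default_rng(0)
N = 1.0
P_i = rng.exponential(scale=2.0, size=10_000)
P = P_i.mean()

avg_rate = np.mean(0.5 * np.log2(1 + P_i / N))
bound = 0.5 * np.log2(1 + P / N)
print(f"average of per-symbol rates: {avg_rate:.4f} bits")
print(f"rate at the average power  : {bound:.4f} bits (larger, by concavity)")
```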
Key Takeaway
The AWGN capacity $C = \frac{1}{2}\log_2(1 + \mathrm{SNR})$ is the single most important formula in communication theory. The Gaussian input is optimal because it maximizes the output entropy under a power constraint: a direct consequence of the entropy maximization property of the Gaussian distribution.
Historical Note: Shannon's 1948 Paper and the Gaussian Channel
Shannon derived the Gaussian channel capacity in his landmark 1948 paper "A Mathematical Theory of Communication." What is remarkable is that Shannon not only gave the formula but also proved both achievability (via random coding with Gaussian codebooks) and the converse β all in the same paper that invented the field.
The result was initially met with skepticism: how could one transmit reliably at any positive rate over a noisy channel? The key insight was that coding over long blocks concentrates the noise around a thin shell, and the number of distinguishable signal spheres grows exponentially with the block length. It took nearly 50 years for practical codes (turbo codes, LDPC codes) to approach Shannon's limit within a fraction of a dB.
The Sphere-Packing Picture
There is a beautiful geometric interpretation of the AWGN capacity. Consider transmission of $n$ symbols:
- The codeword has energy $\sum_{i} x_i^2 \le nP$, so all codewords lie in a sphere of radius $\sqrt{nP}$ in $\mathbb{R}^n$.
- The noise vector $Z^n$ concentrates (with high probability) on a thin shell of radius $\sqrt{nN}$.
- The received vector $Y^n$ lies (with high probability) in a sphere of radius $\sqrt{n(P + N)}$.
For reliable decoding, the "noise spheres" centered at different codewords must not overlap. The number of non-overlapping noise spheres that fit is
$$M \approx \frac{\big(\sqrt{n(P+N)}\big)^{n}}{\big(\sqrt{nN}\big)^{n}} = \left(1 + \frac{P}{N}\right)^{n/2}.$$
Taking $\frac{1}{n}\log_2$ of this count gives exactly $C = \frac{1}{2}\log_2\left(1 + \frac{P}{N}\right)$.
Figure: Sphere-Packing Interpretation of AWGN Capacity
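A short sketch (parameters assumed) confirming the concentration facts behind this picture:

```python
import numpy as np

# Sketch (n, P, N assumed): the noise and received vectors concentrate on
# shells of radius sqrt(nN) and sqrt(n(P+N)), the geometry behind the count.
rng = np.random.default_rng(1)
n, P, N = 100_000, 4.0, 1.0

x = rng.normal(0.0, np.sqrt(P), n)
z = rng.normal(0.0, np.sqrt(N), n)
y = x + z

print(f"||Z||/sqrt(n) = {np.linalg.norm(z)/np.sqrt(n):.4f}  vs sqrt(N)   = {np.sqrt(N):.4f}")
print(f"||Y||/sqrt(n) = {np.linalg.norm(y)/np.sqrt(n):.4f}  vs sqrt(P+N) = {np.sqrt(P+N):.4f}")

# Exponent of the sphere count: (1/n) log2 (1 + P/N)^(n/2) = capacity.
print(f"(1/n) log2 M  = {0.5 * np.log2(1 + P / N):.4f} bits per channel use")
```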
Definition: The Complex AWGN Channel
The complex AWGN channel models passband communication via complex baseband:
$$Y = hX + Z, \qquad Z \sim \mathcal{CN}(0, N_0),$$
where $h \in \mathbb{C}$ is the (known, deterministic) channel gain and the power constraint is $\mathbb{E}[|X|^2] \le P$.
The capacity is
$$C = \log_2\!\left(1 + \frac{|h|^2 P}{N_0}\right) \ \text{bits per complex channel use},$$
where $\mathrm{SNR} = |h|^2 P / N_0$.
The factor-of-two difference from the real case ($\log_2$ vs. $\frac{1}{2}\log_2$) arises because each complex symbol carries two real dimensions.
In wireless communications, the standard convention uses the complex model with $h = 1$ (the gain absorbed into the SNR). The capacity in bits/s is $C = B \log_2(1 + \mathrm{SNR})$, where $B$ is the bandwidth in Hz.
Example: Computing AWGN Capacity
A wireless link operates at $\mathrm{SNR} = 20$ dB with bandwidth $B = 10$ MHz. What is the maximum achievable data rate?
Convert SNR to linear
$\mathrm{SNR} = 10^{20/10} = 10^{2} = 100$.
Compute capacity per symbol
$C = \log_2(1 + 100) = \log_2 101 \approx 6.66$ bits per complex channel use.
Compute capacity in bits/s
The symbol rate equals the bandwidth for Nyquist signaling, so
$$R = B \cdot C = 10^{7} \times 6.66 \approx 66.6 \ \text{Mbit/s}.$$
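The same computation as a small Python helper (the 20 dB and 10 MHz figures are the example's assumed values):

```python
import math

# The worked example as a helper function; 20 dB and 10 MHz are the
# example's assumed values.
def awgn_rate_bps(snr_db: float, bandwidth_hz: float) -> float:
    """Shannon limit of a complex AWGN channel, in bits per second."""
    snr_linear = 10 ** (snr_db / 10)      # convert dB to linear first
    return bandwidth_hz * math.log2(1 + snr_linear)

print(f"{awgn_rate_bps(20, 10e6) / 1e6:.1f} Mbit/s")   # about 66.6 Mbit/s
```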
Example: Required SNR for a Target Rate
What minimum $\mathrm{SNR}$ (in dB) is needed to achieve a spectral efficiency of 4 bits/s/Hz on a complex AWGN channel?
Set up the equation
We need $\log_2(1 + \mathrm{SNR}) = 4$, so $\mathrm{SNR} = 2^4 - 1 = 15$.
Convert to dB
$\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10} 15 \approx 11.76$ dB.
The point is that 4 bits/s/Hz requires roughly 12 dB of SNR, a useful rule of thumb for link budget calculations.
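The inverse calculation, as a small sketch:

```python
import math

def required_snr_db(spectral_efficiency: float) -> float:
    """Minimum SNR (dB) for a target rate in bits/s/Hz, complex AWGN."""
    return 10 * math.log10(2 ** spectral_efficiency - 1)

print(f"{required_snr_db(4):.2f} dB")    # about 11.76 dB
```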
AWGN Channel Capacity vs. SNR
Explore how the AWGN capacity grows logarithmically with $\mathrm{SNR}$. At low SNR, capacity grows approximately linearly; at high SNR, each 3 dB increase adds roughly 1 bit/s/Hz.
Quick Check
For an AWGN channel with $\mathrm{SNR} = 15$ (linear, not dB), what is the capacity in bits per real channel use?
$1.95$ bits
$2$ bits
$3.91$ bits
$4$ bits
The real AWGN capacity is $C = \frac{1}{2}\log_2(1 + 15) = \frac{1}{2}\log_2 16 = 2$ bits per real channel use. The factor $\frac{1}{2}$ distinguishes the real channel from the complex channel.
Common Mistake: Real vs. Complex AWGN: The Factor of Two
Mistake:
Using $C = \log_2(1 + \mathrm{SNR})$ for a real-valued AWGN channel (or $C = \frac{1}{2}\log_2(1 + \mathrm{SNR})$ for a complex channel).
Correction:
Real AWGN: $C = \frac{1}{2}\log_2(1 + \mathrm{SNR})$ bits per real symbol. Complex AWGN: $C = \log_2(1 + \mathrm{SNR})$ bits per complex symbol. The complex channel has two real dimensions, hence the factor of two. When computing bits/s, both give $B \log_2(1 + \mathrm{SNR})$ because the complex symbol rate is half the real sample rate.
Common Mistake: SNR in dB vs. Linear
Mistake:
Plugging the dB value directly into the capacity formula, e.g., computing $\frac{1}{2}\log_2(1 + 20)$ when $\mathrm{SNR} = 20$ dB.
Correction:
Always convert to linear scale first: $\mathrm{SNR} = 10^{\mathrm{SNR}_{\mathrm{dB}}/10} = 10^{20/10} = 100$. Then $C = \frac{1}{2}\log_2(1 + 100) \approx 3.33$ bits (real), very different from $\frac{1}{2}\log_2(1 + 20) \approx 2.20$ bits.
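A two-line demonstration of the pitfall (20 dB assumed, matching the correction above):

```python
import math

# The pitfall in two lines (20 dB assumed, matching the correction above).
snr_db = 20.0
snr_linear = 10 ** (snr_db / 10)                    # = 100

wrong = 0.5 * math.log2(1 + snr_db)                 # dB plugged in directly
right = 0.5 * math.log2(1 + snr_linear)
print(f"wrong: {wrong:.2f} bits, right: {right:.2f} bits")   # 2.20 vs 3.33
```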
Theorem: MMSE Lower Bound via Differential Entropy
For jointly distributed continuous random variables $X$ and $Y$, the minimum mean-square error (MMSE) satisfies
$$\mathbb{E}\big[(X - \hat{X}(Y))^2\big] \ge \frac{1}{2\pi e}\, e^{2 h(X \mid Y)},$$
where $\hat{X}(Y) = \mathbb{E}[X \mid Y]$ is the optimal estimator and entropy is measured in nats. Equality holds if and only if $X$ given $Y$ is Gaussian with a conditional variance that does not depend on $Y$.
This theorem connects two seemingly different worlds: estimation theory (MMSE) and information theory (entropy). It says that the conditional entropy places a fundamental lower bound on how well you can estimate $X$ from $Y$. The Gaussian case is the "hardest to estimate" for a given entropy: any other conditional distribution with the same entropy is easier to estimate.
Best estimator is conditional mean
By standard estimation theory, $\mathbb{E}\big[(X - g(Y))^2\big] \ge \mathbb{E}\big[(X - \mathbb{E}[X \mid Y])^2\big]$ for any estimator $g(Y)$.
Iterated expectation
The MMSE equals $\mathbb{E}_Y\big[\mathrm{Var}(X \mid Y)\big]$, by the law of iterated expectation.
Apply the Gaussian entropy maximizer conditionally
For each $y$, the Gaussian with variance $\mathrm{Var}(X \mid Y = y)$ maximizes entropy. Therefore $h(X \mid Y = y) \le \frac{1}{2}\ln\big(2\pi e \, \mathrm{Var}(X \mid Y = y)\big)$, which gives $\mathrm{Var}(X \mid Y = y) \ge \frac{1}{2\pi e}\, e^{2 h(X \mid Y = y)}$.
Average over $Y$ and apply Jensen's inequality
Taking expectations and using the convexity of $t \mapsto e^{t}$:
$$\mathbb{E}_Y\big[\mathrm{Var}(X \mid Y)\big] \ge \frac{1}{2\pi e}\, \mathbb{E}_Y\big[e^{2 h(X \mid Y = y)}\big] \ge \frac{1}{2\pi e}\, e^{2 \mathbb{E}_Y[h(X \mid Y = y)]} = \frac{1}{2\pi e}\, e^{2 h(X \mid Y)}.$$
Example: Verifying the MMSE Bound for Jointly Gaussian Variables
Let $(X, Y)$ be jointly Gaussian with zero mean, variances $\sigma_X^2$ and $\sigma_Y^2$, and correlation coefficient $\rho$. Verify that the MMSE lower bound holds with equality.
Compute the MMSE
For jointly Gaussian variables, the MMSE estimator is linear: $\hat{X}(Y) = \rho \frac{\sigma_X}{\sigma_Y} Y$. The MMSE is $\sigma_X^2 (1 - \rho^2)$.
Compute the conditional entropy
$X \mid Y = y \sim \mathcal{N}\big(\rho \frac{\sigma_X}{\sigma_Y} y,\ \sigma_X^2 (1 - \rho^2)\big)$, so $h(X \mid Y) = \frac{1}{2}\ln\big(2\pi e\, \sigma_X^2 (1 - \rho^2)\big)$ and $\frac{1}{2\pi e}\, e^{2 h(X \mid Y)} = \sigma_X^2 (1 - \rho^2)$.
Verify equality
Equality holds because $X$ given $Y$ is Gaussian with conditional variance $\sigma_X^2 (1 - \rho^2)$, which does not depend on $y$.
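A Monte Carlo check of this example; the specific values $\sigma_X = 1$, $\sigma_Y = 2$, $\rho = 0.8$ are illustrative assumptions:

```python
import numpy as np

# Monte Carlo check of the example; sigma_X = 1, sigma_Y = 2, rho = 0.8
# are illustrative assumptions.
rng = np.random.default_rng(42)
sx, sy, rho, n = 1.0, 2.0, 0.8, 1_000_000

y = rng.normal(0.0, sy, n)
x = rho * (sx / sy) * y + rng.normal(0.0, sx * np.sqrt(1 - rho**2), n)

x_hat = rho * (sx / sy) * y                         # MMSE estimator E[X|Y]
mmse_emp = np.mean((x - x_hat) ** 2)

h_cond = 0.5 * np.log(2 * np.pi * np.e * sx**2 * (1 - rho**2))   # in nats
bound = np.exp(2 * h_cond) / (2 * np.pi * np.e)

print(f"empirical MMSE: {mmse_emp:.4f}")            # ~0.36 = sx^2 (1 - rho^2)
print(f"entropy bound : {bound:.4f}")               # equal: the bound is tight
```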
Why This Matters: AWGN Capacity and Spectral Efficiency in 5G NR
The AWGN capacity formula is the benchmark against which every practical modulation and coding scheme is measured. In 5G NR, adaptive modulation and coding (AMC) selects the highest-rate modulation-coding scheme (MCS) that the current $\mathrm{SNR}$ can support. The Shannon limit tells us how close each MCS comes to the theoretical maximum. Modern LDPC and polar codes in 5G NR operate within 1 to 2 dB of the AWGN capacity for long block lengths.
See the telecom book, Ch. 14, for the detailed treatment of AMC and link adaptation, and Ch. 12 for the coding schemes that approach this limit.
Quick Check
If we double the transmit power (keeping noise fixed), by how much does the AWGN capacity increase?
It exactly doubles
It increases by exactly $\frac{1}{2}$ bit per real channel use
It increases by roughly $\frac{1}{2}$ bit per real channel use, but only when $\mathrm{SNR} \gg 1$
It does not change
At high SNR, where $\mathrm{SNR} \gg 1$, $C \approx \frac{1}{2}\log_2 \mathrm{SNR}$, so doubling $P$ adds $\frac{1}{2}\log_2 2 = \frac{1}{2}$ bit. At low SNR, $C \approx \mathrm{SNR}/(2 \ln 2)$ grows linearly, so doubling the power roughly doubles the capacity: the relative gain is larger. The logarithmic growth of capacity with power is a fundamental feature: power is an expensive resource for buying rate.