Ferkans — Interactive Telecom Tutor

Why Information-Theoretic Secrecy?

In every channel model we have studied so far, the goal was to maximize the rate of reliable communication. We never asked: can someone else, listening to the transmission, learn the message? In cryptography, this problem is solved by encryption — the message is scrambled with a shared secret key before transmission. But what if the transmitter and receiver do not share a secret key?

Information-theoretic secrecy takes a fundamentally different approach: it exploits the physical properties of the communication channel itself to provide confidentiality. If the eavesdropper's channel is noisier than the legitimate receiver's, we can transmit at a rate that the receiver can decode but the eavesdropper cannot — and we can prove this mathematically, without relying on computational hardness assumptions.

The point is that the noise in the channel, which we have been fighting against throughout this book, now becomes our ally.

Definition: The Wiretap Channel

The wiretap channel consists of a transmitter, a legitimate receiver, and an eavesdropper. The transmitter sends $X^n$ over a discrete memoryless channel. The legitimate receiver observes $Y^n$ through the main channel $P_{Y|X}$ , and the eavesdropper observes $Z^n$ through the wiretap channel $P_{Z|X}$ .

The channel is degraded if $X \to Y \to Z$ forms a Markov chain, meaning the eavesdropper sees a noisier version of what the legitimate receiver sees.

A $(2^{nR}, n)$ wiretap code consists of:

Stochastic encoder: a (possibly randomized) mapping $f: \{1, \ldots, 2^{nR}\} \to \mathcal{X}^n$
Decoder: a mapping $g: \mathcal{Y}^n \to \{1, \ldots, 2^{nR}\}$

The code must satisfy two requirements:

Reliability: $P_e^{(n)} = \Pr[\hat{M} \neq M] \to 0$
Secrecy: $\frac{1}{n}I(M; Z^n) \to 0$ (weak secrecy) or $I(M; Z^n) \to 0$ (strong secrecy)

The encoder is stochastic — for each message $M$ , it randomly selects among multiple codewords. This randomization is the key mechanism for confusing the eavesdropper. It is reminiscent of random binning in Slepian–Wolf coding, and indeed the achievability proof uses the same technique.

Wiretap channel

A channel model with three parties: a transmitter, a legitimate receiver, and an eavesdropper. The goal is to communicate reliably to the receiver while keeping the message secret from the eavesdropper.

Secrecy capacity

The maximum rate at which reliable and secret communication is possible over a wiretap channel. Denoted $C_s$ .

Related: Wiretap channel

Equivocation rate

The normalized conditional entropy $\frac{1}{n}H(M | Z^n)$ , measuring the eavesdropper's uncertainty about the message after observing the wiretap channel output. Perfect secrecy requires the equivocation rate to approach the message rate.

Definition: Weak and Strong Secrecy

Two notions of secrecy are commonly used:

Weak secrecy: $\frac{1}{n}I(M; Z^n) \to 0$ as $n \to \infty$ . The per-symbol information leakage vanishes.
Strong secrecy: $I(M; Z^n) \to 0$ as $n \to \infty$ . The total information leakage vanishes.

Strong secrecy is strictly stronger than weak secrecy. The difference matters because under weak secrecy, the total leakage $I(M; Z^n)$ can grow sublinearly with $n$ — the eavesdropper may learn a vanishing fraction but a growing number of bits. Under strong secrecy, even the total leakage vanishes.
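The gap between the two notions can be made concrete with a small numerical sketch. It assumes a purely hypothetical leakage law $I(M; Z^n) = \sqrt{n}$ bits, chosen only for illustration (it is not derived from any particular code); the function name `leakage_profiles` is likewise an assumption.

```python
import math

def leakage_profiles(n):
    """Return (total, per_symbol) leakage under a hypothetical sqrt(n) law.

    Assumption for illustration only: I(M; Z^n) = sqrt(n) bits.
    """
    total = math.sqrt(n)        # total leakage I(M; Z^n), grows without bound
    per_symbol = total / n      # normalized leakage (1/n) I(M; Z^n), vanishes
    return total, per_symbol

for n in (10**2, 10**4, 10**6):
    total, per_symbol = leakage_profiles(n)
    print(f"n={n:>8}: total = {total:8.1f} bits, per-symbol = {per_symbol:.6f}")
```

Under this assumed law the weak secrecy criterion is met, since the per-symbol leakage tends to zero, while the strong secrecy criterion fails because the total leakage keeps growing.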

Remarkably, the secrecy capacity is the same under both notions: the achievable rate does not change when we strengthen the secrecy requirement from weak to strong.

Theorem: Secrecy Capacity of the Degraded Wiretap Channel

For a degraded wiretap channel $X \to Y \to Z$ , the secrecy capacity is $C_s = \max_{P_X} \bigl[I(X; Y) - I(X; Z)\bigr].$

More generally, for a non-degraded wiretap channel, the secrecy capacity is $C_s = \max_{P_{V,X}} \bigl[I(V; Y) - I(V; Z)\bigr]$ , where $V \to X \to (Y, Z)$ forms a Markov chain and $|\mathcal{V}| \leq |\mathcal{X}|$ .

The secrecy capacity is the difference between the mutual information to the legitimate receiver and the mutual information to the eavesdropper. Intuitively, we can communicate secretly at a rate equal to the "advantage" the main channel has over the wiretap channel. If the eavesdropper's channel is better than the legitimate receiver's ( $I(X;Z) > I(X;Y)$ for all $P_X$ ), then $C_s = 0$ — no secret communication is possible.

The achievability uses a beautiful combination of random coding and random binning. We generate a codebook with $2^{n(I(X;Y) - \epsilon)}$ codewords, partition them into $2^{nR}$ bins (one per message), and for each message, randomly select a codeword from its bin. The legitimate receiver can decode (the codewords are distinguishable through the main channel), but the eavesdropper cannot determine which bin the codeword belongs to because the randomization within each bin creates confusion.

Proof

Achievability — Codebook generation

Fix $P_X$ and a target rate $R < I(X;Y) - I(X;Z)$ .

Generate $2^{n\tilde{R}}$ codewords $x^n(m, l)$ i.i.d. $\sim \prod_{i=1}^n P_X(x_i)$ , indexed by message $m \in \{1, \ldots, 2^{nR}\}$ and randomization index $l \in \{1, \ldots, 2^{n\tilde{R}_l}\}$ , where $\tilde{R} = R + \tilde{R}_l$ and $\tilde{R}_l = I(X; Z) + \epsilon$ .

Achievability — Encoding

To send message $m$ , the encoder selects $l$ uniformly at random from $\{1, \ldots, 2^{n\tilde{R}_l}\}$ and transmits $x^n(m, l)$ .

The key idea: each message $m$ has $2^{n\tilde{R}_l}$ possible codewords. The randomization index $l$ is unknown to both the receiver and the eavesdropper.
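The codebook generation and encoding steps can be sketched as a toy program. This is a minimal illustration, assuming a binary input alphabet with $P_X = \text{Bern}(1/2)$ and tiny, hand-picked codebook sizes instead of the exponential sizes $2^{nR}$ and $2^{n\tilde{R}_l}$; the names `make_codebook` and `stochastic_encode` are hypothetical.

```python
import random

def make_codebook(num_messages, num_randomizers, n, rng):
    """Draw one i.i.d. Bernoulli(1/2) codeword x^n(m, l) per (m, l) pair."""
    return {
        (m, l): tuple(rng.randint(0, 1) for _ in range(n))
        for m in range(num_messages)
        for l in range(num_randomizers)
    }

def stochastic_encode(codebook, m, num_randomizers, rng):
    """Pick the randomization index l uniformly, then return (l, x^n(m, l))."""
    l = rng.randrange(num_randomizers)
    return l, codebook[(m, l)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
codebook = make_codebook(num_messages=4, num_randomizers=8, n=16, rng=rng)

# Sending the same message twice usually produces different codewords.
for _ in range(2):
    l, x = stochastic_encode(codebook, m=2, num_randomizers=8, rng=rng)
    print(f"m=2, l={l}, x^n={''.join(map(str, x))}")
```

The dictionary keyed by $(m, l)$ mirrors the double indexing in the proof: the first index carries the message and the second index carries only local randomness.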

Achievability — Decoding and reliability

The legitimate receiver decodes the pair $(m, l)$ using joint typicality decoding. Since $R < I(X;Y) - I(X;Z)$ , choosing $\epsilon$ small enough gives $\tilde{R} = R + \tilde{R}_l = R + I(X;Z) + \epsilon < I(X;Y)$ , so the receiver can reliably recover $(m, l)$ .
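The rate bookkeeping behind the reliability condition can be checked numerically. The sketch below assumes illustrative mutual-information values in bits and introduces a hypothetical parameter `gap` for the slack between $R$ and $I(X;Y) - I(X;Z)$; neither the numbers nor the function name `rate_split` come from the text.

```python
def rate_split(i_xy, i_xz, eps, gap):
    """Split the rates as in the codebook construction.

    R_l     = I(X;Z) + eps            (randomization rate)
    R       = I(X;Y) - I(X;Z) - gap   (message rate, gap > 0 is an assumption)
    R_tilde = R + R_l                 (total codebook rate)
    """
    r_l = i_xz + eps
    r = i_xy - i_xz - gap
    r_tilde = r + r_l
    decodable = r_tilde < i_xy  # equivalent to eps < gap
    return r, r_l, r_tilde, decodable

# Illustrative values only: I(X;Y) = 0.5 bits, I(X;Z) = 0.2 bits.
print(rate_split(i_xy=0.5, i_xz=0.2, eps=0.01, gap=0.05))  # eps < gap: decodable
print(rate_split(i_xy=0.5, i_xz=0.2, eps=0.10, gap=0.05))  # eps > gap: not decodable
```

The check makes the role of $\epsilon$ explicit: the total rate stays below $I(X;Y)$ exactly when $\epsilon$ is smaller than the slack left in the message rate.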

Achievability — Secrecy analysis

For the eavesdropper, the effective rate of the randomization is $\tilde{R}_l = I(X;Z) + \epsilon$ . Since $\tilde{R}_l > I(X;Z)$ , the eavesdropper cannot distinguish the different randomization indices $l$ for a given message $m$ . By the packing lemma and standard equivocation analysis: $\frac{1}{n}H(M | Z^n) \geq R - \epsilon_n$ where $\epsilon_n \to 0$ . This gives weak secrecy. Strong secrecy follows from a more refined analysis using the soft covering lemma.

Converse

For any sequence of $(2^{nR}, n)$ codes with $P_e^{(n)} \to 0$ and $\frac{1}{n}I(M; Z^n) \to 0$ :

$nR = H(M) = I(M; Y^n) + H(M|Y^n)$
$\leq I(M; Y^n) + n\epsilon_n \quad \text{(Fano)}$
$= I(M; Y^n) - I(M; Z^n) + I(M; Z^n) + n\epsilon_n$
$\leq I(M; Y^n) - I(M; Z^n) + n\delta_n + n\epsilon_n$

where $\delta_n \to 0$ by the secrecy constraint. The first difference term is bounded using the degradedness and the chain rule to yield $\sum_{i=1}^n [I(X_i; Y_i) - I(X_i; Z_i)] \leq n \max_{P_X}[I(X;Y) - I(X;Z)]$ . Dividing by $n$ and taking $n \to \infty$ gives $R \leq C_s$ . $\blacksquare$


Random Binning: From Source Coding to Secrecy

Notice the deep connection between the achievability proof here and the Slepian–Wolf proof in Chapter 7. In Slepian–Wolf coding, random binning compresses correlated sources by assigning multiple source sequences to the same bin. In the wiretap channel, random binning creates secrecy by assigning multiple codewords to the same message.

This is the same technique — random binning — used for a completely different purpose. In source coding, binning exploits decoder side information to reduce rate. In secrecy coding, binning exploits the eavesdropper's noise to hide information. The unifying mathematical structure is what makes information theory so powerful.

Historical Note: Wyner's Wiretap Channel (1975)

Aaron Wyner introduced the wiretap channel model in 1975, motivated by the question of whether the physical layer of a communication system could provide secrecy guarantees. His key insight was that if the eavesdropper's channel is a degraded version of the main channel, the noise difference can be exploited for secrecy — without any shared secret key.

Wyner's result was initially viewed as a theoretical curiosity: in wired communication, ensuring that the eavesdropper has a worse channel is unrealistic. But in wireless communication, where the channel to different receivers is inherently different due to path loss, fading, and antenna geometry, the wiretap model became highly relevant. The explosion of interest in physical-layer security after 2005 is a direct consequence of the wireless revolution validating Wyner's original insight.

Example: BSC Wiretap Channel

The main channel is a BSC( $p$ ) and the wiretap channel is a BSC( $q$ ) with $0 \leq p < q \leq 1/2$ (the eavesdropper's channel is noisier). Compute the secrecy capacity.

Solution

Verify degradedness

The BSC( $q$ ) is a degraded version of the BSC( $p$ ) when $q = p * q'$ (convolution) for some $q'$ , where $a * b = a(1-b) + b(1-a)$ . For $p < q \leq 1/2$ , we can write $q = p * \frac{q-p}{1-2p}$ , so the channel is degraded.
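The degradedness check can be verified numerically. The following is a small sketch with hypothetical function names and example values $p = 0.1$, $q = 0.3$ chosen for illustration: it computes the crossover probability $q'$ of the extra BSC and confirms that cascading BSC( $p$ ) with BSC( $q'$ ) gives BSC( $q$ ).

```python
def bsc_convolve(a, b):
    """Crossover probability of two cascaded BSCs: a * b = a(1-b) + b(1-a)."""
    return a * (1 - b) + b * (1 - a)

def degrading_crossover(p, q):
    """Return q' such that p * q' = q, valid for 0 <= p <= q <= 1/2 and p < 1/2."""
    if not (0 <= p < 0.5 and p <= q <= 0.5):
        raise ValueError("need 0 <= p <= q <= 1/2 and p < 1/2")
    return (q - p) / (1 - 2 * p)

p, q = 0.1, 0.3                      # example values, not from the text
q_prime = degrading_crossover(p, q)  # (0.3 - 0.1) / (1 - 0.2) = 0.25
print(q_prime, bsc_convolve(p, q_prime))
```

Because $q'$ lies in $[0, 1/2]$ whenever $p \leq q \leq 1/2$, the extra channel is itself a valid BSC, which is what makes $Z$ a degraded version of $Y$.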

Compute mutual informations

For the BSC with input $X \sim \text{Bern}(1/2)$ (which maximizes each mutual information and, by symmetry and degradedness, also their difference): $I(X; Y) = 1 - h(p), \quad I(X; Z) = 1 - h(q).$

Secrecy capacity

$C_s = \max_{P_X}[I(X;Y) - I(X;Z)] = [1 - h(p)] - [1 - h(q)] = h(q) - h(p).$

The secrecy capacity is the difference in binary entropy functions. When $p = 0$ (noiseless main channel), $C_s = h(q)$. When $p = q$ (identical channels), $C_s = 0$.
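The closed form $h(q) - h(p)$ can be cross-checked against a brute-force search over input distributions. This is a sketch under stated assumptions: rates are in bits (base-2 logarithms), the input is $X \sim \text{Bern}(a)$, the grid resolution and the example values $p = 0.1$, $q = 0.3$ are arbitrary, and all function names are hypothetical.

```python
import math

def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def conv(a, b):
    """Binary convolution a * b = a(1-b) + b(1-a)."""
    return a * (1 - b) + b * (1 - a)

def secrecy_rate(a, p, q):
    """I(X;Y) - I(X;Z) for X ~ Bern(a), main BSC(p), wiretap BSC(q)."""
    i_xy = h2(conv(a, p)) - h2(p)   # I(X;Y) = H(Y) - H(Y|X)
    i_xz = h2(conv(a, q)) - h2(q)   # I(X;Z) = H(Z) - H(Z|X)
    return i_xy - i_xz

def bsc_secrecy_capacity(p, q):
    """Closed form h(q) - h(p) for 0 <= p <= q <= 1/2."""
    return h2(q) - h2(p)

p, q = 0.1, 0.3
grid = [k / 1000 for k in range(1001)]
best_a = max(grid, key=lambda a: secrecy_rate(a, p, q))
print(best_a, secrecy_rate(best_a, p, q), bsc_secrecy_capacity(p, q))
```

The grid search is only a sanity check on the maximization over $P_X$; it does not replace the symmetry argument for why the uniform input is optimal.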

Example: Gaussian Wiretap Channel

The main channel is $Y = X + N_Y$ with $N_Y \sim \mathcal{N}(0, \sigma^2_{Y})$ and the wiretap channel is $Z = X + N_Z$ with $N_Z \sim \mathcal{N}(0, \sigma^2_{Z})$ , where $\sigma^2_{Z} > \sigma^2_{Y}$ (eavesdropper is noisier). The transmitter has power constraint $P$ . Compute the secrecy capacity.

Solution

Mutual informations with Gaussian input

With $X \sim \mathcal{N}(0, P)$ : $I(X; Y) = \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2_{Y}}\right), \quad I(X; Z) = \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2_{Z}}\right).$

Secrecy capacity

The Gaussian distribution maximizes $I(X;Y) - I(X;Z)$ under the power constraint (this follows from the maximum entropy property and the degradedness of the Gaussian channel). Therefore: $C_s = \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2_{Y}}\right) - \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2_{Z}}\right) = \frac{1}{2}\log\frac{1 + P/\sigma^2_{Y}}{1 + P/\sigma^2_{Z}}.$

At high SNR, $C_s \approx \frac{1}{2}\log(\sigma^2_{Z} / \sigma^2_{Y})$ , which is determined by the noise ratio and independent of the transmit power. This is a fundamental difference from the non-secrecy capacity, which grows without bound with power.
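The saturation can be seen by evaluating the formula over a range of SNRs. The sketch below assumes base-2 logarithms (rates in bits per channel use), normalizes $\sigma^2_{Y} = 1$, and uses an example noise ratio of 4; the function names are hypothetical.

```python
import math

def gaussian_secrecy_capacity(power, noise_main, noise_eve):
    """C_s in bits per channel use; zero when the eavesdropper is not noisier."""
    if noise_eve <= noise_main:
        return 0.0
    return 0.5 * math.log2((1 + power / noise_main) / (1 + power / noise_eve))

def high_snr_limit(noise_main, noise_eve):
    """Limit of C_s as the power grows: (1/2) log2(noise_eve / noise_main)."""
    return 0.5 * math.log2(noise_eve / noise_main)

noise_main, noise_eve = 1.0, 4.0   # example values: noise ratio of 4
for snr_db in (0, 10, 20, 30):
    power = noise_main * 10 ** (snr_db / 10)
    c_s = gaussian_secrecy_capacity(power, noise_main, noise_eve)
    c_main = 0.5 * math.log2(1 + power / noise_main)  # ordinary capacity
    print(f"SNR={snr_db:>2} dB: C_s={c_s:.4f}, C={c_main:.4f}")
print("high-SNR limit:", high_snr_limit(noise_main, noise_eve))
```

With a noise ratio of 4 the limit is $\frac{1}{2}\log_2 4 = 1$ bit per channel use, which the secrecy capacity approaches from below while the ordinary capacity keeps growing.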

Interpretation

The secrecy capacity saturates at high SNR: throwing more power at the problem does not help beyond a certain point. This is because, at high SNR, extra power raises $I(X;Y)$ and $I(X;Z)$ by nearly the same amount, so the gain cancels in the difference. The secrecy advantage then comes entirely from the noise ratio $\sigma^2_{Z} / \sigma^2_{Y}$ , not from the absolute power level.

Common Mistake: Zero Secrecy Capacity Means No Security

Mistake:

Assuming that if $C_s = 0$ (e.g., when the eavesdropper has a better channel), it is impossible to achieve any form of secrecy.

Correction:

When $C_s = 0$ , information-theoretic secrecy at a positive rate is indeed impossible. However, computational secrecy (via encryption with a pre-shared key) still works regardless of the channel quality. Information-theoretic secrecy is a stronger guarantee but requires a channel advantage. The two approaches are complementary, not competing.

Common Mistake: Confusing Weak and Strong Secrecy

Mistake:

Treating weak secrecy ( $\frac{1}{n}I(M; Z^n) \to 0$ ) as sufficient for practical applications.

Correction:

Under weak secrecy, the total information leakage $I(M; Z^n)$ can grow as $o(n)$ — the eavesdropper may eventually learn a sublinear but unbounded number of bits. For practical security, strong secrecy ( $I(M; Z^n) \to 0$ ) or even semantic secrecy (for all message distributions) is needed. Fortunately, the secrecy capacity is the same under all three notions.

Gaussian Wiretap Channel: Secrecy Capacity

Explore how the secrecy capacity of the Gaussian wiretap channel depends on the SNR, the noise ratio $\sigma^2_{Z}/\sigma^2_{Y}$ , and compare it with the non-secrecy capacity. Notice how secrecy capacity saturates at high SNR while the regular capacity grows without bound.

Parameters

Max SNR (dB): 30 (maximum SNR for the x-axis)
$\sigma_Z^2 / \sigma_Y^2$ : 4 (ratio of eavesdropper noise to main channel noise)

Quick Check

The secrecy capacity of the degraded wiretap channel is $C_s = \max_{P_X}[I(X;Y) - I(X;Z)]$ . What happens when the eavesdropper's channel is better than the main channel?

$C_s = 0$ — no positive secrecy rate is achievable

The secrecy capacity is negative

We can still achieve secrecy by using more power

Correction:

$C_s = 0$: no positive secrecy rate is achievable

When $I(X;Z) \geq I(X;Y)$ for all $P_X$ , the maximum of $I(X;Y) - I(X;Z)$ is non-positive. Since $C_s \geq 0$ by convention (we can always achieve rate 0), $C_s = 0$ .

Quick Check

What is the role of the stochastic encoder in the wiretap channel?

It confuses the eavesdropper by randomly choosing among multiple codewords for each message

It improves reliability at the legitimate receiver

It implements a key agreement protocol

Correction:

It confuses the eavesdropper by randomly choosing among multiple codewords for each message

For each message $M$ , the encoder randomly selects one of $2^{n\tilde{R}_l}$ codewords. Because of this randomization, the eavesdropper typically sees a different codeword each time the same message is sent, so the wiretap channel output reveals asymptotically nothing about the message.

🔧Engineering Note

Physical-Layer Security in 5G NR

5G NR does not currently use physical-layer security (PLS) for data confidentiality — all security relies on cryptographic protocols (AES-128/256 at the PDCP layer). However, PLS techniques based on the wiretap channel model are being investigated for several use cases:

Key generation: Channel reciprocity in TDD systems allows Alice and Bob to extract a shared secret key from the channel state, which Eve cannot replicate due to spatial decorrelation.
Artificial noise for anti-eavesdropping: Massive MIMO beamforming can direct information to the intended receiver while flooding the eavesdropper with noise.
Authentication: Physical-layer features (channel response, hardware impairments) can authenticate devices without cryptographic overhead.

The information-theoretic results in this chapter provide the fundamental limits for all these applications.

Practical Constraints

• 5G NR currently relies on AES-128/256 encryption at the PDCP layer
• PLS requires CSI at the transmitter, which may not be available in FDD systems
• Practical PLS schemes must handle imperfect CSI and active attackers

Key Takeaway

The wiretap channel achieves secrecy by exploiting the eavesdropper's channel disadvantage. The secrecy capacity $C_s = \max_{P_X}[I(X;Y) - I(X;Z)]$ is the information-theoretic limit on secret communication without shared keys. The achievability proof uses stochastic encoding (random binning), the same technique that appears in Slepian–Wolf coding — a unifying theme in multiuser information theory.

Why This Matters: Physical-Layer Security in Wireless Networks

The wiretap channel model maps directly to wireless communication: the transmitter (Alice) sends to a legitimate receiver (Bob) while an eavesdropper (Eve) listens. In wireless channels, the channel gains to different receivers are inherently different due to path loss, fading, and spatial separation. This natural channel variation provides the "advantage" needed for information-theoretic secrecy.

Massive MIMO amplifies this advantage dramatically: with many antennas, the transmitter can beamform the signal to Bob while creating a null in Eve's direction, achieving very high secrecy rates. See Book telecom, Chapter 17 for the MIMO beamforming foundations.

The Wiretap Channel: Stochastic Encoding for Secrecy

Animates the wiretap channel model with Alice, Bob, and Eve. Shows how stochastic encoding (multiple codewords per message) confuses the eavesdropper while allowing reliable decoding at the legitimate receiver.
