The Finite Blocklength Regime

Beyond the Shannon Limit: Why Blocklength Matters

Shannon's channel coding theorem guarantees that rates arbitrarily close to capacity $C$ are achievable with vanishing error probability --- provided the blocklength $n \to \infty$. In classical broadband communications with large packets ($n \gtrsim 10^4$), this asymptotic result is an excellent approximation.

However, URLLC services in 5G NR demand end-to-end latencies of 1 ms or less, which restricts the blocklength to $n \approx 100$--$500$ channel uses. At such short blocklengths, the Shannon limit is overly optimistic: there is a significant penalty for operating at finite $n$. This section develops the precise characterisation of that penalty through the normal approximation of Polyanskiy, Poor, and Verdú (2010).

Definition: Information Density

Consider a memoryless channel with transition kernel $P_{Y|X}$. For input $X \sim P_X$ and induced output $Y \sim P_Y$, the information density is the random variable

$$i(X; Y) = \log \frac{P_{Y|X}(Y | X)}{P_Y(Y)},$$

where the logarithm is base 2 (so the units are bits). Its expectation is the mutual information:

$$I(X; Y) = \mathbb{E}[i(X; Y)].$$

The information density captures the per-sample information content, which fluctuates from one channel use to the next.
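To see these fluctuations concretely, the following minimal sketch samples the information density of a binary symmetric channel (BSC) with uniform input; the crossover probability $p = 0.11$ and the variable names are illustrative choices, not from the text above. The sample mean converges to the mutual information, and the sample variance to the channel dispersion defined next.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.11                       # illustrative BSC crossover probability
num_samples = 1_000_000

# For a BSC with uniform input, P_Y is uniform, so the information density
# takes only two values: 1 + log2(p) on a bit flip, 1 + log2(1 - p) otherwise.
flips = rng.random(num_samples) < p
i_density = np.where(flips, 1 + np.log2(p), 1 + np.log2(1 - p))

capacity = 1 + p * np.log2(p) + (1 - p) * np.log2(1 - p)   # C = 1 - H2(p)
dispersion = p * (1 - p) * np.log2((1 - p) / p) ** 2       # V for the BSC

print(f"sample mean {i_density.mean():.4f} vs C = {capacity:.4f} bits")
print(f"sample var  {i_density.var():.4f} vs V = {dispersion:.4f} bits^2")
```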

Definition: Channel Dispersion

The channel dispersion of a memoryless channel, evaluated at the capacity-achieving input distribution $P_X^*$, is the variance of the information density:

$$V = \mathrm{Var}\bigl[i(X; Y)\bigr] \Big|_{P_X = P_X^*}.$$

For the real AWGN channel with SNR $\gamma$:

$$V = \frac{\gamma(2 + \gamma)}{2(1 + \gamma)^2} \cdot (\log_2 e)^2 \quad \text{(bits}^2\text{/channel use)}.$$

For the complex AWGN channel:

$$V = \gamma \cdot \frac{2 + \gamma}{(1 + \gamma)^2} \cdot (\log_2 e)^2.$$

Channel dispersion is the information-theoretic analogue of variance in the central limit theorem. A high-dispersion channel pays a larger finite-blocklength penalty because the accumulated information is "noisier" across nn uses.
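As a quick numerical check, here is a short sketch tabulating both dispersion formulas across SNR (the function names are ours). Note how the complex-channel dispersion saturates near $(\log_2 e)^2 \approx 2.08$ as $\gamma$ grows, a point the Quick Check below returns to.

```python
import numpy as np

LOG2E = np.log2(np.e)

def dispersion_real_awgn(gamma: float) -> float:
    """Dispersion V (bits^2/channel use) of the real AWGN channel at SNR gamma."""
    return gamma * (2 + gamma) / (2 * (1 + gamma) ** 2) * LOG2E ** 2

def dispersion_complex_awgn(gamma: float) -> float:
    """Dispersion V (bits^2/channel use) of the complex AWGN channel at SNR gamma."""
    return gamma * (2 + gamma) / (1 + gamma) ** 2 * LOG2E ** 2

for snr_db in (0, 10, 20, 30):
    g = 10 ** (snr_db / 10)    # dB -> linear SNR
    print(f"{snr_db:3d} dB: V_real = {dispersion_real_awgn(g):.3f},  "
          f"V_complex = {dispersion_complex_awgn(g):.3f}")
```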

Theorem: Normal Approximation (Polyanskiy-Poor-Verdú)

For a memoryless channel with capacity $C$ and dispersion $V > 0$, the maximal rate achievable at blocklength $n$ with error probability $\varepsilon$ satisfies

$$R^*(n, \varepsilon) = C - \sqrt{\frac{V}{n}}\, Q^{-1}(\varepsilon) + O\!\left(\frac{\log n}{n}\right),$$

where $Q^{-1}$ is the inverse of the Gaussian tail function $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\, dt$.

Interpreting the Dispersion Penalty

The normal approximation reveals a clean structure:

  • Capacity $C$ is the asymptotic rate, approached only as $n \to \infty$.
  • Dispersion penalty: the term $\sqrt{V/n}\, Q^{-1}(\varepsilon)$ decreases as $O(1/\sqrt{n})$ --- convergence to capacity is slow. Halving the gap requires quadrupling the blocklength.
  • Reliability cost: since $Q^{-1}(\varepsilon)$ increases as $\varepsilon \to 0$, demanding higher reliability (smaller $\varepsilon$) further reduces the achievable rate.

For URLLC with $\varepsilon = 10^{-5}$ and $n = 200$, the rate loss compared to Shannon capacity can exceed 30%.

Example: Rate Loss at Short Blocklength over AWGN

Consider a complex AWGN channel at SNR $\gamma = 10$ dB. Compute the maximum achievable rate $R^*(n, \varepsilon)$ using the normal approximation for: (a) $n = 100$, $\varepsilon = 10^{-3}$; (b) $n = 100$, $\varepsilon = 10^{-5}$; (c) $n = 1000$, $\varepsilon = 10^{-5}$. Express results as a fraction of the Shannon capacity $C$.
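A sketch of the computation, dropping the $O(\log n / n)$ correction (so the numbers are approximate); `scipy`'s `norm.isf` supplies $Q^{-1}$, and the helper name is ours:

```python
import numpy as np
from scipy.stats import norm

LOG2E = np.log2(np.e)

def normal_approx_rate(n: int, eps: float, gamma: float) -> tuple[float, float]:
    """R*(n, eps) for the complex AWGN channel via the normal approximation,
    with the O(log n / n) term dropped. Returns (R_star, capacity)."""
    C = np.log2(1 + gamma)
    V = gamma * (2 + gamma) / (1 + gamma) ** 2 * LOG2E ** 2
    q_inv = norm.isf(eps)      # inverse Gaussian Q-function
    return C - np.sqrt(V / n) * q_inv, C

gamma = 10 ** (10 / 10)        # 10 dB in linear scale
for n, eps in [(100, 1e-3), (100, 1e-5), (1000, 1e-5)]:
    r_star, cap = normal_approx_rate(n, eps, gamma)
    print(f"n={n:4d}, eps={eps:.0e}: R* = {r_star:.3f} bits/use "
          f"= {r_star / cap:.1%} of C = {cap:.3f}")
```

Under these assumptions the three cases come out near 87%, 82%, and 94% of $C = \log_2(11) \approx 3.46$ bits/use: tightening $\varepsilon$ from $10^{-3}$ to $10^{-5}$ costs rate at fixed $n$, while a tenfold increase in $n$ recovers most of the gap.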

Finite Blocklength: Rate Penalty as Error Probability Varies

[Interactive plot: the achievable rate $R^*(n, \varepsilon)$ converging to Shannon capacity as blocklength $n$ grows, with curves for target error probabilities from $10^{-1}$ to $10^{-7}$. The dispersion penalty $\sqrt{V/n}\,Q^{-1}(\varepsilon)$ opens a significant rate gap at short blocklength, especially at stringent reliability targets ($\varepsilon = 10^{-5}$--$10^{-7}$).]

Rate vs Blocklength under Normal Approximation

[Interactive plot: $R^*(n, \varepsilon)$ versus blocklength $n$ for different target error probabilities $\varepsilon$, alongside the Shannon capacity (dashed horizontal line). Observe that (i) convergence to $C$ is slow ($O(1/\sqrt{n})$), (ii) stricter reliability requirements (smaller $\varepsilon$) shift the curve downward, and (iii) at short blocklengths ($n < 200$) the rate penalty can exceed 20--40% of capacity.]

Theorem: Finite Blocklength under Quasi-Static Fading

For a quasi-static fading channel, in which the fading coefficient $H$ is random but stays constant over the $n$ channel uses of a codeword, the maximum achievable rate satisfies

$$R^*(n, \varepsilon) = C_\varepsilon + O\!\left(\frac{\log n}{n}\right),$$

where $C_\varepsilon$ is the outage capacity: the largest $R$ such that $\Pr\big[\log_2(1 + \gamma |H|^2) < R\big] \leq \varepsilon$. The channel dispersion is zero: the dominant failure mode is an unfavourable fading draw rather than noise accumulated over the block, so convergence to $C_\varepsilon$ is much faster than the $1/\sqrt{n}$ behaviour of the AWGN channel.
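To put a number on the outage capacity, here is a sketch assuming Rayleigh fading ($|H|^2 \sim \mathrm{Exp}(1)$, an illustrative model); the closed form follows directly from the exponential CDF.

```python
import numpy as np

def outage_capacity_rayleigh(gamma: float, eps: float) -> float:
    """Outage capacity C_eps (bits/use) at average SNR gamma, Rayleigh fading.

    With |H|^2 ~ Exp(1), Pr[log2(1 + gamma*|H|^2) < R] <= eps solves to
    C_eps = log2(1 - gamma * ln(1 - eps)).
    """
    return np.log2(1 - gamma * np.log1p(-eps))

gamma = 10 ** (10 / 10)        # 10 dB average SNR
for eps in (1e-1, 1e-3, 1e-5):
    print(f"eps = {eps:.0e}: C_eps = {outage_capacity_rayleigh(gamma, eps):.5f} bits/use")
```

At $\varepsilon = 10^{-5}$ this gives roughly $10^{-4}$ bits/use: without diversity, the fading outage, not the dispersion penalty, is the binding constraint for URLLC.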

Quick Check

The channel dispersion $V$ of a complex AWGN channel at high SNR ($\gamma \gg 1$) behaves approximately as $V \approx (\log_2 e)^2$. What does this imply about the finite blocklength penalty at high SNR?

  • The penalty $\sqrt{V/n}\, Q^{-1}(\varepsilon)$ becomes independent of SNR, so increasing SNR only shifts the capacity but does not reduce the rate gap
  • The penalty vanishes at high SNR because the channel becomes deterministic
  • The penalty grows without bound because $V$ increases linearly with SNR
  • The penalty depends on $V$ only through the ratio $V/C^2$, which vanishes

Achievability and Converse Bounds

The normal approximation is sandwiched between two non-asymptotic bounds:

  1. Random Coding Union (RCU) bound (achievability): for any codebook size $M$ and the maximum-likelihood decoder (evaluated for the BSC in the sketch below),

    $$\varepsilon \leq \mathbb{E}\!\left[\min\!\left(1,\; (M-1)\, \Pr\!\big[i(\bar{X}^n; Y^n) \geq i(X^n; Y^n) \,\big|\, X^n, Y^n\big]\right)\right],$$

    where $\bar{X}^n$ is an independent codeword drawn from the same input distribution.

  2. Meta-converse bound (converse): any $(n, M, \varepsilon)$ code must satisfy, for every auxiliary output distribution $\tilde{P}_{Y^n}$, the binary hypothesis testing constraint

    $$\beta_{1-\varepsilon}\big(P_{X^n Y^n},\, P_{X^n} \times \tilde{P}_{Y^n}\big) \leq \frac{1}{M},$$

    where $\beta_{1-\varepsilon}$ is the minimum type-II error probability of a test whose type-I success probability is at least $1-\varepsilon$. Because $\beta_{1-\varepsilon}$ grows as $\varepsilon$ shrinks, this implicitly lower-bounds the error probability of any code with $M$ codewords.

For the AWGN channel, these bounds are remarkably tight: the gap between achievability and converse is less than 0.5 dB even at $n = 100$.
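The RCU bound is rarely available in closed form, but for the BSC the inner probability reduces to a binomial tail, so the bound can be evaluated exactly. A sketch with illustrative parameters ($n = 100$, rate $1/2$, crossover $p = 0.05$); the function name is ours:

```python
import numpy as np
from scipy.stats import binom

def rcu_bound_bsc(n: int, M: int, p: float) -> float:
    """RCU achievability bound for the BSC(p) with i.i.d. uniform codewords.

    For p < 1/2, i(x_bar; y) >= i(x; y) iff the independent codeword x_bar lies
    within Hamming distance d = d_H(x, y) of y; since d_H(X_bar, y) ~ Bin(n, 1/2),
    the conditional probability in the RCU bound is a binomial CDF.
    """
    d = np.arange(n + 1)
    pairwise = binom.cdf(d, n, 0.5)                   # Pr[d_H(X_bar, y) <= d]
    inner = np.minimum(1.0, (M - 1) * pairwise)       # min(1, (M-1) * Pr[...])
    return float(np.sum(binom.pmf(d, n, p) * inner))  # average over d ~ Bin(n, p)

# Rate 1/2 means M = 2^(n/2) codewords; C = 1 - H2(0.05) ~ 0.71, so rate < C.
eps_rcu = rcu_bound_bsc(n=100, M=2**50, p=0.05)
print(f"RCU bound: eps <= {eps_rcu:.2e}")
```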

Quick Check

To halve the gap between $R^*(n, \varepsilon)$ and the Shannon capacity $C$ (at a fixed $\varepsilon$), how must the blocklength $n$ change?

  • Double it ($2n$)
  • Quadruple it ($4n$)
  • Square it ($n^2$)
  • It depends on $\varepsilon$ and cannot be determined

Historical Note: From Shannon to Polyanskiy: 62 Years to the Finite Blocklength Answer

1948--2010

Shannon's 1948 paper established channel capacity as the supremum of achievable rates with vanishing error probability as $n \to \infty$. But the question "how fast does the achievable rate approach $C$?" proved remarkably difficult.

Strassen (1962) showed that the convergence rate is $O(1/\sqrt{n})$ and identified the channel dispersion for the DMC, but the result attracted limited attention at the time. Hayashi (2009) independently developed second-order asymptotics using the information spectrum method.

The breakthrough came with Polyanskiy, Poor, and Verdú (2010), who provided tight achievability (random coding union) and converse (meta-converse) bounds that pinned down $R^*(n, \varepsilon)$ to within fractions of a dB for the AWGN channel. Their normal approximation $R^* \approx C - \sqrt{V/n}\,Q^{-1}(\varepsilon)$ became the standard engineering formula for short-packet design and directly enabled the analytical framework for 5G URLLC.

The finite blocklength paradigm fundamentally changed how wireless engineers think about latency: it showed that reliability at short blocklength has a precise, quantifiable cost in rate, governed by the channel dispersion.

Common Mistake: Using Shannon Capacity to Design Short-Packet Systems

Mistake:

"The AWGN channel at SNR=0\text{SNR} = 0 dB has capacity C=1C = 1 bit/channel use, so I can reliably transmit at rate R=0.9R = 0.9 bits/use with a code of blocklength n=100n = 100."

Correction:

At $n = 100$ and target error probability $\varepsilon = 10^{-5}$, the normal approximation gives:

$$R^*(100, 10^{-5}) \approx 1 - \sqrt{\frac{V}{100}} \cdot Q^{-1}(10^{-5})$$

For the complex AWGN channel at $\text{SNR} = 0$ dB ($\gamma = 1$, consistent with $C = \log_2(1+\gamma) = 1$), the dispersion is $V = \frac{3}{4}(\log_2 e)^2 \approx 1.56$ bits$^2$/channel use, and $Q^{-1}(10^{-5}) \approx 4.26$. Therefore:

$$R^* \approx 1 - \sqrt{0.0156} \cdot 4.26 \approx 1 - 0.53 = 0.47 \text{ bits/use.}$$

The actual achievable rate is roughly 0.47 bits/use, not 0.9. The missing 0.53 bits/use is the dispersion penalty, which is enormous at short blocklength. Designing at $R = 0.9$ would produce an error probability on the order of $10^{-1}$ (see the sketch below), not $10^{-5}$. The Shannon limit is only a useful design target when $n > 10{,}000$.
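The same approximation can be inverted to estimate the error probability the naive design would actually suffer; a sketch under the same assumptions (complex AWGN, $O(\log n / n)$ term dropped):

```python
import numpy as np
from scipy.stats import norm

LOG2E = np.log2(np.e)

gamma, n, R = 1.0, 100, 0.9    # SNR = 0 dB and the mistaken design point
C = np.log2(1 + gamma)                                     # = 1 bit/use
V = gamma * (2 + gamma) / (1 + gamma) ** 2 * LOG2E ** 2    # ~1.56 (complex AWGN)

# Inverting R = C - sqrt(V/n) * Qinv(eps) gives eps = Q((C - R) * sqrt(n/V));
# norm.sf is the Gaussian Q-function.
eps = norm.sf((C - R) * np.sqrt(n / V))
print(f"predicted error probability at R = {R}: {eps:.2f}")   # ~0.21, not 1e-5
```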

Channel Dispersion

The variance of the information density $i(X; Y) = \log \frac{P_{Y|X}(Y|X)}{P_Y(Y)}$ under the capacity-achieving input distribution. Denoted $V$, it governs the $O(1/\sqrt{n})$ penalty in the normal approximation. For the real AWGN channel at SNR $\gamma$: $V = \frac{\gamma(2 + \gamma)}{2(1+\gamma)^2} (\log_2 e)^2$.

Related: Normal Approximation (Polyanskiy-Poor-Verdú)

Normal Approximation (Polyanskiy-Poor-Verdú)

The second-order asymptotic expansion $R^*(n, \varepsilon) \approx C - \sqrt{V/n}\,Q^{-1}(\varepsilon) + O(\log n / n)$. Accurate to within 0.5 dB for $n \geq 100$ on the AWGN channel.

Related: Channel Dispersion

Blocklength

The number of channel uses $n$ allocated to encoding a single message. In OFDM-based systems, $n$ equals the number of resource elements assigned to the codeword. Shorter blocklength reduces latency but increases the rate penalty $\sqrt{V/n}\,Q^{-1}(\varepsilon)$.

Related: Information Density

Key Takeaway

The dispersion penalty is the price of low latency. The normal approximation $R^*(n, \varepsilon) \approx C - \sqrt{V/n}\,Q^{-1}(\varepsilon)$ shows that short-packet communication operates fundamentally below Shannon capacity. The gap scales as $1/\sqrt{n}$: halving the gap requires quadrupling the blocklength. For URLLC with $\varepsilon = 10^{-5}$ and $n = 100$, the rate loss can exceed 40% of capacity --- a fact that Shannon's asymptotic theory cannot predict.