Polar Codes β€” Channel Polarization

The Polarization Phenomenon

Polar codes, introduced by Erdal Ar\i kan in 2009, are the first (and so far only) family of codes with an explicit construction that provably achieves the capacity of any binary-input symmetric memoryless channel, with encoding and decoding complexity O(Nlog⁑N)O(N \log N).

The idea is beautifully simple: apply a specific linear transformation to NN copies of a channel WW. The resulting NN "virtual" channels polarize β€” as NN grows, each virtual channel becomes either nearly perfect (capacity β‰ˆ1\approx 1) or nearly useless (capacity β‰ˆ0\approx 0). The fraction of good channels approaches the capacity I(W)I(W) of the original channel. Send information on the good channels, send known (frozen) bits on the bad ones.

Definition:

The Polar Transform

Start with N=2mN = 2^m independent copies of a binary-input DMC W:{0,1}β†’YW: \{0,1\} \to \mathcal{Y} with capacity I(W)I(W).

The polar transform applies the NΓ—NN \times N matrix GN=BNFβŠ—m\mathbf{G}_N = \mathbf{B}_N \mathbf{F}^{\otimes m}, where F=(1011)\mathbf{F} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} is the basic 2Γ—22 \times 2 kernel, FβŠ—m\mathbf{F}^{\otimes m} is the mm-fold Kronecker product, and BN\mathbf{B}_N is a bit-reversal permutation.

The input bits U1,…,UNU_1, \ldots, U_N are transformed to channel inputs XN=UNGNX^N = U^N \mathbf{G}_N (mod 2 for binary, or over R\mathbb{R} for the Gaussian channel).

Definition:

Bit-Channels (Virtual Channels)

After the polar transform, define the ii-th bit-channel as

WN(i):Uiβ†’(YN,Uiβˆ’1),W_N^{(i)}: U_i \to (Y^N, U^{i-1}),

i.e., the channel from UiU_i to the outputs YNY^N and all previous inputs U1,…,Uiβˆ’1U_1, \ldots, U_{i-1} (which are assumed known at the decoder for successive cancellation).

The symmetric capacity of the ii-th bit-channel is I(WN(i))=I(Ui;YN,Uiβˆ’1)I(W_N^{(i)}) = I(U_i; Y^N, U^{i-1}).

Theorem: Channel Polarization Theorem

For any binary-input symmetric DMC WW with capacity I(W)I(W), the bit-channels {WN(i)}i=1N\{W_N^{(i)}\}_{i=1}^N polarize as Nβ†’βˆžN \to \infty: for any Ξ΄>0\delta > 0,

1N∣{i:I(WN(i))∈(1βˆ’Ξ΄,1]}βˆ£β†’I(W),\frac{1}{N}\left|\left\{i : I(W_N^{(i)}) \in (1-\delta, 1]\right\}\right| \to I(W), 1N∣{i:I(WN(i))∈[0,Ξ΄)}βˆ£β†’1βˆ’I(W).\frac{1}{N}\left|\left\{i : I(W_N^{(i)}) \in [0, \delta)\right\}\right| \to 1 - I(W).

That is, the fraction of "good" bit-channels approaches the capacity I(W)I(W), and the fraction of "bad" bit-channels approaches 1βˆ’I(W)1 - I(W).

The polar transform is a recursive process. At each level, two copies of a channel WW are combined into a "better" channel W+W^+ and a "worse" channel Wβˆ’W^-. The total capacity is preserved: I(W+)+I(Wβˆ’)=2I(W)I(W^+) + I(W^-) = 2I(W). But the capacity splits unevenly, and repeated application drives all channels to the extremes. This is the polarization phenomenon β€” and it is the information-theoretic equivalent of distillation.

Polar code

A capacity-achieving code based on channel polarization. Information bits are sent on the "good" virtual channels (high mutual information), while "frozen" known bits are sent on the bad channels. Encoding is O(Nlog⁑N)O(N \log N) via the Kronecker structure; decoding is O(Nlog⁑N)O(N \log N) via successive cancellation.

Related: Turbo code, LDPC code

Frozen bits

In a polar code, the bits assigned to unreliable (low-capacity) bit-channels. These are set to predetermined values (usually 0) known to both encoder and decoder.

Related: Polar code

Successive Cancellation (SC) Decoding

Complexity: O(Nlog⁑N)O(N \log N) using the recursive structure of the polar transform. The LLR computation at step 1a decomposes into a butterfly network of log⁑2N\log_2 N stages.
Input: Received y\mathbf{y}, frozen set F\mathcal{F}, frozen values
Output: Decoded bits U^1,…,U^N\hat{U}_1, \ldots, \hat{U}_N
1. for i=1i = 1 to NN:
a. Compute Li=log⁑P(YN,U^iβˆ’1∣Ui=0)P(YN,U^iβˆ’1∣Ui=1)L_i = \log \frac{P(Y^N, \hat{U}^{i-1} | U_i = 0)}{P(Y^N, \hat{U}^{i-1} | U_i = 1)} (LLR)
b. if i∈Fi \in \mathcal{F}: set U^i=0\hat{U}_i = 0 (frozen bit)
c. else: set U^i={0ifΒ Liβ‰₯01otherwise\hat{U}_i = \begin{cases} 0 & \text{if } L_i \geq 0 \\ 1 & \text{otherwise} \end{cases}
2. return U^1,…,U^N\hat{U}_1, \ldots, \hat{U}_N

SC decoding is sequential (bit-by-bit), which limits throughput. CRC-aided SC list (CA-SCL) decoding maintains LL candidate paths in parallel and selects the one passing a CRC check, achieving near-ML performance at the cost of O(LNlog⁑N)O(LN\log N) complexity.

Channel Polarization Tree

Watch the recursive polarization of a BEC(Ξ΅=0.5\varepsilon = 0.5): at each level, each channel splits into a worse (Ξ΅βˆ’=2Ξ΅βˆ’Ξ΅2\varepsilon^- = 2\varepsilon - \varepsilon^2) and a better (Ξ΅+=Ξ΅2\varepsilon^+ = \varepsilon^2) channel. After 4 levels (N=16N=16), the channels have clearly polarized toward 0 (good) and 1 (bad).

Channel Polarization Visualization

Observe how the bit-channel capacities I(WN(i))I(W_N^{(i)}) evolve as NN grows. For a BEC with erasure probability Ο΅\epsilon, the capacities split recursively toward 0 and 1. Adjust NN and the channel quality to see the polarization effect.

Parameters
10
0.5

Example: Polar Code for the BEC

Design a polar code of length N=8N = 8 for the BEC with erasure probability Ο΅=0.5\epsilon = 0.5. Which bits are frozen?

Why This Matters: Polar Codes in 5G NR

Polar codes were adopted for the 5G NR control channels (PDCCH, PUCCH, PBCH) in 2016, making them the first provably capacity-achieving code family deployed in a commercial standard. The 5G implementation uses CRC-aided SC list (CA-SCL) decoding with list size L=8L = 8, which provides near-ML performance at short to moderate block lengths (up to 1024 bits).

The choice of polar codes for control channels reflects their excellent performance at short block lengths and low rates β€” exactly the regime where control information operates. For long blocks (data), LDPC codes remain preferred due to their higher throughput and easier parallelization.

See Book telecom, Ch. 24 for the full treatment of 5G NR channel coding.

Historical Note: ArΔ±kan's Breakthrough

Erdal Ar\i kan published the polar coding paper in 2009, and it was immediately recognized as a breakthrough β€” the first explicit, provably capacity-achieving construction with polynomial complexity. The elegance of the proof (martingale convergence!) and the simplicity of the construction (just a Kronecker product of a 2Γ—22 \times 2 matrix) were remarkable.

Ar\i kan received the IEEE Richard W. Hamming Medal in 2010, the Shannon Award in 2018, and the Mustafa Prize in 2019 for this work. The rapid adoption of polar codes in 5G NR (less than 10 years from publication to commercial deployment) is exceptional in the history of coding theory.

Quick Check

In a polar code of length N=16N = 16 for a channel with capacity I(W)=0.75I(W) = 0.75, approximately how many information bits are transmitted?

4

8

12

16