Fundamentals of Error-Correcting Codes
Definition: Block Code
An $(n, k)$ binary block code is a set of $2^k$ binary codewords, each of length $n$ bits. The code maps a $k$-bit message $\mathbf{m}$ to an $n$-bit codeword $\mathbf{c}$.
The code rate is $R = k/n$, representing the fraction of bits that carry information. The remaining $n - k$ bits provide redundancy for error detection and correction.
A code is linear if the sum (modulo 2) of any two codewords is also a codeword. All codes discussed in this chapter are linear.
Definition: Generator Matrix and Parity-Check Matrix
For a linear $(n, k)$ code, the generator matrix $\mathbf{G}$ is a $k \times n$ binary matrix such that every codeword is
$$\mathbf{c} = \mathbf{m}\mathbf{G} \pmod{2}.$$
The parity-check matrix $\mathbf{H}$ is an $(n-k) \times n$ binary matrix satisfying
$$\mathbf{H}\mathbf{c}^T = \mathbf{0}$$
for every codeword $\mathbf{c}$. The matrices are related by $\mathbf{G}\mathbf{H}^T = \mathbf{0}$.
In systematic form, $\mathbf{G} = [\mathbf{I}_k \mid \mathbf{P}]$ and $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_{n-k}]$ (over $\mathbb{F}_2$, $-\mathbf{P}^T = \mathbf{P}^T$).
The syndrome $\mathbf{s} = \mathbf{H}\mathbf{r}^T$ of a received word $\mathbf{r} = \mathbf{c} \oplus \mathbf{e}$ identifies the error pattern, since $\mathbf{s} = \mathbf{H}\mathbf{e}^T$ depends only on $\mathbf{e}$. If $\mathbf{s} = \mathbf{0}$, either no errors occurred or the error pattern is an undetectable codeword.
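As a concrete sketch of these definitions (assuming Python with numpy; the particular $\mathbf{P}$ below matches the (7, 4) Hamming example worked later in this section), encoding and syndrome computation are just matrix products modulo 2:

```python
import numpy as np

# Systematic (7, 4) Hamming code: G = [I_4 | P], H = [P^T | I_3].
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])    # 4 x 7 generator matrix
H = np.hstack([P.T, np.eye(3, dtype=int)])  # 3 x 7 parity-check matrix

assert not (G @ H.T % 2).any()              # checks G H^T = 0 over GF(2)

def encode(m):
    """Map a k-bit message to a codeword: c = m G (mod 2)."""
    return m @ G % 2

def syndrome(r):
    """Compute s = H r^T (mod 2); s = 0 iff r is a valid codeword."""
    return H @ r % 2
```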
Definition: Hamming Distance
The Hamming distance between two binary vectors $\mathbf{x}, \mathbf{y}$ is the number of positions in which they differ:
$$d_H(\mathbf{x}, \mathbf{y}) = |\{\, i : x_i \neq y_i \,\}|.$$
The minimum distance of the code is
$$d_{\min} = \min_{\mathbf{c}_1 \neq \mathbf{c}_2} d_H(\mathbf{c}_1, \mathbf{c}_2).$$
For a linear code, $d_{\min}$ equals the minimum Hamming weight (number of ones) among all non-zero codewords.
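The following sketch (again assuming numpy; `minimum_distance` is a hypothetical helper, practical only for small $k$) computes the Hamming distance directly and exploits linearity to find $d_{\min}$ as a minimum weight:

```python
import numpy as np
from itertools import product

def hamming_distance(x, y):
    """Number of positions in which two equal-length binary vectors differ."""
    return int(np.sum(np.asarray(x) != np.asarray(y)))

def minimum_distance(G):
    """Brute-force d_min from a generator matrix G: by linearity it equals
    the minimum Hamming weight over all 2^k - 1 non-zero codewords."""
    k = G.shape[0]
    return min(int((np.array(m) @ G % 2).sum())
               for m in product([0, 1], repeat=k) if any(m))
```

For the (7, 4) generator matrix defined above, `minimum_distance(G)` returns 3, in agreement with the worked example below.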
Theorem: Error Detection and Correction Capability
A code with minimum distance $d_{\min}$ can:
- Detect up to $d_{\min} - 1$ errors.
- Correct up to $t = \lfloor (d_{\min} - 1)/2 \rfloor$ errors.
Equivalently, a code can correct $t$ errors if and only if $d_{\min} \geq 2t + 1$.
If all codewords are at least $d_{\min}$ apart, then $t$ errors move a codeword to a point that is still closer to the original codeword than to any other, provided $2t + 1 \leq d_{\min}$. The decoder picks the nearest codeword and recovers the original.
Detection capability
An error pattern of weight $w \leq d_{\min} - 1$ cannot transform one codeword into another (since distinct codewords are at least $d_{\min}$ apart), so the error is always detectable.
Correction capability
If $w(\mathbf{e}) \leq t$ with $2t + 1 \leq d_{\min}$, then the received word $\mathbf{r} = \mathbf{c} \oplus \mathbf{e}$ satisfies $d_H(\mathbf{r}, \mathbf{c}) \leq t$, while for any other codeword $\mathbf{c}' \neq \mathbf{c}$:
$$d_H(\mathbf{r}, \mathbf{c}') \geq d_H(\mathbf{c}, \mathbf{c}') - d_H(\mathbf{r}, \mathbf{c}) \geq d_{\min} - t \geq t + 1$$
by the triangle inequality. Hence the nearest-codeword decoder uniquely recovers $\mathbf{c}$.
[Figure: Hamming sphere packing geometry]
[Interactive demo: Hamming code error correction. Explore how Hamming codes detect and correct errors; adjust the code parameters to see the generator matrix, parity-check matrix, minimum distance, and a simulation of error-correction performance.]
Example: The (7, 4) Hamming Code
The $(7, 4)$ Hamming code has the systematic generator matrix
$$\mathbf{G} = [\mathbf{I}_4 \mid \mathbf{P}] = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}.$$
(a) Find the parity-check matrix $\mathbf{H}$.
(b) Determine $d_{\min}$ and the error-correcting capability.
(c) Encode the message $\mathbf{m} = (1, 0, 1, 1)$ and decode the received word $\mathbf{r} = (1, 0, 1, 1, 1, 1, 0)$ (which has a single bit error).
Parity-check matrix
From $\mathbf{G} = [\mathbf{I}_4 \mid \mathbf{P}]$, we get $\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_3]$:
$$\mathbf{H} = \begin{bmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}.$$
Minimum distance
The minimum-weight non-zero codeword has weight 3 (e.g., row 3 of $\mathbf{G}$, the codeword $(0,0,1,0,0,1,1)$, has weight 3). Hence $d_{\min} = 3$ and $t = \lfloor (3-1)/2 \rfloor = 1$: the code corrects any single-bit error.
Encoding and decoding
Encoding: $\mathbf{c} = \mathbf{m}\mathbf{G}$ with $\mathbf{m} = (1, 0, 1, 1)$, i.e., the sum (mod 2) of rows 1, 3, and 4 of $\mathbf{G}$:
$$(1,0,0,0,1,1,0) \oplus (0,0,1,0,0,1,1) \oplus (0,0,0,1,1,1,1) = (1,0,1,1,0,1,0).$$
Syndrome: $\mathbf{s} = \mathbf{H}\mathbf{r}^T$. With $\mathbf{r} = \mathbf{c} = (1,0,1,1,0,1,0)$ this gives $\mathbf{s} = \mathbf{0}$, so no error is detected.
Suppose instead $\mathbf{r} = \mathbf{c} \oplus \mathbf{e}_5 = (1,0,1,1,1,1,0)$ (error in position 5). Then $\mathbf{s} = \mathbf{H}\mathbf{r}^T = (1, 0, 0)^T$, which is column 5 of $\mathbf{H}$. The decoder flips bit 5, recovering $\mathbf{c} = (1,0,1,1,0,1,0)$.
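The whole walkthrough can be checked mechanically; this sketch reuses `G`, `H`, `encode`, and `syndrome` from the earlier snippets:

```python
import numpy as np

# Encode m, corrupt position 5, locate the error via the syndrome, correct it.
m = np.array([1, 0, 1, 1])
c = encode(m)                              # -> [1 0 1 1 0 1 0]

e = np.zeros(7, dtype=int)
e[4] = 1                                   # single error in position 5 (1-based)
r = (c + e) % 2                            # received word [1 0 1 1 1 1 0]

s = syndrome(r)                            # -> [1 0 0], i.e. column 5 of H
pos = next(j for j in range(7) if (H[:, j] == s).all())
r[pos] ^= 1                                # flip the located bit
assert (r == c).all()                      # the transmitted codeword is recovered
```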
Definition: Coding Gain
The coding gain of a code is the reduction in $E_b/N_0$ (in dB) required to achieve a target BER compared to uncoded transmission:
$$G_c \,[\mathrm{dB}] = \left(\frac{E_b}{N_0}\right)_{\text{uncoded}} - \left(\frac{E_b}{N_0}\right)_{\text{coded}} \quad \text{at the target BER}.$$
For a binary code with rate $R$ and minimum distance $d_{\min}$, the asymptotic coding gain (at high SNR) is
$$G_a = R \, d_{\min}.$$
In dB: $G_a = 10 \log_{10}(R \, d_{\min})$ dB.
Soft-decision decoding provides approximately 2-3 dB additional coding gain compared to hard-decision decoding, effectively doubling $d_{\min}$ in terms of effective Euclidean distance.
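As a quick numerical check (a sketch using the asymptotic figure $G_a = R\, d_{\min}$ defined above), the (7, 4) Hamming code gives:

```python
import math

# Asymptotic coding gain of the (7, 4) Hamming code: R = 4/7, d_min = 3.
R, d_min = 4 / 7, 3
print(f"G_a = {10 * math.log10(R * d_min):.2f} dB")   # 10 log10(12/7) ≈ 2.34 dB
```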
Quick Check
A linear block code has minimum distance $d_{\min} = 7$. How many errors can it correct?
2
3
6
7
$t = \lfloor (d_{\min} - 1)/2 \rfloor = \lfloor 6/2 \rfloor = 3$. The code can correct any pattern of 3 or fewer errors.
Common Mistake: Hard vs. Soft Decision Decoding
Mistake:
Using hard-decision decoding (quantising channel output to 0/1 before decoding) and expecting the same performance as soft-decision decoding.
Correction:
Hard-decision decoding discards reliability information from the channel. Soft-decision decoding uses the actual channel output values (e.g., LLRs) and achieves approximately 2-3 dB better performance. For the AWGN channel, this difference stems from the fact that hard decisions lose approximately $10\log_{10}(\pi/2) \approx 2$ dB of information at the quantiser.
Modern systems (turbo, LDPC, polar codes) universally use soft-decision decoding.
Hard vs. Soft Decision Decoding
| Property | Hard Decision | Soft Decision |
|---|---|---|
| Input to decoder | Binary bits (0 or 1) | Real-valued LLRs or channel outputs |
| Information preserved | Only bit identity | Bit identity + reliability |
| Relative performance | Reference (0 dB) | 2-3 dB better than hard |
| Decoder complexity | Lower | Higher (real-valued arithmetic) |
| Example algorithms | Syndrome decoding, algebraic | Viterbi (soft), BCJR, belief propagation |
| Used in modern systems | Rarely | Universally (turbo, LDPC, polar) |
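The distinction is easy to see at the channel output. In this sketch (the BPSK mapping $0 \to +1$, $1 \to -1$ and the noise level are illustrative assumptions, not values from this section), the hard decision keeps only the sign of each sample, while the LLR $2y/\sigma^2$ keeps both sign and reliability:

```python
import numpy as np

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=8)                        # random message bits
sigma = 0.8                                              # AWGN noise std
y = (1 - 2 * bits) + sigma * rng.normal(size=bits.size)  # BPSK + noise

hard = (y < 0).astype(int)   # hard decision: only the bit identity survives
llr = 2 * y / sigma**2       # soft decision: sign = identity, |llr| = reliability
```

A soft decoder can down-weight samples with small $|\mathrm{LLR}|$; a hard decoder must treat every bit as equally trustworthy.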
Historical Note: Richard Hamming and Error-Correcting Codes (1950)
Richard Hamming, working at Bell Labs, developed the first error-correcting codes in 1950 after becoming frustrated with unreliable relay-based computers that would discard his weekend computing jobs when a single bit error occurred. He asked: "If the machine can detect an error, why can't it locate the position of the error and correct it?" His Hamming code, the first single-error-correcting code, launched the field of coding theory. He also introduced the concepts of Hamming distance and Hamming weight that remain fundamental today.
Key Takeaway
The minimum Hamming distance $d_{\min}$ is the single most important parameter of a block code: it determines the error-correcting capability ($t = \lfloor (d_{\min}-1)/2 \rfloor$), the asymptotic coding gain ($G_a = 10\log_{10}(R\, d_{\min})$ dB), and on fading channels, the diversity order. Soft-decision decoding recovers 2-3 dB that hard-decision decoding discards at the quantiser.
Finite-Precision LLR Representation
In hardware implementations, log-likelihood ratios (LLRs) are represented with finite precision (typically 5-8 bits for the integer + fractional parts). This introduces quantisation effects:
- Saturation: Large-magnitude LLRs are clipped, losing information about very reliable bits. Typical clipping ranges are on the order of $\pm 8$ to $\pm 16$.
- Granularity: Small LLR differences are rounded, affecting decoder convergence for marginal bits.
- Performance impact: Moving from floating-point to 6-bit fixed-point typically costs 0.05-0.1 dB; moving to 4-bit costs 0.2-0.5 dB.
The scale factor between channel LLRs and decoder internal precision must be carefully chosen: too small a range clips reliable bits, while too large a range wastes resolution on unreliable bits (see the sketch after the summary list below).
- LLR quantisation typically uses 5-8 bits in hardware decoders.
- The clipping range must balance saturation vs. resolution.
- 4-bit quantisation incurs 0.2-0.5 dB loss; 6-bit costs < 0.1 dB.
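A minimal quantiser sketch (the defaults `n_bits=6` and `clip=8.0` are illustrative, not standard values) shows both effects at once: large LLRs saturate at the clipping limit, and small ones round toward zero:

```python
import numpy as np

def quantize_llr(llr, n_bits=6, clip=8.0):
    """Quantise real-valued LLRs to n_bits fixed point: round to a uniform
    grid of step clip/(2^(n_bits-1) - 1), then saturate to [-clip, +clip]."""
    levels = 2**(n_bits - 1) - 1          # e.g. 31 positive levels for 6 bits
    step = clip / levels                  # quantiser step size (granularity)
    q = np.clip(np.round(llr / step), -levels, levels)   # round, then saturate
    return q * step

llr = np.array([-12.3, -0.4, 0.05, 3.7, 20.0])
print(quantize_llr(llr))   # -12.3 and 20.0 saturate at ±8.0; 0.05 rounds to 0
```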
Block Code
A mapping from $k$-bit messages to $n$-bit codewords ($n > k$). A linear block code forms a $k$-dimensional subspace of $\mathbb{F}_2^n$ under modulo-2 arithmetic.
Related: Generator Matrix and Parity-Check Matrix, Hamming Distance
Hamming Distance
The number of positions in which two binary strings of equal length differ. The minimum Hamming distance of a code determines its error detection and correction capability.
Related: Block Code, Minimum Distance, Error Detection and Correction Capability
Coding Gain
The reduction in $E_b/N_0$ required to achieve a given BER when using channel coding versus uncoded transmission. Measured in dB.
Related: Rate, Minimum Distance
Parity-Check Matrix
An $(n-k) \times n$ binary matrix $\mathbf{H}$ that defines a linear code as its null space: $\mathbf{H}\mathbf{c}^T = \mathbf{0}$. Used for syndrome-based error detection and decoding.
Related: Generator Matrix and Parity-Check Matrix, Syndrome, Block Code