HARQ in 5G NR: Redundancy Versions and Processes

5G NR: HARQ in the Real Radio Access Network

We arrive at the practical instantiation. 5G New Radio (NR) Release 15 adopted a HARQ design that directly reflects the ARQ-DMT principle of this chapter: stop-and-wait per-process operation, with up to 16 HARQ processes running in parallel, LDPC mother codes with circular-buffer rate matching driven by RV_0 through RV_3, and a flexible scheduling framework that picks the RV and the MCS per retransmission.

The fine details (the exact LDPC base graphs, the numerology-dependent HARQ RTT, the PUCCH/PDCCH feedback timing, the URLLC mini-slot HARQ) are all in the 3GPP specs (TS 38.212 for coding and TS 38.214 for procedures). We pull out the three design aspects that are information-theoretic consequences of the ARQ-DMT: (1) the choice of $L$ via the latency budget; (2) the choice of per-round MCS via the rate-$r$/diversity-$d$ tradeoff; (3) the frequency-hopping across rounds to enforce independence.

Definition:
5G NR HARQ Process

A HARQ process in 5G NR is a state machine that tracks the current transmission status of a single transport block (TB). Each UE maintains up to $N_{\rm HARQ} = 16$ parallel HARQ processes (Release 15 supports up to 16 on the physical downlink shared channel, PDSCH; later releases extend this to 32 for specific deployments). A process carries:

HARQ process ID: an integer $\in \{0, 1, \ldots, N_{\rm HARQ} - 1\}$ identifying the process, and hence the soft buffer, to which this TB is assigned.
New-data indicator (NDI): a toggle bit signalling whether the current transmission is a new TB or a retransmission of the prior TB in this process.
Redundancy version (RV): the $\mathrm{RV} \in \{0, 1, 2, 3\}$ index specifying the circular-buffer fragment for this (re)transmission.
Transport block size (TBS): number of information bits; constant across retransmissions of the same TB.
Soft buffer: LLR accumulator storing the decoder's combined observations from the previous rounds.

Stop-and-wait means: a process cannot send a new TB until the prior TB is acknowledged or abandoned. Parallel processes allow multiple TBs to be in flight simultaneously — the gNB schedules TB $k$ on process $j$ in slot $n$ , and the UE sends the ACK/NACK in slot $n + K_1$ (typically $K_1 = 2$ slots at $\mu = 0$ ). While waiting for ACK/NACK on process $j$ , other processes $j' \ne j$ can transmit fresh TBs — filling the pipeline.
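The per-process bookkeeping described above can be sketched as a small Python class. This is a minimal illustration rather than the 3GPP procedure: the class name `HarqProcess`, its method names, and the cyclic RV order (0, 2, 3, 1) are assumptions made for the sketch, and the soft buffer is a plain list of LLRs rather than a real rate-dematcher.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed RV order for successive rounds of the same TB (illustrative choice).
RV_ORDER = (0, 2, 3, 1)


@dataclass
class HarqProcess:
    """Hypothetical stop-and-wait HARQ process holding one TB at a time."""
    process_id: int
    max_rounds: int = 4
    ndi: int = 0            # toggled whenever a new TB starts
    rounds_sent: int = 0    # transmissions already combined for the current TB
    busy: bool = False      # True while a TB is neither ACKed nor abandoned
    soft_buffer: List[float] = field(default_factory=list)

    def start_new_tb(self, n_coded_bits: int) -> None:
        # Stop-and-wait: refuse a new TB while the previous one is pending.
        if self.busy:
            raise RuntimeError("process still waiting on the previous TB")
        self.ndi ^= 1
        self.rounds_sent = 0
        self.busy = True
        self.soft_buffer = [0.0] * n_coded_bits

    def next_rv(self) -> int:
        return RV_ORDER[self.rounds_sent % len(RV_ORDER)]

    def combine(self, llrs: List[float], start: int) -> None:
        # Add the received LLRs into the circular soft buffer at offset `start`.
        n = len(self.soft_buffer)
        for i, llr in enumerate(llrs):
            self.soft_buffer[(start + i) % n] += llr
        self.rounds_sent += 1

    def on_feedback(self, ack: bool) -> None:
        # Free the process on ACK, or abandon the TB once the budget is spent.
        if ack or self.rounds_sent >= self.max_rounds:
            self.busy = False
```

Several such objects, one per process ID, would be held by the scheduler; while one process is `busy`, the others remain free to accept fresh TBs.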

The $N_{\rm HARQ} = 16$ choice reflects a specific tradeoff: with HARQ RTT $\sim 4$ slots and one TB per slot, 4 processes would be enough to keep the pipeline full without considering the variation in transmission times. The extra $12$ processes absorb jitter from scheduling, interference, and handover. Going beyond 16 yields diminishing returns; 16 is a round-number compromise between memory overhead (soft buffers!) and pipeline efficiency.

5G NR HARQ-IR Throughput Envelope

Effective throughput $\eta_\mathrm{eff}$ of a 5G NR IR-HARQ link as a function of the maximum round budget $L$ . Models an LDPC mother code of rate $1/3$ , NR-style RV selection (RV_0 first, then RV_2, RV_3, RV_1), and i.i.d. Rayleigh $2 \times 2$ MIMO across rounds. The envelope grows with $L$ , saturating at the no-outage ceiling $R$ at high SNR. At moderate SNR, the incremental benefit per round decreases quickly beyond $L = 4$ , which is the motivation for the 3GPP default of $L = 4$ rounds.

Parameters: max HARQ rounds = 4

Definition:
Per-Round Frequency Hopping in 5G NR

To enforce the independence-across-rounds assumption of the ARQ-DMT, 5G NR supports per-round frequency hopping: the physical resource block (PRB) allocation for retransmission $\ell$ can be different from that of round $\ell - 1$ . Hopping is signalled via the 1-bit "frequency hopping flag" in the DCI (downlink control information) and can take one of two patterns:

Inter-slot hopping. Round $\ell$ uses PRBs $\{p_0, \ldots, p_0 + N_\mathrm{PRB} - 1\}$ ; round $\ell + 1$ uses $\{p_1, \ldots, p_1 + N_\mathrm{PRB} - 1\}$ with $p_1 \ne p_0$ . Rounds decorrelate when the frequency offset between the two allocations exceeds the coherence bandwidth $B_\mathrm{coh} \sim 1 / (\pi \tau_\mathrm{rms})$ .
Intra-slot hopping. Within a single slot, the first and second halves of the transmission use different PRBs — providing decorrelation even within a single round for extra frequency-diversity protection.

Operationally, inter-slot hopping is the relevant mechanism for HARQ-round decorrelation: it ensures that $\mathbf{H}_{1}, \ldots, \mathbf{H}_{L}$ are approximately independent even when the user is stationary and the temporal coherence time is large.
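To make the decorrelation condition concrete, the following sketch converts an RMS delay spread into the smallest hop, in PRBs, that reaches the coherence bandwidth $B_\mathrm{coh} \sim 1/(\pi \tau_\mathrm{rms})$. The function names and the 300 ns delay spread are illustrative assumptions; the PRB width uses 12 subcarriers at $15 \cdot 2^{\mu}$ kHz.

```python
import math

SUBCARRIERS_PER_PRB = 12
BASE_SCS_HZ = 15e3  # subcarrier spacing at mu = 0


def coherence_bandwidth_hz(tau_rms_s: float) -> float:
    """Rule-of-thumb coherence bandwidth, B_coh ~ 1 / (pi * tau_rms)."""
    return 1.0 / (math.pi * tau_rms_s)


def prb_bandwidth_hz(mu: int) -> float:
    """Width of one PRB for numerology mu."""
    return SUBCARRIERS_PER_PRB * BASE_SCS_HZ * 2 ** mu


def min_hop_offset_prbs(tau_rms_s: float, mu: int) -> int:
    """Smallest PRB offset |p1 - p0| whose width is at least B_coh."""
    return math.ceil(coherence_bandwidth_hz(tau_rms_s) / prb_bandwidth_hz(mu))


if __name__ == "__main__":
    tau = 300e-9  # assumed RMS delay spread
    print(f"B_coh ~ {coherence_bandwidth_hz(tau) / 1e6:.2f} MHz")
    for mu in (0, 1):
        print(f"mu={mu}: hop by at least {min_hop_offset_prbs(tau, mu)} PRBs")
```

The result is only an order-of-magnitude guide, since the coherence-bandwidth rule itself is approximate.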

🚨Critical Engineering Note

URLLC HARQ: One Retransmission Budget

Ultra-reliable low-latency communication (URLLC) has a target end-to-end latency of $1$ ms at $99.999\%$ reliability (sometimes $99.9999\%$ ). The HARQ budget is correspondingly tight:

Mini-slot scheduling (2–7 symbols instead of the 14 of a normal slot) reduces the per-round transmission time to $\sim 0.125$ – $0.5$ ms.
HARQ RTT target $\le 0.5$ ms in FR2 (millimetre-wave) numerology ( $\mu = 3$ , $0.125$ ms slots).
Max $L = 1$ – $2$ in practice: one retransmission if the budget allows, otherwise no HARQ.
PDCCH blind-decoding overhead ( $\sim 100$ $\mu$ s) eats into the budget.

When $L = 1$ (no retransmission allowed), the ARQ-DMT reduces to the static DMT of Ch. 12. The URLLC reliability target is instead met by aggressive redundancy in a single shot — low-rate coding (mother code $R_m = 1/5$ or lower), large MIMO arrays ( $n_r = 8$ or more), and multiple replicas transmitted on different frequencies within the one allowed transmission. The diversity-multiplexing-delay tradeoff of §2 makes this quantitative: URLLC prefers to spend the "budget" on diversity (low $r$ , high $d^{*}(r)$ ) rather than delay (high $L$ , high $d_\mathrm{ARQ}$ ).

In practice, both mechanisms coexist: URLLC "PDCP duplication" sends the same packet on two independent physical paths (e.g., two base stations, or Wi-Fi + cellular). This is NOT HARQ — it is a separate reliability mechanism layered above HARQ — but it exploits the same underlying principle that independent observations multiply the reliability exponent.

Practical Constraints

• URLLC target: 1 ms end-to-end latency at 99.999% reliability.
• Max HARQ rounds in URLLC: $L = 1$ – $2$ (often a single round, i.e. zero retransmissions).
• Mini-slot scheduling required to fit HARQ RTT inside latency budget.
• PDCP duplication provides path diversity above HARQ.

📋 Ref: 3GPP TS 38.214 §5.3, §11 (URLLC scheduling)

🔧Engineering Note

5G NR HARQ Process Count: Why 16?

NR specifies 16 HARQ processes per UE per cell on the downlink (and similarly on the uplink), up from LTE's 8. The rationale:

LTE (sub-6 GHz, 15 kHz subcarrier spacing, 1 ms subframes, HARQ RTT $\sim 8$ subframes): 8 processes fill the pipeline.
NR FR1 (sub-6 GHz, $\mu = 0$ or $1$ ): same pipeline depth as LTE.
NR FR2 (millimetre-wave, $\mu = 2$ or $3$ , $0.25$ or $0.125$ ms slots, HARQ RTT $\sim 16$ slots): requires 16 processes to keep the pipeline full at high throughput.

The tradeoff is that each HARQ process needs its own soft buffer storage (see §4). At $\sim 400{,}000$ LLRs $\times\ 8$ bits, one process needs $3.2$ Mbit; sixteen processes need $\approx 51.2$ Mbit ( $6.4$ MB) of RAM per direction per cell. This is a non-trivial silicon cost for a low-end UE, hence the tiering of UE categories by HARQ-buffer size (LBRM).
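The soft-buffer arithmetic is easy to reproduce. The sketch below uses the figures quoted in the text (16 processes, about 400,000 LLRs per process, 8 bits per LLR); the function name is hypothetical.

```python
def soft_buffer_bits(n_processes: int, llrs_per_process: int, bits_per_llr: int):
    """Return (bits per process, total bits) for the HARQ soft buffers."""
    per_process = llrs_per_process * bits_per_llr
    return per_process, n_processes * per_process


if __name__ == "__main__":
    per_proc, total = soft_buffer_bits(16, 400_000, 8)
    print(f"per process:   {per_proc / 1e6:.1f} Mbit")   # 3.2 Mbit
    print(f"all processes: {total / 1e6:.1f} Mbit "
          f"({total / 8e6:.1f} MB)")                      # 51.2 Mbit (6.4 MB)
```

Halving the LLR width to 4 bits, or limiting the buffer via LBRM, scales the total linearly.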

A subtle point: HARQ process count does not affect the asymptotic ARQ-DMT; it affects throughput utilisation by preventing head-of-line blocking when multiple TBs are in flight. The diversity benefit comes from the number of rounds per TB ( $L$ ), not from the total number of processes.

Practical Constraints

• NR supports 16 HARQ processes per UE per direction per cell.
• Pipeline throughput requires $N_\mathrm{HARQ} \ge \mathrm{ceil}(T_\mathrm{rtt}/T_\mathrm{slot})$ .
• Process count trades silicon area for pipeline efficiency.

📋 Ref: 3GPP TS 38.214 §5.1 (HARQ processes)

Example: 5G NR HARQ Budget: $\mu = 1$ , eMBB Service

A 5G NR eMBB service operates on numerology $\mu = 1$ (30 kHz subcarrier spacing, 0.5 ms slots). The PDSCH-to-PUCCH delay is $K_1 = 2$ slots and the PUSCH-preparation time is $K_2 = 2$ slots. What is the HARQ RTT, and how many HARQ processes are needed to keep the pipeline full at 100 Mbps?

Solution

HARQ RTT

$T_\mathrm{rtt} = (K_1 + K_2) \cdot T_\mathrm{slot} + T_\mathrm{proc} \approx 4 \cdot 0.5\text{ ms} + \epsilon \approx 2$ ms (plus sub-ms processing overhead). Within the 2 ms RTT, $\mathrm{ceil}(2\text{ ms}/0.5\text{ ms}) = 4$ slots pass.

Processes needed for pipeline

To transmit a new TB every slot without stalling, we need $N_\mathrm{HARQ} \ge 4$ processes. NR's maximum of 16 comfortably accommodates this with margin for handover, error cases, and jitter.

Latency for $L = 4$ rounds

In the worst case of 4 HARQ rounds, latency $= 4 \cdot T_\mathrm{rtt} = 8$ ms, well inside the typical $\sim 20$–$30$ ms eMBB application-layer budget. The system can comfortably exploit the ARQ-DMT gain of $L = 4$ .
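The three budget steps above can be checked with a few lines of Python. This is a sketch under the example's own assumptions ($K_1 = K_2 = 2$ slots, $\mu = 1$, processing overhead neglected); the function names are illustrative.

```python
import math


def slot_duration_ms(mu: int) -> float:
    """NR slot duration: 1 ms at mu = 0, halving with each numerology step."""
    return 1.0 / 2 ** mu


def harq_rtt_ms(k1: int, k2: int, mu: int, t_proc_ms: float = 0.0) -> float:
    """T_rtt = (K1 + K2) * T_slot + T_proc, as in the worked example."""
    return (k1 + k2) * slot_duration_ms(mu) + t_proc_ms


def processes_needed(rtt_ms: float, mu: int) -> int:
    """Minimum process count to send one new TB per slot without stalling."""
    return math.ceil(rtt_ms / slot_duration_ms(mu))


def worst_case_latency_ms(max_rounds: int, rtt_ms: float) -> float:
    """Latency bound when every one of the L rounds is used."""
    return max_rounds * rtt_ms


if __name__ == "__main__":
    rtt = harq_rtt_ms(k1=2, k2=2, mu=1)
    print(rtt)                             # 2.0 ms
    print(processes_needed(rtt, mu=1))     # 4
    print(worst_case_latency_ms(4, rtt))   # 8.0 ms
```

Passing a non-zero `t_proc_ms` shows how quickly processing overhead pushes the required process count above 4.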

ARQ-DMT gain at $r = 2$

At long-term effective multiplexing gain $r = 2$ on $2 \times 2$ : $d^{*}(r/L) = d^{*}(0.5) = 2.5$ ; $d_\mathrm{ARQ} (2, 4) = 4 \cdot 2.5 = 10$ . Compared to $d^{*}(2) = 0$ (one-shot at $r_\mathrm{max}$ , literally no reliability at all), this is a qualitative jump: from "barely works" to "works with reliability $\text{SNR}^{-10}$ ".
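The numbers in this step follow from the piecewise-linear DMT curve through the points $(k, (n_t - k)(n_r - k))$. A minimal sketch, assuming the product form $d_\mathrm{ARQ}(r, L) = L \cdot d^{*}(r/L)$ used in the example (independent fading across rounds); the function names are illustrative.

```python
import math


def dmt_optimal(r: float, n_t: int, n_r: int) -> float:
    """Optimal static DMT d*(r): linear interpolation between integer corner points."""
    m = min(n_t, n_r)
    if r < 0 or r > m:
        raise ValueError("multiplexing gain must lie in [0, min(n_t, n_r)]")
    k = min(int(math.floor(r)), m - 1)           # left corner of the segment
    d_left = (n_t - k) * (n_r - k)
    d_right = (n_t - k - 1) * (n_r - k - 1)
    return d_left + (r - k) * (d_right - d_left)


def arq_dmt(r: float, max_rounds: int, n_t: int, n_r: int) -> float:
    """ARQ diversity in the product form L * d*(r / L) assumed in the example."""
    return max_rounds * dmt_optimal(r / max_rounds, n_t, n_r)


if __name__ == "__main__":
    print(dmt_optimal(0.5, 2, 2))   # 2.5
    print(arq_dmt(2, 4, 2, 2))      # 10.0
    print(dmt_optimal(2, 2, 2))     # 0.0
```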

Common Mistake: ACK/NACK Is Not Error-Free

Mistake:

Treating the ACK/NACK feedback in 5G NR as the noiseless one-bit channel assumed in the ARQ-DMT theorem. In reality, the PUCCH (physical uplink control channel) carries ACK/NACK at a target miss-detection rate of $\sim 10^{-3}$ to $10^{-4}$ .

Correction:

The ARQ-DMT theorem assumes a zero-delay, zero-error ACK/NACK feedback link. In practice, PUCCH HARQ feedback has non-trivial error modes:

NACK-to-ACK (more serious): the UE NACKed but the gNB decoded it as an ACK. The gNB assumes success and does not retransmit; the TB is lost. This is the "silent failure" mode and is typically limited to $\le 10^{-4}$ .
ACK-to-NACK: the UE ACKed but the gNB decoded it as a NACK. The gNB retransmits unnecessarily; no loss, just throughput penalty. Typically $\le 10^{-3}$ .
DTX-to-NACK: no feedback received (UE didn't transmit anything); the gNB typically treats as NACK and retransmits.

Silent-failure rate sets a hard floor on the achievable BLER: even with infinite ARQ rounds, the end-to-end error rate is bounded below by the NACK-to-ACK rate. For URLLC's $10^{-5}$ target, the PUCCH reliability must exceed the data-channel reliability — hence NR's dedicated URLLC PUCCH format with Reed-Muller encoding and repetition.
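The floor can be illustrated with a simple model, which is our assumption rather than anything in the specification: round $\ell$ fails with conditional probability $q_\ell$ given that all earlier rounds failed, each NACK is misread as an ACK independently with probability $p$, and the TB is lost either on a misread NACK or when the last round fails.

```python
from typing import Sequence


def residual_loss_probability(fail_probs: Sequence[float],
                              p_nack_to_ack: float) -> float:
    """TB loss probability with imperfect NACK feedback (illustrative model)."""
    loss = 0.0
    reach = 1.0  # P(all earlier rounds failed and every NACK was read correctly)
    last = len(fail_probs) - 1
    for idx, q in enumerate(fail_probs):
        fail_here = reach * q
        if idx == last:
            loss += fail_here                       # budget exhausted
        else:
            loss += fail_here * p_nack_to_ack       # silent failure
            reach = fail_here * (1.0 - p_nack_to_ack)
    return loss


if __name__ == "__main__":
    for rounds in (1, 2, 4, 8, 16):
        loss = residual_loss_probability([0.1] * rounds, 1e-4)
        print(f"L={rounds:2d}: loss ~ {loss:.3e}")
```

In this model the loss stops improving once it approaches roughly the first-round failure probability times the NACK-to-ACK rate, however large the round budget becomes.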

Historical Note: From LTE HARQ to NR HARQ: The Flexibility Leap

2018

LTE's HARQ (Rel-8, 2009) was a direct realisation of Caire-Tuninetti 2001 and El Gamal-Caire-Damen 2006: a low-rate mother code (Turbo in LTE, rather than NR's LDPC), circular-buffer rate matching, RVs 0–3, stop-and-wait with 8 parallel processes. The design was pragmatic and remained essentially unchanged through LTE-A.

NR (Rel-15, 2018) preserved the core structure but added flexibility along three axes:

Numerology-dependent slots ( $\mu = 0$ to $4$ , slots from 1 ms down to 62.5 $\mu$ s): the same HARQ machinery works across a $16\times$ range of slot durations.
Mini-slot scheduling (2–7 symbols) for URLLC: a single TB can fit in a fraction of a slot, enabling HARQ RTTs of $\sim 0.2$ ms.
Code-block-group (CBG) re-transmission: instead of retransmitting the entire TB on NACK, NR can retransmit only the failed CBGs — saving air-time on partial failures.

The net effect is that NR HARQ realises the ARQ-DMT across a much wider range of $(r, L, \mathrm{latency})$ tuples than LTE could. The original El Gamal-Caire-Damen paper is cited in many 3GPP contributions on HARQ design, a concrete example of information theory shaping a standard.

Quick Check

In 5G NR HARQ, the "silent-failure" ACK/NACK error mode is

NACK-to-ACK: the UE NACKed but the gNB decoded as ACK, leading to no retransmission and an unrecoverable TB

ACK-to-NACK: unnecessary retransmission causes throughput penalty

DTX: the UE failed to transmit anything

The gNB scheduler picks the wrong process ID

Correction:

NACK-to-ACK: the UE NACKed but the gNB decoded as ACK, leading to no retransmission and an unrecoverable TB

Yes. NACK-to-ACK is the critical error mode because the gNB discards the TB assuming success. NR caps this rate at $\le 10^{-4}$ and uses DTX-aware feedback formats to manage it.

Why This Matters: Forward Link: The Full BICM-OFDM-STBC Pipeline

This chapter sets the information-theoretic foundation for HARQ in cellular systems. Chapter 21 will put the HARQ mechanism in its full physical-layer context: BICM (Chs. 5–9), HARQ (this chapter), and OFDM-STBC (Ch. 22), composed into the transmit / receive chain of 5G NR. The ARQ-DMT tells us what the pipeline could achieve; Chapter 21 tells us what it actually does and where the gap comes from. Particular topics for Ch. 21: how rate matching interacts with MCS-adaptation at the scheduler level, how frequency-selective scheduling on an OFDM-subcarrier basis further decorrelates HARQ rounds, and how link adaptation (outer-loop link adaptation, OLLA) adjusts the MCS target based on the HARQ residual error rate.

HARQ Process

A state machine tracking the current transmission status of a single transport block in LTE/NR. Up to $N_{\rm HARQ} = 16$ processes operate in parallel per UE per direction, each carrying its own soft buffer, NDI, RV, and TBS. Enables pipelined transmission without head-of-line blocking while preserving stop-and-wait semantics per process.

Redundancy Version (RV)

An integer index $\in \{0, 1, 2, 3\}$ specifying the starting offset of the circular-buffer fragment transmitted in a given HARQ round (3GPP TS 38.212 §5.4.2). RV_0 includes the systematic bits of the LDPC codeword and is typically transmitted first; RV_2 is chosen from the far side of the buffer for maximum incremental-redundancy coverage on the second round.
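The mapping from RV index to starting offset can be sketched as follows. The numerators and denominators are taken to be those tabulated for the two LDPC base graphs in TS 38.212 §5.4.2.1 and should be verified against the specification before use; the function names are hypothetical, and the bit selection ignores filler bits for simplicity.

```python
from typing import List

# Assumed table: k0 = floor(num * N_cb / (den * Z_c)) * Z_c for each RV.
RV_NUMERATORS = {1: (0, 17, 33, 56), 2: (0, 13, 25, 43)}
RV_DENOMINATORS = {1: 66, 2: 50}


def rv_start_offset(rv: int, n_cb: int, z_c: int, base_graph: int) -> int:
    """Starting position k0 in a circular buffer of length n_cb."""
    num = RV_NUMERATORS[base_graph][rv]
    den = RV_DENOMINATORS[base_graph]
    return (num * n_cb) // (den * z_c) * z_c


def select_bit_indices(rv: int, n_cb: int, z_c: int,
                       base_graph: int, e: int) -> List[int]:
    """Indices of the e coded bits sent for this RV, wrapping around the buffer."""
    k0 = rv_start_offset(rv, n_cb, z_c, base_graph)
    return [(k0 + i) % n_cb for i in range(e)]


if __name__ == "__main__":
    z_c = 384
    n_cb = 66 * z_c  # full buffer for base graph 1
    for rv in (0, 2, 3, 1):
        print(f"RV_{rv}: k0 = {rv_start_offset(rv, n_cb, z_c, 1)}")
```

With a full base-graph-1 buffer, RV_2 starts exactly halfway around the buffer, which matches the "far side" description above.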
