Hybrid Beamforming Architectures

Why Not Fully Digital at mmWave?

In a conventional MIMO system at sub-6 GHz, every antenna element has its own dedicated RF chain (mixer, ADC/DAC, filter). At mmWave, two factors make this approach prohibitively expensive:

  1. Power consumption: A single high-speed ADC at 1+ GS/s consumes 200–500 mW. With $N_t = 256$ antennas, the ADC power alone would be roughly 51–128 W, comparable to the total base station power budget.
  2. Cost: mmWave RF chains require wideband components (mixers, filters, amplifiers) operating at 28–71 GHz, which are significantly more expensive than their sub-6 GHz counterparts.
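
The power estimate in item 1 is plain arithmetic. A minimal sketch, assuming one ADC per antenna element as the text does (the function name is illustrative, not from the source):

```python
def total_adc_power_w(num_antennas: int, per_adc_mw: float) -> float:
    """Total ADC power in watts, assuming one ADC per antenna element."""
    return num_antennas * per_adc_mw / 1000.0


low = total_adc_power_w(256, 200.0)   # 200 mW per ADC
high = total_adc_power_w(256, 500.0)  # 500 mW per ADC
print(f"ADC power for 256 antennas: {low:.1f} W to {high:.1f} W")
```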

Hybrid beamforming addresses this by splitting the precoding into a high-dimensional analog stage (implemented with phase shifters) and a low-dimensional digital stage (implemented in baseband with a small number of RF chains). This reduces the number of RF chains from $N_t$ to $N_{\text{RF}} \ll N_t$ while approaching the spectral efficiency of fully digital precoding.

The key insight is that mmWave channels are spatially sparse: propagation occurs through a small number of clusters (typically 2–5 in NLOS), so the channel matrix has low effective rank. This sparsity means that $N_{\text{RF}} \geq 2N_s$ RF chains (where $N_s$ is the number of streams) suffice to approach optimal performance.

Definition:

Hybrid Analog-Digital Precoding Architecture

In a hybrid beamforming system with $N_t$ transmit antennas, $N_{\text{RF}}$ RF chains, and $N_s$ data streams, where $N_s \leq N_{\text{RF}} \leq N_t$, the transmitted signal is:

$$\mathbf{x} = \mathbf{F}_{\text{RF}}\,\mathbf{F}_{\text{BB}}\,\mathbf{s}$$

where:

  • $\mathbf{s} \in \mathbb{C}^{N_s}$ is the symbol vector with $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \frac{P}{N_s}\mathbf{I}$,
  • $\mathbf{F}_{\text{BB}} \in \mathbb{C}^{N_{\text{RF}} \times N_s}$ is the digital (baseband) precoder,
  • $\mathbf{F}_{\text{RF}} \in \mathbb{C}^{N_t \times N_{\text{RF}}}$ is the analog (RF) precoder, implemented with phase shifters.

The unit-modulus constraint on the analog precoder requires:

$$|[\mathbf{F}_{\text{RF}}]_{i,j}| = \frac{1}{\sqrt{N_t}} \quad \forall\; i, j$$

This constraint arises because each phase shifter can only change the phase, not the amplitude, of the signal.
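
To make the definition concrete, the following NumPy sketch builds a random analog precoder that satisfies the constant-modulus constraint and forms one transmitted vector. The array sizes, the random phases, the QPSK symbols, and the scaling of the digital precoder to $\|\mathbf{F}_{\text{RF}}\mathbf{F}_{\text{BB}}\|_F^2 = N_s$ are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N_t, N_rf, N_s = 16, 4, 2  # illustrative sizes (assumed)

# Analog precoder: only the phases are free; every entry has magnitude 1/sqrt(N_t).
phases = rng.uniform(0.0, 2.0 * np.pi, size=(N_t, N_rf))
F_rf = np.exp(1j * phases) / np.sqrt(N_t)

# Digital precoder: unconstrained complex matrix, scaled so ||F_rf F_bb||_F^2 = N_s.
F_bb = rng.standard_normal((N_rf, N_s)) + 1j * rng.standard_normal((N_rf, N_s))
F_bb *= np.sqrt(N_s) / np.linalg.norm(F_rf @ F_bb, "fro")

# One symbol vector: unit-power QPSK, i.e. P / N_s = 1 in this sketch.
s = (rng.choice([-1.0, 1.0], N_s) + 1j * rng.choice([-1.0, 1.0], N_s)) / np.sqrt(2)

x = F_rf @ F_bb @ s  # transmitted signal, one complex sample per antenna
modulus_ok = bool(np.allclose(np.abs(F_rf), 1.0 / np.sqrt(N_t)))
print(x.shape, modulus_ok)
```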

Two principal architectures exist:

Fully connected: Every RF chain connects to every antenna through a dedicated phase shifter. This requires $N_t \times N_{\text{RF}}$ phase shifters but provides maximum beamforming flexibility.

Sub-connected (partially connected): Each RF chain connects to a disjoint subset of $N_t / N_{\text{RF}}$ antennas. This requires only $N_t$ phase shifters in total but restricts the analog precoder to a block-diagonal structure:

$$\mathbf{F}_{\text{RF}}^{\text{sub}} = \text{blkdiag}\!\left(\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_{N_{\text{RF}}}\right)$$

where $\mathbf{f}_i \in \mathbb{C}^{N_t/N_{\text{RF}}}$ is the analog beamforming vector for the $i$-th sub-array.
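
The block-diagonal structure and the phase-shifter counts of the two architectures can be checked with a short sketch. The helper name `sub_connected_f_rf` and the per-sub-array normalisation $1/\sqrt{N_t/N_{\text{RF}}}$ are assumptions made here for illustration; the text does not fix the scaling of $\mathbf{f}_i$.

```python
import numpy as np


def sub_connected_f_rf(phases: np.ndarray) -> np.ndarray:
    """Build a block-diagonal analog precoder.

    phases has shape (N_rf, M): row i holds the M = N_t / N_rf phase-shifter
    settings of sub-array i. Non-zero entries have magnitude 1/sqrt(M), so
    each column has unit norm (a normalisation assumed in this sketch).
    """
    n_rf, m = phases.shape
    f_rf = np.zeros((n_rf * m, n_rf), dtype=complex)
    for i in range(n_rf):
        f_rf[i * m:(i + 1) * m, i] = np.exp(1j * phases[i]) / np.sqrt(m)
    return f_rf


N_t, N_rf = 16, 4
rng = np.random.default_rng(1)
F_sub = sub_connected_f_rf(rng.uniform(0, 2 * np.pi, size=(N_rf, N_t // N_rf)))

shifters_sub = int(np.count_nonzero(F_sub))  # one phase shifter per antenna
shifters_full = N_t * N_rf                   # fully connected, for comparison
print(shifters_sub, shifters_full)
```

For these sizes the sub-connected design uses 16 phase shifters against 64 for the fully connected one.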

Hybrid Beamforming Architectures
(a) Fully connected hybrid architecture: each of the $N_\text{RF}$ RF chains connects to all $N_t$ antennas through phase shifters, requiring $N_t N_\text{RF}$ phase shifters in total. (b) Sub-connected (partially connected) architecture: each RF chain drives a disjoint sub-array of $N_t / N_\text{RF}$ elements, requiring only $N_t$ phase shifters. The sub-connected architecture trades beamforming flexibility for reduced hardware complexity and power.

Sparse mmWave Channel Model

The mmWave MIMO channel between a transmitter with $N_t$ antennas and a receiver with $N_r$ antennas is well modelled by the Saleh-Valenzuela geometric channel:

$$\mathbf{H} = \sqrt{\frac{N_t N_r}{L}} \sum_{\ell=1}^{L} \alpha_\ell\, \mathbf{a}_r(\phi_\ell^r, \theta_\ell^r)\, \mathbf{a}_t^H(\phi_\ell^t, \theta_\ell^t)$$

where $L$ is the number of scattering paths (clusters), $\alpha_\ell$ is the complex gain of the $\ell$-th path (incorporating path loss and phase), and $\mathbf{a}_t(\phi, \theta)$, $\mathbf{a}_r(\phi, \theta)$ are the transmit and receive array response vectors at azimuth $\phi$ and elevation $\theta$.

For a uniform planar array (UPA) with $W$ elements horizontally and $H$ elements vertically ($N = WH$) and half-wavelength element spacing, the array response vector is:

$$\mathbf{a}(\phi, \theta) = \frac{1}{\sqrt{N}} \left[1, e^{j\pi\sin\phi\sin\theta}, \ldots, e^{j\pi(W-1)\sin\phi\sin\theta}\right]^T \otimes \left[1, e^{j\pi\cos\theta}, \ldots, e^{j\pi(H-1)\cos\theta}\right]^T$$

At mmWave, $L$ is typically 2–5 in NLOS (and 1 dominant in LOS), making the channel matrix low-rank and spatially sparse. This is the key property exploited by hybrid beamforming.
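
The channel model can be sketched in a few lines of NumPy. The path-gain distribution (unit-variance complex Gaussian), the uniform angle ranges, and the array sizes below are assumptions chosen for illustration; only the structure of $\mathbf{H}$ and the UPA response follow the equations above.

```python
import numpy as np


def upa_response(phi: float, theta: float, w: int, h: int) -> np.ndarray:
    """Array response of a W x H UPA with half-wavelength spacing."""
    horiz = np.exp(1j * np.pi * np.arange(w) * np.sin(phi) * np.sin(theta))
    vert = np.exp(1j * np.pi * np.arange(h) * np.cos(theta))
    return np.kron(horiz, vert) / np.sqrt(w * h)


def sparse_channel(n_paths, tx_shape, rx_shape, rng):
    """Draw H = sqrt(Nt*Nr/L) * sum_l alpha_l a_r a_t^H with random angles."""
    n_t = tx_shape[0] * tx_shape[1]
    n_r = rx_shape[0] * rx_shape[1]
    H = np.zeros((n_r, n_t), dtype=complex)
    for _ in range(n_paths):
        # Assumed gain model: alpha ~ CN(0, 1).
        alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        a_t = upa_response(rng.uniform(-np.pi / 2, np.pi / 2),
                           rng.uniform(0, np.pi), *tx_shape)
        a_r = upa_response(rng.uniform(-np.pi / 2, np.pi / 2),
                           rng.uniform(0, np.pi), *rx_shape)
        H += alpha * np.outer(a_r, a_t.conj())  # rank-one term a_r a_t^H
    return np.sqrt(n_t * n_r / n_paths) * H


rng = np.random.default_rng(0)
H = sparse_channel(3, tx_shape=(8, 8), rx_shape=(4, 4), rng=rng)

# Count singular values that are not numerically zero: at most L = 3.
sv = np.linalg.svd(H, compute_uv=False)
effective_rank = int(np.sum(sv > 1e-9 * sv[0]))
print(H.shape, effective_rank)
```

Although $\mathbf{H}$ is $16 \times 64$ here, its rank cannot exceed the number of paths, which is the sparsity the hybrid architecture relies on.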

OMP-Based Hybrid Precoding (Ayach et al., 2014)

Input: Channel matrix $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$, number of RF chains $N_\text{RF}$, number of streams $N_s$, dictionary of array response vectors $\mathcal{A} = \{\mathbf{a}_t(\phi_i, \theta_j)\}$ (used below as a matrix whose columns are the dictionary vectors)
Output: $\mathbf{F}_\text{RF} \in \mathbb{C}^{N_t \times N_\text{RF}}$, $\mathbf{F}_\text{BB} \in \mathbb{C}^{N_\text{RF} \times N_s}$

1. Compute the optimal unconstrained precoder via SVD: $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$; set $\mathbf{F}_\text{opt} = \mathbf{V}_{:,1:N_s}$ (first $N_s$ right singular vectors)
2. Initialise residual: $\mathbf{F}_\text{res} \leftarrow \mathbf{F}_\text{opt}$
3. Initialise: $\mathbf{F}_\text{RF} \leftarrow [\;]$ (empty matrix)
4. for $i = 1, 2, \ldots, N_\text{RF}$ do
5.     $\boldsymbol{\psi} \leftarrow \mathcal{A}^H \mathbf{F}_\text{res}$  // Project residual onto dictionary
6.     $k^\star \leftarrow \arg\max_k \|\boldsymbol{\psi}_{k,:}\|_F^2$  // Find best-matching dictionary column
7.     $\mathbf{F}_\text{RF} \leftarrow [\mathbf{F}_\text{RF} \mid \mathbf{a}_{k^\star}]$  // Append selected column
8.     $\mathbf{F}_\text{BB} \leftarrow (\mathbf{F}_\text{RF}^H\mathbf{F}_\text{RF})^{-1}\mathbf{F}_\text{RF}^H\mathbf{F}_\text{opt}$  // Least-squares digital precoder
9.     $\mathbf{F}_\text{res} \leftarrow \frac{\mathbf{F}_\text{opt} - \mathbf{F}_\text{RF}\mathbf{F}_\text{BB}}{\|\mathbf{F}_\text{opt} - \mathbf{F}_\text{RF}\mathbf{F}_\text{BB}\|_F}$  // Update normalised residual
10. end for
11. Normalise: $\mathbf{F}_\text{BB} \leftarrow \sqrt{N_s}\,\frac{\mathbf{F}_\text{BB}}{\|\mathbf{F}_\text{RF}\mathbf{F}_\text{BB}\|_F}$  // Enforce total power constraint
12. Return $\mathbf{F}_\text{RF}$, $\mathbf{F}_\text{BB}$

Complexity: The dominant cost is the dictionary projection and search in steps 5 and 6. With a dense dictionary, forming $\boldsymbol{\psi}$ requires $\mathcal{O}(|\mathcal{A}| \cdot N_t \cdot N_s)$ complex multiplications per iteration, where $|\mathcal{A}|$ is the dictionary size (typically $N_t$ for a DFT codebook), and the row-norm search adds $\mathcal{O}(|\mathcal{A}| \cdot N_s)$. The least-squares update in step 8 costs $\mathcal{O}(N_\text{RF}^2 N_t + N_\text{RF} N_t N_s + N_\text{RF}^3)$ per iteration, of which $\mathcal{O}(N_\text{RF}^3)$ is the matrix inversion. Overall, across $N_\text{RF}$ iterations: $\mathcal{O}(N_\text{RF}\,|\mathcal{A}|\,N_t N_s + N_\text{RF}^3 N_t)$.
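
The listing above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: it uses `np.linalg.lstsq` in place of the explicit inverse in step 8 (equivalent when the selected columns are linearly independent), guards the residual normalisation against division by zero, and demonstrates the routine on a toy ULA case with a DFT dictionary and a random receive-side matrix `B`, all of which are assumptions made here.

```python
import numpy as np


def omp_hybrid_precoder(F_opt, A, n_rf):
    """Greedy OMP split of F_opt (N_t x N_s) into F_rf (N_t x n_rf) and F_bb.

    A is the dictionary as an N_t x |A| matrix whose columns are candidate
    array response vectors with constant-modulus entries.
    """
    n_s = F_opt.shape[1]
    F_rf = np.zeros((F_opt.shape[0], 0), dtype=complex)
    F_res = F_opt.copy()
    for _ in range(n_rf):
        psi = A.conj().T @ F_res                              # step 5
        k = int(np.argmax(np.sum(np.abs(psi) ** 2, axis=1)))  # step 6
        F_rf = np.hstack([F_rf, A[:, [k]]])                   # step 7
        F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)   # step 8
        residual = F_opt - F_rf @ F_bb
        # Step 9, with a floor on the norm to avoid dividing by zero
        # when F_opt is already matched exactly.
        F_res = residual / max(np.linalg.norm(residual, "fro"), 1e-12)
    F_bb = F_bb * np.sqrt(n_s) / np.linalg.norm(F_rf @ F_bb, "fro")  # step 11
    return F_rf, F_bb


N_t, N_r, N_s, N_rf = 16, 4, 2, 2
n = np.arange(N_t)
A = np.exp(2j * np.pi * np.outer(n, n) / N_t) / np.sqrt(N_t)  # DFT dictionary

rng = np.random.default_rng(0)
# Toy two-path channel whose departure directions coincide with dictionary
# columns 3 and 11; B stands in for receive responses and path gains.
B = rng.standard_normal((N_r, 2)) + 1j * rng.standard_normal((N_r, 2))
H = B @ A[:, [3, 11]].conj().T

_, _, Vh = np.linalg.svd(H)
F_opt = Vh.conj().T[:, :N_s]                                  # step 1
F_rf, F_bb = omp_hybrid_precoder(F_opt, A, N_rf)
approx_error = np.linalg.norm(F_opt - F_rf @ F_bb, "fro")
print(approx_error)
```

Because both paths lie exactly on dictionary columns and $L = N_\text{RF}$, the approximation error in this toy case is at the level of floating-point rounding. Off-grid angles or fewer RF chains leave a non-zero residual.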

Convergence and Performance of OMP Hybrid Precoding

The OMP algorithm greedily approximates the optimal unconstrained precoder $\mathbf{F}_\text{opt}$ by selecting $N_\text{RF}$ columns from a dictionary of array response vectors. Key properties:

  • Near-optimal for sparse channels: When $L \leq N_\text{RF}$ (the number of paths does not exceed the number of RF chains), OMP can recover the dominant paths exactly, achieving spectral efficiency within 1–2 dB of the fully digital baseline.
  • Graceful degradation: As $N_\text{RF}$ decreases below $2N_s$, performance degrades smoothly. The rule of thumb $N_\text{RF} \geq 2N_s$ ensures that both the real and imaginary parts of each stream can be independently controlled.
  • Dictionary design matters: The standard DFT codebook works well for ULAs, but UPAs benefit from 2D oversampled DFT dictionaries to capture elevation angles.

Hardware constraints beyond unit-modulus also affect practical implementations:

  • Finite-resolution phase shifters: Commercial phase shifters typically have 4–6 bits of resolution (16–64 discrete phases). Quantising the OMP solution to the nearest discrete phase incurs $\sim$0.5–1 dB loss at 4 bits and $\sim$0.1 dB at 6 bits.
  • Insertion loss: Each phase shifter introduces 3–6 dB insertion loss at mmWave, which must be compensated by the PA or accounted for in the link budget.
  • Switching networks: Architectures using switches instead of phase shifters (selection-based hybrid BF) further reduce power but sacrifice beamforming resolution.
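
Quantising an analog precoder to the nearest discrete phase, as described in the first item, can be sketched as below. The uniform phase grid, the helper name `quantise_phases`, and the random test precoder are assumptions for illustration. The sketch only reports the worst-case phase error, which is bounded by half a quantisation step ($\pi/2^b$ for $b$ bits), and makes no claim about the resulting dB loss.

```python
import numpy as np


def quantise_phases(F_rf: np.ndarray, bits: int) -> np.ndarray:
    """Snap each entry's phase to the nearest of 2**bits uniform levels,
    leaving its magnitude unchanged."""
    step = 2.0 * np.pi / (2 ** bits)
    q_phase = np.round(np.angle(F_rf) / step) * step
    return np.abs(F_rf) * np.exp(1j * q_phase)


rng = np.random.default_rng(0)
N_t, N_rf = 64, 4
F_rf = np.exp(1j * rng.uniform(-np.pi, np.pi, (N_t, N_rf))) / np.sqrt(N_t)

max_err = {}
for bits in (4, 6):
    F_q = quantise_phases(F_rf, bits)
    err = np.angle(F_q * F_rf.conj())  # wrapped phase error per entry
    max_err[bits] = float(np.max(np.abs(err)))
    print(bits, "bits: max |phase error| =", max_err[bits], "rad")
```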

Spectral Efficiency of Hybrid vs. Digital Beamforming

The achievable spectral efficiency with hybrid precoding $\mathbf{F} = \mathbf{F}_\text{RF}\mathbf{F}_\text{BB}$ and hybrid combining $\mathbf{W} = \mathbf{W}_\text{RF}\mathbf{W}_\text{BB}$ is:

$$R_\text{hybrid} = \log_2\det\!\left(\mathbf{I} + \frac{P}{N_s \sigma_n^2}\, \mathbf{R}_n^{-1}\,\widetilde{\mathbf{H}}\,\widetilde{\mathbf{H}}^H\right)$$

where $\widetilde{\mathbf{H}} = \mathbf{W}_\text{BB}^H \mathbf{W}_\text{RF}^H \mathbf{H} \mathbf{F}_\text{RF}\mathbf{F}_\text{BB}$ is the effective channel after hybrid precoding and combining, and $\mathbf{R}_n = \mathbf{W}_\text{BB}^H \mathbf{W}_\text{RF}^H \mathbf{W}_\text{RF}\mathbf{W}_\text{BB}$ is the effective noise covariance normalised by $\sigma_n^2$. The noise variance itself already appears in the prefactor $P/(N_s \sigma_n^2)$, so it must not be counted again inside $\mathbf{R}_n$.

The fully digital benchmark, with equal power across the $N_s$ streams, is:

$$R_\text{digital} = \sum_{i=1}^{N_s} \log_2\!\left(1 + \frac{P}{N_s \sigma_n^2}\,\sigma_i^2(\mathbf{H})\right)$$

where $\sigma_i(\mathbf{H})$ are the singular values of $\mathbf{H}$ in decreasing order.
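
Both expressions are straightforward to evaluate numerically. In the sketch below, the function names and the single linear parameter `snr` (standing for $P/\sigma_n^2$) are assumptions, `F` and `W` denote the overall precoder and combiner, and the noise covariance is formed without the $\sigma_n^2$ factor because that factor is carried by `snr`. As a sanity check, unconstrained SVD precoding and combining should reproduce the digital benchmark.

```python
import numpy as np


def rate_digital(H, n_s, snr):
    """Equal-power SVD benchmark; snr = P / sigma_n^2 (linear scale)."""
    sv = np.linalg.svd(H, compute_uv=False)[:n_s]
    return float(np.sum(np.log2(1.0 + snr / n_s * sv ** 2)))


def rate_hybrid(H, F, W, snr):
    """log2 det(I + snr/N_s * R_n^{-1} Heff Heff^H) with Heff = W^H H F."""
    n_s = F.shape[1]
    H_eff = W.conj().T @ H @ F
    R_n = W.conj().T @ W  # noise covariance up to the factor sigma_n^2
    M = np.eye(n_s) + snr / n_s * np.linalg.solve(R_n, H_eff @ H_eff.conj().T)
    _, logdet = np.linalg.slogdet(M)  # natural log of |det M|
    return float(logdet / np.log(2.0))


rng = np.random.default_rng(0)
H = rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))
U, _, Vh = np.linalg.svd(H)
N_s, snr = 2, 10.0
F = Vh.conj().T[:, :N_s]  # unconstrained precoder, ||F||_F^2 = N_s
W = U[:, :N_s]            # unconstrained combiner
r_dig = rate_digital(H, N_s, snr)
r_hyb = rate_hybrid(H, F, W, snr)
print(r_dig, r_hyb)
```

Passing a hybrid pair such as `F = F_rf @ F_bb` into `rate_hybrid` then gives the quantity compared against `rate_digital` in the simulation results discussed next.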

Simulation studies (Heath et al., 2016) consistently show that with $N_\text{RF} \geq 2N_s$, the fully connected hybrid architecture achieves within 1–3 dB of the fully digital baseline. The sub-connected architecture incurs an additional 1–2 dB penalty. As $N_\text{RF}/N_t \to 1$, hybrid performance converges to that of fully digital beamforming.

Interactive figure: Hybrid vs. Digital Beamforming Spectral Efficiency

Spectral efficiency of hybrid beamforming (both fully connected and sub-connected) against the fully digital baseline as a function of SNR, with adjustable numbers of antennas, RF chains, and users for exploring the trade-off between hardware complexity and performance. The gap between hybrid and digital narrows as $N_\text{RF}/N_s$ increases. Parameter values shown: 64, 8, 4.

Quick Check

In a fully connected hybrid beamforming architecture with $N_t = 128$ antennas and $N_\text{RF} = 8$ RF chains, how many phase shifters are required?

128 phase shifters

1024 phase shifters

8 phase shifters

256 phase shifters

Common Mistake: Dismissing Hybrid BF as "Clearly Inferior" to Fully Digital

Mistake:

Assuming that hybrid beamforming is a temporary compromise that will be replaced by fully digital as hardware improves, and therefore not worth optimising.

Correction:

Hybrid architectures will remain relevant for the foreseeable future at mmWave and sub-THz frequencies. The power consumption of ADCs scales as $2^{2\,\text{ENOB}} \times f_s$, so doubling the sampling rate doubles the power and each additional effective bit quadruples it. For 1 GHz bandwidth at 8 ENOB, each ADC consumes $\sim$500 mW. A 256-element fully digital system would need 128 W in ADCs alone. Even with Moore's-law-style improvements of $\sim 2\times$ per decade in ADC efficiency, fully digital 256-element mmWave systems remain impractical for at least another decade. Meanwhile, hybrid architectures with $N_\text{RF} \geq 2N_s$ achieve within 1–3 dB of fully digital, a modest price for an order-of-magnitude reduction in power consumption.
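
The scaling argument can be reproduced with a few lines of arithmetic. The sketch below anchors the $2^{2\,\text{ENOB}} \times f_s$ law at the text's figure of 0.5 W per ADC at 8 ENOB; equating 1 GHz of bandwidth with a 1 GS/s sampling rate and using 8 RF chains for the hybrid case are assumptions made here.

```python
def adc_power_w(enob: float, fs_hz: float, ref_w: float = 0.5,
                ref_enob: float = 8.0, ref_fs_hz: float = 1e9) -> float:
    """Scale a reference ADC power by 2^(2*ENOB) * f_s.

    Reference point: 0.5 W at 8 ENOB and 1 GS/s (the sampling rate is an
    assumption; the text quotes 1 GHz bandwidth).
    """
    return ref_w * 2.0 ** (2.0 * (enob - ref_enob)) * (fs_hz / ref_fs_hz)


per_adc = adc_power_w(8, 1e9)
digital_w = 256 * per_adc                        # one ADC per antenna
hybrid_w = 8 * per_adc                           # one ADC per RF chain
rate_factor = adc_power_w(8, 2e9) / per_adc      # doubling f_s
bit_factor = adc_power_w(9, 1e9) / per_adc       # one extra effective bit
print(digital_w, hybrid_w, rate_factor, bit_factor)
```

Under these assumptions the ADC power falls from 128 W to 4 W when moving from 256 ADCs to 8.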

Hybrid Beamforming Architecture Comparison

| Property | Fully Connected | Sub-Connected | Fully Digital |
|---|---|---|---|
| Phase shifters | $N_t \times N_\text{RF}$ | $N_t$ | 0 |
| RF chains | $N_\text{RF}$ | $N_\text{RF}$ | $N_t$ |
| Total ADCs | $N_\text{RF}$ | $N_\text{RF}$ | $N_t$ |
| Beam flexibility | Full (any direction per RF chain) | Sub-array limited | Full (any beam per element) |
| Spectral efficiency gap | 1–3 dB below digital | 2–5 dB below digital | Reference (0 dB) |
| Power (256 antennas, 8 RF chains) | $\sim$8 W | $\sim$5 W | $\sim$130 W |
| Hardware cost | Medium | Low | Very high |
| Typical 5G NR use | FR2 gNB (28 GHz) | FR2 UE | FR1 massive MIMO (sub-6 GHz) |

Engineering Note

Phase Shifter Technology and Insertion Loss at mmWave

The analog precoding stage of hybrid beamforming relies on phase shifters, whose characteristics critically affect system performance:

  • CMOS passive phase shifters: 4–6 bit resolution, 4–8 dB insertion loss at 28 GHz, 0.5–2 mW power per element. The dominant technology for consumer 5G devices.
  • SiGe active phase shifters: 5–7 bit resolution, 1–3 dB insertion loss (with amplification), 5–15 mW per element. Used in base station panels where power budget is less constrained.
  • MEMS phase shifters: Very low insertion loss ($< 1$ dB) and excellent linearity, but slow switching speed ($\sim$10 $\mu$s) limits beam tracking rate. Used in satellite and radar, not yet competitive for 5G.

The insertion loss of phase shifters is a critical design parameter: with $N_t = 256$ elements and 6 dB insertion loss per phase shifter in a fully connected architecture, the total power delivered to the antennas is reduced by 6 dB, directly impacting EIRP. This loss must be compensated by the PA, which in turn increases power consumption. Sub-connected architectures reduce the number of phase shifters per path, partially mitigating this issue.
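
In linear terms, a 6 dB insertion loss means only about a quarter of the power reaches the antennas. The short sketch below does the conversion; it considers the phase-shifter loss alone and ignores any other network losses, which is a simplification assumed here.

```python
def db_to_linear(db: float) -> float:
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


insertion_loss_db = 6.0
delivered_fraction = 1.0 / db_to_linear(insertion_loss_db)
pa_scale = db_to_linear(insertion_loss_db)  # PA output increase to compensate
print(f"fraction of power delivered: {delivered_fraction:.3f}")
print(f"PA output power needed: x{pa_scale:.2f}")
```

The PA therefore has to supply roughly four times the output power to hold EIRP constant.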

Practical Constraints

  • CMOS passive phase shifters: 4–8 dB insertion loss at 28 GHz
  • Phase quantisation with 4-bit shifters: ~0.5–1 dB array gain loss
  • Switching speed: CMOS ~1 ns, MEMS ~10 μs

Hybrid Beamforming

A beamforming architecture that splits precoding into an analog stage (phase shifters) and a digital stage (baseband processing), reducing the number of RF chains from $N_t$ to $N_\text{RF} \ll N_t$.

Related: Phase Shifter, RF Chain, Precoding in 5G NR and Wi-Fi

Unit-Modulus Constraint

The requirement that each entry of the analog precoding matrix has constant magnitude: $|[\mathbf{F}_\text{RF}]_{i,j}| = 1/\sqrt{N_t}$. This arises because phase shifters can only adjust phase, not amplitude.

Related: Hybrid Beamforming, Phase Shifter