Ferkans — Interactive Telecom Tutor

From Duality to Optimality Conditions

The KKT conditions are the "first-order necessary and sufficient conditions" for convex problems with strong duality. They unify unconstrained optimality ( $\nabla f = 0$ ) with constraint handling. The most celebrated application in wireless is water-filling: optimal power allocation across parallel channels.

Theorem: Karush–Kuhn–Tucker (KKT) Conditions

Consider a convex problem with differentiable objective $f_0$, differentiable inequality constraints $f_i(\mathbf{x}) \leq 0$ for $i = 1, \ldots, m$, and affine equality constraints $h_j(\mathbf{x}) = 0$ for $j = 1, \ldots, p$. If strong duality holds (e.g., under Slater's condition), then $\mathbf{x}^\star$ is optimal if and only if there exist $\boldsymbol{\lambda}^\star \succeq 0$ and $\boldsymbol{\nu}^\star$ such that:

Stationarity: $\nabla f_0(\mathbf{x}^\star) + \sum_{i=1}^m \lambda_i^\star \nabla f_i(\mathbf{x}^\star) + \sum_{j=1}^p \nu_j^\star \nabla h_j(\mathbf{x}^\star) = \mathbf{0}$
Primal feasibility: $f_i(\mathbf{x}^\star) \leq 0$ , $h_j(\mathbf{x}^\star) = 0$
Dual feasibility: $\lambda_i^\star \geq 0$
Complementary slackness: $\lambda_i^\star f_i(\mathbf{x}^\star) = 0$ for all $i$

Complementary slackness says: either a constraint is tight ( $f_i = 0$ ) or its multiplier is zero ( $\lambda_i = 0$ ), or both. An inactive constraint exerts no "force" on the optimal solution. This is the key to water-filling: for a channel that receives positive power, the constraint $p_i \geq 0$ is inactive and its multiplier vanishes, whereas a channel of too-poor quality sits exactly on that constraint with $p_i = 0$.
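
As a minimal numeric sketch of the four conditions, consider the toy problem of minimising $(x-2)^2$ subject to $x - b \leq 0$. This problem, the helper name `kkt_residuals`, and the values of `b` are assumptions chosen for illustration; they do not come from the theorem above.

```python
def kkt_residuals(x, lam, b):
    """KKT quantities for: minimise (x - 2)^2 subject to f1(x) = x - b <= 0."""
    grad_f0 = 2.0 * (x - 2.0)   # gradient of the objective
    grad_f1 = 1.0               # gradient of the constraint function
    f1 = x - b
    return {
        "stationarity": grad_f0 + lam * grad_f1,  # must equal 0
        "primal": f1,                             # must be <= 0
        "dual": lam,                              # must be >= 0
        "slackness": lam * f1,                    # must equal 0
    }

# Tight constraint (b = 1): the optimum sits on the boundary, multiplier is positive.
active = kkt_residuals(x=1.0, lam=2.0, b=1.0)
# Inactive constraint (b = 3): the unconstrained minimiser x = 2 is feasible, multiplier is zero.
inactive = kkt_residuals(x=2.0, lam=0.0, b=3.0)
print(active)
print(inactive)
```

In the first case the constraint is tight and carries a positive multiplier; in the second it is slack and the multiplier is zero, which is exactly the either/or stated by complementary slackness.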

Hint

Start from the fact that $\mathbf{x}^\star$ minimises the Lagrangian over $\mathbf{x}$ .

Use strong duality to show $\mathcal{L}(\mathbf{x}^\star, \boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star) = f_0(\mathbf{x}^\star)$ .

Proof

Necessity

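Since strong duality holds, $f_0(\mathbf{x}^\star) = g(\boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star) = \inf_{\mathbf{x}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star) \leq \mathcal{L}(\mathbf{x}^\star, \boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star)$ .

But $\mathcal{L}(\mathbf{x}^\star, \boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star) = f_0(\mathbf{x}^\star) + \sum_i \lambda_i^\star f_i(\mathbf{x}^\star) + \sum_j \nu_j^\star h_j(\mathbf{x}^\star) \leq f_0(\mathbf{x}^\star)$

(since $h_j(\mathbf{x}^\star) = 0$ , $\lambda_i^\star \geq 0$ and $f_i(\mathbf{x}^\star) \leq 0$ ). The chain starts and ends at $f_0(\mathbf{x}^\star)$ , so every inequality holds with equality: $\mathbf{x}^\star$ attains the infimum of the Lagrangian, and $\sum_i \lambda_i^\star f_i(\mathbf{x}^\star) = 0$ .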

Complementary slackness

The equality $\sum_i \lambda_i^\star f_i(\mathbf{x}^\star) = 0$ with $\lambda_i^\star \geq 0$ and $f_i(\mathbf{x}^\star) \leq 0$ forces each product $\lambda_i^\star f_i(\mathbf{x}^\star) = 0$ .

Stationarity

Since $\mathbf{x}^\star$ minimises the (convex) Lagrangian over $\mathbf{x}$ , the gradient must vanish at $\mathbf{x}^\star$ : $\nabla_{\mathbf{x}} \mathcal{L}(\mathbf{x}^\star, \boldsymbol{\lambda}^\star, \boldsymbol{\nu}^\star) = \mathbf{0}$ . $\blacksquare$
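
The steps above establish necessity. For the converse direction, one possible sketch runs as follows, assuming (as in the theorem) that the $f_i$ are convex and the $h_j$ are affine. The tilde notation is introduced here only for illustration. Suppose $\tilde{\mathbf{x}}, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}}$ satisfy all four conditions. Because $\tilde{\boldsymbol{\lambda}} \succeq 0$ , the Lagrangian $\mathcal{L}(\cdot, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}})$ is convex in $\mathbf{x}$ , so stationarity implies that $\tilde{\mathbf{x}}$ minimises it. Then

$\begin{aligned} g(\tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}}) &= \mathcal{L}(\tilde{\mathbf{x}}, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}}) \\ &= f_0(\tilde{\mathbf{x}}) + \sum_{i=1}^m \tilde{\lambda}_i f_i(\tilde{\mathbf{x}}) + \sum_{j=1}^p \tilde{\nu}_j h_j(\tilde{\mathbf{x}}) \\ &= f_0(\tilde{\mathbf{x}}) \end{aligned}$

where the last step uses complementary slackness and $h_j(\tilde{\mathbf{x}}) = 0$ . Since the dual function is a lower bound on the optimal value and $\tilde{\mathbf{x}}$ is feasible, $\tilde{\mathbf{x}}$ is optimal and the duality gap is zero.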

Historical Note: The KKT Conditions

1939–1951

William Karush derived these conditions in his 1939 master's thesis at the University of Chicago, but the work went largely unnoticed. Harold Kuhn and Albert Tucker independently rediscovered them in 1951. The conditions were initially named "Kuhn–Tucker"; Karush's priority was recognised only in the 1970s.

Definition: Water-Filling Problem

Consider $N$ parallel sub-channels with gains $g_1, \ldots, g_N > 0$ and noise powers $\sigma_1^2, \ldots, \sigma_N^2$ . The goal is to maximise the total rate (sum capacity):

$\begin{aligned} \text{maximise} \quad & \sum_{i=1}^N \log_2\!\left(1 + \frac{g_i p_i}{\sigma_i^2}\right) \\ \text{subject to} \quad & \sum_{i=1}^N p_i \leq P_{\text{tot}}, \quad p_i \geq 0 \end{aligned}$

where $p_i$ is the power allocated to sub-channel $i$ and $P_{\text{tot}}$ is the total power budget.
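
To make the objective and constraints concrete, the following sketch evaluates the sum rate of a candidate allocation and checks its feasibility. The function names `sum_rate` and `is_feasible` and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def sum_rate(powers, gains, noise_powers):
    """Total rate in bits/s/Hz: sum of log2(1 + g_i * p_i / sigma_i^2)."""
    return sum(math.log2(1.0 + g * p / s2)
               for p, g, s2 in zip(powers, gains, noise_powers))

def is_feasible(powers, p_tot, tol=1e-9):
    """Check p_i >= 0 for all i and sum(p_i) <= P_tot."""
    return all(p >= -tol for p in powers) and sum(powers) <= p_tot + tol

# Illustrative numbers only (not from the text).
gains = [2.0, 1.0, 0.5]
noise_powers = [1.0, 1.0, 1.0]
p_tot = 3.0

equal = [p_tot / 3] * 3      # same power on every sub-channel
skewed = [2.0, 1.0, 0.0]     # more power on the stronger sub-channels
rate_equal = sum_rate(equal, gains, noise_powers)
rate_skewed = sum_rate(skewed, gains, noise_powers)
print(is_feasible(equal, p_tot), round(rate_equal, 3))
print(is_feasible(skewed, p_tot), round(rate_skewed, 3))
```

Both allocations spend the same budget, yet the skewed one achieves a higher rate for these numbers, which hints at why the optimal allocation is generally not uniform.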

Theorem: Water-Filling Solution

The optimal power allocation for the water-filling problem is

$p_i^\star = \left(\mu - \frac{\sigma_i^2}{g_i}\right)^+$

where $(x)^+ = \max(x, 0)$ and the water level $\mu > 0$ is chosen so that $\sum_{i=1}^N p_i^\star = P_{\text{tot}}$ .

The quantity $\sigma_i^2 / g_i$ is the "noise floor" of sub-channel $i$ . The solution "pours water" up to a common level $\mu$ : channels with high noise floors may receive zero power.

Imagine sub-channels as vessels of different heights (the noise floor $\sigma_i^2 / g_i$ ). Pouring a fixed volume of water (the power budget $P_{\text{tot}}$ ) fills the vessels to a common water level $\mu$ . Tall vessels (poor channels) may not receive any water if the total budget is small.
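
The vessel picture can be turned into a short sorting-based routine. This is a sketch under stated assumptions rather than a reference implementation: the name `water_fill` is hypothetical, the input is the list of noise floors $\sigma_i^2 / g_i$ (assumed non-empty), and the example numbers are illustrative.

```python
def water_fill(noise_floors, p_tot):
    """Return (mu, powers) with powers[i] = max(mu - noise_floors[i], 0)."""
    order = sorted(noise_floors)      # lowest floor (best channel) first
    active_sum = sum(order)           # sum of floors of the channels assumed active
    mu = order[0]
    for k in range(len(order), 0, -1):
        mu = (p_tot + active_sum) / k  # level if the k best channels share the budget
        if mu > order[k - 1]:          # the worst active channel is still under water
            break
        active_sum -= order[k - 1]     # otherwise switch it off and retry
    return mu, [max(mu - n, 0.0) for n in noise_floors]

mu, powers = water_fill([1.0, 2.0, 4.0], p_tot=2.0)
print(mu, powers)
```

For these illustrative inputs the tallest vessel stays dry: the water level settles below its floor, and the whole budget is shared by the two better channels.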

Hint

Write the Lagrangian with multiplier $\mu$ for the power constraint and $\lambda_i$ for $p_i \geq 0$ .

Apply the KKT stationarity condition: $\partial \mathcal{L}/\partial p_i = 0$ .

Use complementary slackness: $\lambda_i p_i = 0$ , so $p_i > 0$ forces $\lambda_i = 0$ .

Proof

Form the Lagrangian

$\mathcal{L}(\mathbf{p}, \mu, \boldsymbol{\lambda}) = -\sum_i \log_2(1 + g_i p_i / \sigma_i^2) + \mu(\sum_i p_i - P_{\text{tot}}) - \sum_i \lambda_i p_i$

(We minimise the negative of the rate.)

KKT stationarity

$\frac{\partial \mathcal{L}}{\partial p_i} = 0 \implies \frac{g_i / \sigma_i^2}{(1 + g_i p_i / \sigma_i^2) \ln 2} = \mu - \lambda_i$

If $p_i > 0$ , then $\lambda_i = 0$ (complementary slackness), so $p_i = \frac{1}{\mu \ln 2} - \frac{\sigma_i^2}{g_i}$ .

Threshold channels

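If $\frac{1}{\mu \ln 2} \leq \frac{\sigma_i^2}{g_i}$ , then $p_i = 0$ (the channel is too noisy to deserve power): a positive $p_i$ would require $\lambda_i = 0$ and hence $p_i = \frac{1}{\mu \ln 2} - \frac{\sigma_i^2}{g_i} \leq 0$ , a contradiction. Relabelling the water level $\frac{1}{\mu \ln 2}$ as $\mu$ (so that $\mu$ now denotes the water level rather than the multiplier), we write $p_i^\star = (\mu - \sigma_i^2/g_i)^+$ , where $\mu$ is determined by the total power constraint. That constraint is tight at the optimum because the rate is strictly increasing in every $p_i$ . $\blacksquare$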
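
The derivation can be checked numerically by recovering the multipliers from a water-filling allocation and confirming dual feasibility and complementary slackness. The helper name `kkt_multipliers` and the numbers are assumptions made for this sketch; `level` denotes the water level, while the returned `mu` is the multiplier of the power constraint.

```python
import math

def kkt_multipliers(noise_floors, powers, level):
    """Recover the power-constraint multiplier and the lambda_i of a water-filling point."""
    mu = 1.0 / (level * math.log(2))   # water level = 1 / (mu ln 2)
    lambdas = []
    for n, p in zip(noise_floors, powers):
        marginal_rate = 1.0 / ((n + p) * math.log(2))  # d/dp of log2(1 + p / n)
        lambdas.append(mu - marginal_rate)             # stationarity solved for lambda_i
    return mu, lambdas

# Illustrative allocation: floors [1, 2, 4], budget 2, water level 2.5.
noise_floors = [1.0, 2.0, 4.0]
powers = [1.5, 0.5, 0.0]
mu, lambdas = kkt_multipliers(noise_floors, powers, level=2.5)
print(mu, lambdas)
```

Active channels come out with a zero multiplier, while the switched-off channel carries a strictly positive one: its marginal rate at zero power is below the "price" of power.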

Example: Water-Filling with 4 Sub-Channels

Given 4 OFDM sub-channels with noise floors $\sigma_i^2/g_i = [0.5, 1.0, 2.0, 3.5]$ and total power $P_{\text{tot}} = 4$ , find the optimal power allocation.

Solution

Try all channels active

Assume $p_i > 0$ for all $i$ . Then $\mu - 0.5 + \mu - 1.0 + \mu - 2.0 + \mu - 3.5 = 4$ , so $4\mu = 4 + 7.0 = 11$ , giving $\mu = 2.75$ .

Check positivity

$p_1 = 2.75 - 0.5 = 2.25$ , $p_2 = 2.75 - 1.0 = 1.75$ , $p_3 = 2.75 - 2.0 = 0.75$ , $p_4 = 2.75 - 3.5 = -0.75 < 0$ . Channel 4 gets negative power — contradiction.

Remove channel 4

Set $p_4 = 0$ . Repeat with 3 active channels: $3\mu = 4 + 3.5 = 7.5$ , so $\mu = 2.5$ . $p_1 = 2.0$ , $p_2 = 1.5$ , $p_3 = 0.5$ , $p_4 = 0$ . All non-negative. $\checkmark$

Compute rates

$R_i = \log_2(1 + g_i p_i / \sigma_i^2) = \log_2(\mu g_i / \sigma_i^2)$ for active channels. The total rate is $R = \sum_{i=1}^3 \log_2(2.5 / (\sigma_i^2/g_i)) = \log_2(5) + \log_2(2.5) + \log_2(1.25) = \log_2(15.625) \approx 3.97$ bits/s/Hz. $\blacksquare$
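
The procedure used in this example (assume all channels are active, drop the worst one while its power would be negative) can be sketched in a few lines. The function name `water_fill_by_elimination` is hypothetical; the inputs are the noise floors and budget of the example.

```python
import math

def water_fill_by_elimination(noise_floors, p_tot):
    """Assume every channel is active, then drop the worst one while its power is negative."""
    active = sorted(range(len(noise_floors)), key=lambda i: noise_floors[i])
    while active:
        mu = (p_tot + sum(noise_floors[i] for i in active)) / len(active)
        worst = active[-1]
        if mu - noise_floors[worst] >= 0:
            break
        active.pop()   # worst channel would get negative power: switch it off
    powers = [0.0] * len(noise_floors)
    for i in active:
        powers[i] = mu - noise_floors[i]
    return mu, powers

noise_floors = [0.5, 1.0, 2.0, 3.5]
mu, powers = water_fill_by_elimination(noise_floors, p_tot=4.0)
total_rate = sum(math.log2(1.0 + p / n) for p, n in zip(powers, noise_floors))
print(mu, powers, round(total_rate, 2))
```

For the example's data the loop removes one channel and stops at a water level of 2.5, giving a total rate of $\log_2(15.625)$ , about 3.97 bits/s/Hz.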

Water-Filling Animation

Watch water pour into vessels of different heights as the total power budget increases. Observe how channels with high noise floors receive zero power until the water level rises enough.

Water-Filling Power Allocation Sweep

Watch the water level $\mu$ rise as the total power budget increases. Channels with high noise floors receive zero power until the budget is large enough for the water to reach them.

Four sub-channels with noise floors $[0.5, 1.0, 2.0, 3.5]$ . As $P_{\text{tot}}$ increases from 0 to 6, channels activate one by one.
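
A static version of this sweep can be sketched by solving for the water level at a few budgets. The routine below uses bisection on $\mu$ ; the function name `water_level_bisect`, the iteration count, and the chosen budgets are assumptions made for illustration.

```python
def water_level_bisect(noise_floors, p_tot, iters=60):
    """Find mu with sum(max(mu - n_i, 0)) = p_tot by bisection."""
    lo, hi = min(noise_floors), max(noise_floors) + p_tot  # bracket for the level
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        used = sum(max(mid - n, 0.0) for n in noise_floors)
        if used > p_tot:
            hi = mid   # too much water: lower the level
        else:
            lo = mid   # budget not exhausted: raise the level
    return 0.5 * (lo + hi)

noise_floors = [0.5, 1.0, 2.0, 3.5]
sweep = []
for p_tot in [0.25, 1.0, 2.0, 4.0, 6.0]:
    mu = water_level_bisect(noise_floors, p_tot)
    n_active = sum(1 for n in noise_floors if mu - n > 1e-9)
    sweep.append((p_tot, round(mu, 3), n_active))
    print(p_tot, round(mu, 3), n_active)
```

With these floors, the second channel switches on once the budget exceeds 0.5 and the third once it exceeds 2.5. The fourth would need a budget above 7, so it stays off over the budgets sampled here.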

Key Takeaway

Water-filling is the prototypical example of KKT in action. Complementary slackness yields the $(x)^+$ operator: allocate resources only to channels good enough to clear the noise floor. This principle recurs in OFDMA, MIMO spatial multiplexing, and cognitive radio spectrum allocation.

⚠️Engineering Note

Water-Filling in Practice — From Theory to Standards

The elegant water-filling solution assumes perfect channel state information (CSI) at the transmitter. In real systems, several practical constraints modify the solution:

Discrete modulation: Power is not allocated continuously but mapped to discrete MCS (modulation and coding scheme) levels. In LTE/5G NR, each resource block uses one of ~30 MCS indices. The effective allocation is a quantised approximation of water-filling.
Feedback delay: CSI is outdated by the time the transmitter uses it. At vehicular speeds (120 km/h, 3.5 GHz carrier), the coherence time is $\sim 1.4$ ms — comparable to the feedback loop. Equal power allocation can outperform water-filling with stale CSI.
Per-antenna power constraints: Unlike the total power constraint in classic water-filling, practical amplifiers have per-element peak power limits. The resulting optimisation is still convex but no longer admits a closed-form solution.
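Computational cost: For $N$ sub-channels, the water-filling algorithm runs in $O(N \log N)$ (sorting-based) or $O(N)$ per iteration (bisection on $\mu$ , which needs a number of iterations that grows only logarithmically with the required precision). Both are negligible compared to channel estimation.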
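
The order of magnitude of the coherence time quoted in the feedback-delay item can be sanity-checked from the maximum Doppler shift $f_D = v f_c / c$ . The two coherence-time rules of thumb used below ( $0.423 / f_D$ and $1 / f_D$ ) are assumptions of this sketch; the note above does not say which definition it uses, and the function name `max_doppler_hz` is hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_D = v * f_c / c, with v converted from km/h to m/s."""
    return (speed_kmh / 3.6) * carrier_hz / C

f_d = max_doppler_hz(120.0, 3.5e9)
# Two common rules of thumb for coherence time (assumed, not taken from the note).
t_c_423 = 0.423 / f_d
t_c_inv = 1.0 / f_d
print(round(f_d, 1), "Hz")
print(round(t_c_423 * 1e3, 2), "ms to", round(t_c_inv * 1e3, 2), "ms")
```

Both estimates land in the low-millisecond range, so a feedback loop of comparable duration does deliver CSI that is already partly stale.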

Practical Constraints

• MCS quantisation: ~30 discrete levels in 5G NR (TS 38.214 Table 5.1.3.1-1)
• Feedback delay limits CSI freshness; equal power may be preferable at high mobility
• Per-antenna power limits require iterative solvers instead of closed-form water-filling

📋 Ref: 3GPP TS 38.214 (5G NR Physical Layer Procedures)

Why This Matters: Water-Filling in OFDM Systems

In an OFDM system with $N$ sub-carriers, each sub-carrier is a parallel sub-channel. If the transmitter knows the channel (via feedback), it applies water-filling to allocate power across sub-carriers. In 5G NR, this is approximated by adaptive modulation and coding (AMC) per resource block.

See full treatment in Transmit Diversity

Common Mistake: Equal Power Allocation Is Not Always Optimal

Mistake:

Assuming that dividing power equally across sub-channels ( $p_i = P_{\text{tot}}/N$ ) is optimal.

Correction:

Equal power is optimal only when all sub-channels have the same noise floor. In frequency-selective fading, water-filling significantly outperforms equal allocation, especially at low SNR. At very high SNR, the difference diminishes because all channels are active and the differences between noise floors become negligible relative to the water level, so the allocation is nearly uniform.
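
The size of the gap can be illustrated numerically. The sketch below compares the two allocations at a small, a moderate, and a very large budget; the noise floors are borrowed from the four-channel example, while the budgets and the helper names `water_fill` and `rate` are assumptions made for illustration.

```python
import math

def water_fill(noise_floors, p_tot):
    """Water-filling powers max(mu - n_i, 0) for a total budget p_tot."""
    order = sorted(noise_floors)
    for k in range(len(order), 0, -1):
        mu = (p_tot + sum(order[:k])) / k   # level if the k best channels are active
        if mu > order[k - 1]:
            break
    return [max(mu - n, 0.0) for n in noise_floors]

def rate(powers, noise_floors):
    """Sum rate in bits/s/Hz, with noise floors n_i = sigma_i^2 / g_i."""
    return sum(math.log2(1.0 + p / n) for p, n in zip(powers, noise_floors))

noise_floors = [0.5, 1.0, 2.0, 3.5]
gaps = {}
for p_tot in [0.4, 4.0, 400.0]:
    r_wf = rate(water_fill(noise_floors, p_tot), noise_floors)
    r_eq = rate([p_tot / len(noise_floors)] * len(noise_floors), noise_floors)
    gaps[p_tot] = (r_wf - r_eq) / r_eq     # relative gain of water-filling
    print(p_tot, round(r_wf, 3), round(r_eq, 3), f"{100 * gaps[p_tot]:.1f}%")
```

For these numbers the relative gain of water-filling shrinks steadily as the budget grows and is close to zero at the largest budget.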

Quick Check

In the KKT conditions, complementary slackness states $\lambda_i^\star f_i(\mathbf{x}^\star) = 0$ . If a constraint is strictly inactive ( $f_i(\mathbf{x}^\star) < 0$ ), what can we conclude about $\lambda_i^\star$ ?

$\lambda_i^\star$ can be any non-negative value

$\lambda_i^\star = 0$

$\lambda_i^\star > 0$

$\lambda_i^\star = f_i(\mathbf{x}^\star)$

Answer: $\lambda_i^\star = 0$

Correct. Since $f_i(\mathbf{x}^\star) < 0$ is non-zero and the product $\lambda_i^\star f_i(\mathbf{x}^\star)$ must equal zero, we must have $\lambda_i^\star = 0$ . The inactive constraint exerts no force on the solution.

Deeper Treatment in the ITA Book

Water-filling reappears as the capacity-achieving input distribution for parallel Gaussian channels in information theory (§Capacity with Diversity). The ITA book (Chapters 5–6) provides the full derivation from mutual information maximisation, including the continuous-frequency generalisation and the connection to rate-distortion theory. The MIMO book (Chapter 5) extends water-filling to the spatial domain via SVD precoding: the MIMO channel decomposes into parallel singular-value sub-channels, each receiving water-filled power.

KKT Conditions

Karush–Kuhn–Tucker conditions: necessary and sufficient conditions for optimality in convex problems with strong duality. Comprise stationarity, primal feasibility, dual feasibility, and complementary slackness.

Water-Filling

Optimal power allocation across parallel channels: $p_i^\star = (\mu - \sigma_i^2/g_i)^+$ where $\mu$ is the water level determined by the total power constraint.
