Inner Products and Norms

Why Inner Products Are Central to Wireless Communications

At its core, wireless communication is the art of extracting a desired signal from a noisy, fading environment. The inner product is the mathematical tool that quantifies similarity between two signals or vectors, and it appears almost everywhere in the physical layer:

  • Matched filtering. The optimal detector for a known waveform $s(t)$ in additive white Gaussian noise computes the inner product $\langle r, s \rangle = \int r(t)\,\overline{s(t)}\,dt$, projecting the received signal onto the transmitted waveform.

  • Beamforming. A base station with $n_t$ antennas steers energy toward a user by choosing a weight vector $\mathbf{w}$ that maximises $|\mathbf{h}^H \mathbf{w}|$, an inner product between the channel vector and the beamformer.

  • Orthogonal waveforms. OFDM, CDMA, and spatial multiplexing all rely on orthogonality ($\langle \mathbf{x}, \mathbf{y} \rangle = 0$) to separate co-existing signals without mutual interference.

  • Projections and subspace methods. Minimum-mean-square-error (MMSE) estimation, interference cancellation, and subspace-based channel estimation each reduce to an orthogonal projection, the geometric consequence of the inner product.

This section builds the precise machinery: inner products, norms, the Cauchy--Schwarz inequality, orthogonal projections, and the Gram--Schmidt procedure. Every concept will reappear throughout the book.

Definition:

Inner Product on $\mathbb{C}^n$

An inner product on $\mathbb{C}^n$ is a function $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ satisfying, for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{C}^n$ and all $\alpha, \beta \in \mathbb{C}$:

  1. Conjugate symmetry. $\langle \mathbf{x}, \mathbf{y} \rangle = \overline{\langle \mathbf{y}, \mathbf{x} \rangle}$.

  2. Linearity in the first argument. $\langle \alpha \mathbf{x} + \beta \mathbf{y}, \mathbf{z} \rangle = \alpha \langle \mathbf{x}, \mathbf{z} \rangle + \beta \langle \mathbf{y}, \mathbf{z} \rangle$.

  3. Positive definiteness. $\langle \mathbf{x}, \mathbf{x} \rangle \geq 0$, with equality if and only if $\mathbf{x} = \mathbf{0}$.

The standard (Euclidean) inner product on $\mathbb{C}^n$ is
$$\langle \mathbf{x}, \mathbf{y} \rangle \;=\; \mathbf{y}^H \mathbf{x} \;=\; \sum_{k=1}^{n} x_k\,\overline{y_k}.$$

Convention alert. Axiom 2 makes the inner product linear in the first slot and, by Axiom 1, conjugate-linear (antilinear) in the second: $\langle \mathbf{x}, \alpha \mathbf{y} \rangle = \overline{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle$. Some references (especially in mathematics) adopt the opposite convention, linear in the second argument. Throughout this book we follow the physics/engineering convention stated above, so the standard inner product reads $\mathbf{y}^H \mathbf{x}$, not $\mathbf{x}^H \mathbf{y}$. See also the common-mistake box "Which Argument Is Conjugate-Linear?" below.
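As a quick illustration of the convention, here is a minimal NumPy sketch (NumPy assumed; the vectors are arbitrary). Note that `np.vdot(a, b)` conjugates its first argument and computes $\mathbf{a}^H \mathbf{b}$, so our $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{y}^H \mathbf{x}$ corresponds to `np.vdot(y, x)`:

```python
import numpy as np

x = np.array([1 + 1j, 2 - 1j])
y = np.array([3j, 1 + 2j])

# Our convention: <x, y> = y^H x = sum_k x_k * conj(y_k).
# np.vdot(a, b) computes a^H b, hence <x, y> = np.vdot(y, x).
ip = np.vdot(y, x)

# Conjugate symmetry: <y, x> = conj(<x, y>)
assert np.isclose(np.vdot(x, y), np.conj(ip))

# Conjugate-linearity in the second slot: <x, a*y> = conj(a) * <x, y>
a = 2 - 3j
assert np.isclose(np.vdot(a * y, x), np.conj(a) * ip)
```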

Definition:

Norm Induced by an Inner Product

Given an inner product space $(\mathbb{C}^n, \langle\cdot,\cdot\rangle)$, the induced norm (or Euclidean norm) of $\mathbf{x}$ is
$$\|\mathbf{x}\| \;=\; \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} \;=\; \sqrt{\sum_{k=1}^{n} |x_k|^2}\,.$$
It satisfies the three norm axioms for all $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n$ and $\alpha \in \mathbb{C}$:

  1. Positive definiteness. $\|\mathbf{x}\| \geq 0$, with equality iff $\mathbf{x} = \mathbf{0}$.
  2. Absolute homogeneity. $\|\alpha \mathbf{x}\| = |\alpha|\,\|\mathbf{x}\|$.
  3. Triangle inequality. $\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|$.

The triangle inequality is a consequence of the Cauchy--Schwarz inequality (stated below); we prove this implication after establishing Cauchy--Schwarz.

Definition:

$\ell_p$ Norms

For $1 \leq p < \infty$ the $\ell_p$ norm of $\mathbf{x} \in \mathbb{C}^n$ is
$$\|\mathbf{x}\|_p \;=\; \Bigl(\sum_{k=1}^{n} |x_k|^p\Bigr)^{1/p}.$$
The limiting case $p \to \infty$ gives the $\ell_\infty$ (max) norm: $\|\mathbf{x}\|_\infty = \max_{1 \leq k \leq n} |x_k|$.

Important special cases:

  $p$        Name                        Formula
  $1$        Manhattan / taxicab norm    $\sum_k \lvert x_k \rvert$
  $2$        Euclidean norm              $\bigl(\sum_k \lvert x_k \rvert^2\bigr)^{1/2}$
  $\infty$   Chebyshev / max norm        $\max_k \lvert x_k \rvert$

For $0 < p < 1$ the expression above is still well-defined but is not a norm (it violates the triangle inequality); it is sometimes called a quasi-norm and appears in the sparse-signal-recovery literature.

Only $p = 2$ yields a norm that is induced by an inner product. The $\ell_1$ norm is heavily used in compressed sensing and LASSO-type regularisation for sparse channel estimation, while $\ell_\infty$ appears in per-antenna power constraints for massive MIMO precoding.
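A small sketch (NumPy assumed; the vector is arbitrary) computing the three special cases with `np.linalg.norm`, whose `ord` parameter selects $p$:

```python
import numpy as np

x = np.array([3 - 4j, 1j, -2])

l1 = np.linalg.norm(x, ord=1)         # sum of |x_k|        -> 5 + 1 + 2 = 8
l2 = np.linalg.norm(x)                # Euclidean (default) -> sqrt(25 + 1 + 4)
linf = np.linalg.norm(x, ord=np.inf)  # max |x_k|           -> 5

# Norm ordering on C^n: ||x||_inf <= ||x||_2 <= ||x||_1
assert linf <= l2 <= l1
```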

$\ell_p$ Norm Unit Balls in $\mathbb{R}^2$ and $\mathbb{R}^3$

[Interactive figure: the unit ball $\{\mathbf{x} : \|\mathbf{x}\|_p \leq 1\}$ in $\mathbb{R}^2$, with the norm order $p$ adjustable from below $1$ to large values approximating $\infty$.]

Watch how the shape of the unit ball morphs as $p$ varies. For $p < 1$ the "ball" is non-convex (star-shaped); at $p = 1$ it is a diamond; at $p = 2$ the familiar circle; and as $p \to \infty$ it approaches the square $[-1,1]^2$.

[Animated figure: the same sweep in $\mathbb{R}^3$ with a rotating camera. The unit ball transitions from a spiky star ($p < 1$) through the octahedron ($p = 1$) and the sphere ($p = 2$) to the cube ($p \to \infty$).]

Definition:

Orthogonality

Two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n$ are orthogonal, written $\mathbf{x} \perp \mathbf{y}$, if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$. A set $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is called an orthogonal set if $\langle \mathbf{v}_i, \mathbf{v}_j \rangle = 0$ for all $i \neq j$, and an orthonormal set if additionally $\|\mathbf{v}_i\| = 1$ for every $i$. Compactly, an orthonormal set satisfies $\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \delta_{ij}$ (Kronecker delta).

An orthogonal set of nonzero vectors is automatically linearly independent. Proof: suppose $\sum_i \alpha_i \mathbf{v}_i = \mathbf{0}$. Taking the inner product with $\mathbf{v}_k$ gives $\alpha_k \|\mathbf{v}_k\|^2 = 0$, so $\alpha_k = 0$ for every $k$.

Definition:

Orthogonal Complement

Let $\mathcal{S}$ be a subspace of $\mathbb{C}^n$. The orthogonal complement of $\mathcal{S}$ is
$$\mathcal{S}^\perp \;=\; \bigl\{\mathbf{x} \in \mathbb{C}^n : \langle \mathbf{x}, \mathbf{s} \rangle = 0 \;\text{for all } \mathbf{s} \in \mathcal{S}\bigr\}.$$
$\mathcal{S}^\perp$ is itself a subspace, and $\mathbb{C}^n = \mathcal{S} \oplus \mathcal{S}^\perp$ (direct sum), meaning every $\mathbf{x} \in \mathbb{C}^n$ can be written uniquely as $\mathbf{x} = \mathbf{x}_{\mathcal{S}} + \mathbf{x}_{\mathcal{S}^\perp}$ with $\mathbf{x}_{\mathcal{S}} \in \mathcal{S}$ and $\mathbf{x}_{\mathcal{S}^\perp} \in \mathcal{S}^\perp$.

Moreover, $\dim(\mathcal{S}) + \dim(\mathcal{S}^\perp) = n$ and $(\mathcal{S}^\perp)^\perp = \mathcal{S}$.

In MIMO communications the column space $\mathcal{R}(\mathbf{H})$ of the channel matrix carries the signal, and its orthogonal complement $\mathcal{R}(\mathbf{H})^\perp = \mathcal{N}(\mathbf{H}^H)$ is the "interference-free" subspace used by zero-forcing receivers.
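A short sketch of the direct-sum decomposition $\mathbf{x} = \mathbf{x}_{\mathcal{S}} + \mathbf{x}_{\mathcal{S}^\perp}$ for $\mathcal{S} = \mathcal{R}(\mathbf{H})$ (NumPy assumed; `H` is a randomly drawn illustrative channel matrix), using an orthonormal basis obtained from QR:

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))  # 4x2 channel, rank 2

Q, _ = np.linalg.qr(H)        # columns of Q: orthonormal basis of R(H)
P = Q @ Q.conj().T            # projector onto R(H)
P_perp = np.eye(4) - P        # projector onto R(H)^perp = N(H^H)

x = rng.normal(size=4) + 1j * rng.normal(size=4)
x_S, x_Sperp = P @ x, P_perp @ x

assert np.allclose(x, x_S + x_Sperp)         # unique decomposition
assert np.allclose(H.conj().T @ x_Sperp, 0)  # x_Sperp lies in N(H^H)
```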

Theorem: Cauchy--Schwarz Inequality

For any $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n$:
$$|\langle \mathbf{x}, \mathbf{y} \rangle|^2 \;\leq\; \|\mathbf{x}\|^2\,\|\mathbf{y}\|^2,$$
with equality if and only if $\mathbf{x}$ and $\mathbf{y}$ are linearly dependent, i.e. $\mathbf{x} = \alpha \mathbf{y}$ for some $\alpha \in \mathbb{C}$, or $\mathbf{y} = \mathbf{0}$.

The inner product measures the "component" of $\mathbf{x}$ along $\mathbf{y}$. Cauchy--Schwarz says this component can never exceed the full length of $\mathbf{x}$: you cannot project more of a vector onto a direction than the vector itself has. Equality holds exactly when $\mathbf{x}$ already lies entirely along $\mathbf{y}$ (or one of them is zero).

In signal-processing language: the output of a correlator, $|\langle \mathbf{r}, \mathbf{s} \rangle|$, is bounded by the product of the signal norms $\|\mathbf{r}\|\,\|\mathbf{s}\|$, and the bound is achieved when the received signal is a scaled copy of the template: the matched-filter condition.
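A quick numerical check of the bound and its equality condition (NumPy assumed; the template and received vector are randomly drawn for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.normal(size=8) + 1j * rng.normal(size=8)   # template waveform
r = rng.normal(size=8) + 1j * rng.normal(size=8)   # generic received vector

lhs = abs(np.vdot(s, r))                 # |<r, s>| = |s^H r|
bound = np.linalg.norm(r) * np.linalg.norm(s)
assert lhs <= bound + 1e-12              # Cauchy--Schwarz bound

r_matched = (0.7 - 0.2j) * s             # scaled copy of the template
lhs_eq = abs(np.vdot(s, r_matched))      # equality: matched-filter condition
assert np.isclose(lhs_eq, np.linalg.norm(r_matched) * np.linalg.norm(s))
```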

Theorem: Triangle Inequality for the Euclidean Norm

For any $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n$,
$$\|\mathbf{x} + \mathbf{y}\| \;\leq\; \|\mathbf{x}\| + \|\mathbf{y}\|.$$
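The promised implication follows by expanding the squared norm and applying Cauchy--Schwarz:
$$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + 2\,\operatorname{Re}\langle \mathbf{x}, \mathbf{y} \rangle + \|\mathbf{y}\|^2 \;\leq\; \|\mathbf{x}\|^2 + 2\,|\langle \mathbf{x}, \mathbf{y} \rangle| + \|\mathbf{y}\|^2 \;\leq\; \bigl(\|\mathbf{x}\| + \|\mathbf{y}\|\bigr)^2,$$
where the last step uses $|\langle \mathbf{x}, \mathbf{y} \rangle| \leq \|\mathbf{x}\|\,\|\mathbf{y}\|$; taking square roots gives the claim.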

Theorem: Orthogonal Projection Theorem

Let $\mathcal{S}$ be a closed subspace of $\mathbb{C}^n$ (every subspace of a finite-dimensional space is closed). For any $\mathbf{x} \in \mathbb{C}^n$, there exists a unique $\hat{\mathbf{x}} \in \mathcal{S}$ that minimises the distance from $\mathbf{x}$ to $\mathcal{S}$:
$$\hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{s} \in \mathcal{S}} \|\mathbf{x} - \mathbf{s}\|.$$
This minimiser is characterised by the orthogonality condition
$$\mathbf{x} - \hat{\mathbf{x}} \;\perp\; \mathcal{S} \qquad\Longleftrightarrow\qquad \langle \mathbf{x} - \hat{\mathbf{x}},\; \mathbf{s} \rangle = 0 \quad \forall\,\mathbf{s} \in \mathcal{S}.$$
If $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is an orthonormal basis for $\mathcal{S}$, the projection is given explicitly by
$$\hat{\mathbf{x}} \;=\; \sum_{i=1}^{k} \langle \mathbf{x}, \mathbf{u}_i \rangle\,\mathbf{u}_i \;=\; \mathbf{U}\mathbf{U}^H \mathbf{x},$$
where $\mathbf{U} = [\mathbf{u}_1 \;\cdots\; \mathbf{u}_k] \in \mathbb{C}^{n \times k}$.
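A sketch verifying both characterisations, minimality and orthogonality of the residual (NumPy assumed; the subspace and test points are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 2)) + 1j * rng.normal(size=(5, 2))
U, _ = np.linalg.qr(A)            # orthonormal basis of a 2-dim subspace S
x = rng.normal(size=5) + 1j * rng.normal(size=5)

x_hat = U @ (U.conj().T @ x)      # projection U U^H x

# Residual is orthogonal to every basis vector of S.
assert np.allclose(U.conj().T @ (x - x_hat), 0)

# No other point of S is closer: test against random candidates s = U c.
for _ in range(100):
    s = U @ (rng.normal(size=2) + 1j * rng.normal(size=2))
    assert np.linalg.norm(x - x_hat) <= np.linalg.norm(x - s) + 1e-12
```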

Theorem: Pythagorean Theorem

If $\mathbf{x} \perp \mathbf{y}$ in $\mathbb{C}^n$, then $\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$. More generally, for mutually orthogonal vectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$:
$$\Bigl\|\sum_{i=1}^{m} \mathbf{x}_i\Bigr\|^2 = \sum_{i=1}^{m} \|\mathbf{x}_i\|^2.$$

Classical Gram--Schmidt Orthogonalization

Complexity: $O(nk^2)$
Input: Linearly independent vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k \in \mathbb{C}^n$.
Output: Orthonormal vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ spanning the same subspace.
1. $\mathbf{u}_1 \leftarrow \mathbf{v}_1 / \|\mathbf{v}_1\|$
2. for $i = 2, \ldots, k$ do
3. $\quad \mathbf{w}_i \leftarrow \mathbf{v}_i - \sum_{m=1}^{i-1} \langle \mathbf{v}_i, \mathbf{u}_m \rangle\,\mathbf{u}_m$  (subtract projections onto all previous directions)
4. $\quad \mathbf{u}_i \leftarrow \mathbf{w}_i / \|\mathbf{w}_i\|$  (normalise)
5. end for
6. return $\mathbf{u}_1, \ldots, \mathbf{u}_k$

Numerical stability. Classical Gram--Schmidt (CGS) suffers from catastrophic cancellation in floating-point arithmetic: the computed vectors lose orthogonality as rounding errors accumulate. Modified Gram--Schmidt (MGS) is algebraically equivalent but numerically superior. In MGS, the projections in line 3 are subtracted one at a time, updating $\mathbf{w}_i$ after each subtraction rather than computing all projections from the original $\mathbf{v}_i$. For production code (e.g. the QR factorisation in MATLAB/NumPy), Householder reflections or Givens rotations are preferred.
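Here is a minimal MGS sketch (NumPy assumed; illustrative, not production code). Note that `np.vdot(u, w)` computes $\mathbf{u}^H \mathbf{w} = \langle \mathbf{w}, \mathbf{u} \rangle$, the projection coefficient of $\mathbf{w}$ onto $\mathbf{u}$:

```python
import numpy as np

def modified_gram_schmidt(V: np.ndarray) -> np.ndarray:
    """Orthonormalise the (linearly independent) columns of V, one direction at a time."""
    U = V.astype(complex).copy()
    k = U.shape[1]
    for i in range(k):
        U[:, i] /= np.linalg.norm(U[:, i])                 # normalise current direction
        for j in range(i + 1, k):                          # immediately remove its component
            U[:, j] -= np.vdot(U[:, i], U[:, j]) * U[:, i] # from all later vectors
    return U

# The three vectors of the worked example below, as columns.
V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]]).T
U = modified_gram_schmidt(V)
assert np.allclose(U.conj().T @ U, np.eye(3))              # orthonormal to machine precision
```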


Gram--Schmidt Orthogonalization Step by Step

[Animation: the Gram--Schmidt process builds an orthonormal basis from two vectors: normalise the first, project the second, subtract the projection, normalise. The right angle at the end confirms orthogonality.]
The key insight: each new basis vector is obtained by removing the components along all previously computed basis vectors, then normalising.

Example: Gram--Schmidt on Three Vectors in $\mathbb{C}^3$

Apply the Gram--Schmidt procedure to the vectors
$$\mathbf{v}_1 = \begin{pmatrix}1\\1\\0\end{pmatrix},\qquad \mathbf{v}_2 = \begin{pmatrix}1\\0\\1\end{pmatrix},\qquad \mathbf{v}_3 = \begin{pmatrix}0\\1\\1\end{pmatrix}$$
to obtain an orthonormal basis for $\mathbb{C}^3$.
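Following the algorithm step by step (all inner products here are real, so conjugation plays no role):
$$\mathbf{u}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\\0\end{pmatrix};$$
$$\mathbf{w}_2 = \mathbf{v}_2 - \langle \mathbf{v}_2, \mathbf{u}_1 \rangle\,\mathbf{u}_1 = \begin{pmatrix}1\\0\\1\end{pmatrix} - \frac{1}{2}\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}1/2\\-1/2\\1\end{pmatrix}, \qquad \mathbf{u}_2 = \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|} = \frac{1}{\sqrt{6}}\begin{pmatrix}1\\-1\\2\end{pmatrix};$$
$$\mathbf{w}_3 = \mathbf{v}_3 - \langle \mathbf{v}_3, \mathbf{u}_1 \rangle\,\mathbf{u}_1 - \langle \mathbf{v}_3, \mathbf{u}_2 \rangle\,\mathbf{u}_2 = \begin{pmatrix}0\\1\\1\end{pmatrix} - \frac{1}{2}\begin{pmatrix}1\\1\\0\end{pmatrix} - \frac{1}{6}\begin{pmatrix}1\\-1\\2\end{pmatrix} = \frac{2}{3}\begin{pmatrix}-1\\1\\1\end{pmatrix}, \qquad \mathbf{u}_3 = \frac{1}{\sqrt{3}}\begin{pmatrix}-1\\1\\1\end{pmatrix}.$$
A direct check confirms $\langle \mathbf{u}_i, \mathbf{u}_j \rangle = \delta_{ij}$, so $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthonormal basis.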

Historical Note: The Many Names of the Cauchy--Schwarz Inequality


Few results in mathematics have been independently discovered, and named, as often as this one.

Augustin-Louis Cauchy (1821) proved the discrete inequality for real sequences in his Cours d'analyse. Viktor Bunyakovsky (1859) extended it to integrals, leading some authors (particularly in the Russian tradition) to call it the Cauchy--Bunyakovsky inequality. Hermann Amandus Schwarz (1885) independently proved the integral version with full rigour.

The inequality is therefore variously known as:

  • Cauchy--Schwarz (most common in Western engineering literature),
  • Cauchy--Bunyakovsky--Schwarz (CBS) (common in mathematical analysis),
  • Schwarz inequality (in some functional-analysis texts).

The proof technique we presented, subtracting the projection and exploiting non-negativity of the squared norm, is essentially the one Schwarz used, and it generalises verbatim to arbitrary inner product spaces, including the $L^2$ spaces of square-integrable functions fundamental to signal processing.

Common Mistake: Which Argument Is Conjugate-Linear?

Mistake:

Writing $\langle \mathbf{x}, \alpha\mathbf{y} \rangle = \alpha \langle \mathbf{x}, \mathbf{y} \rangle$, i.e. treating the inner product as linear in both arguments. This leads to sign and phase errors in every derivation that involves complex scalars.

Correction:

Under our convention (linear in the first argument, following the notation table for this chapter):
$$\langle \mathbf{x}, \alpha\mathbf{y} \rangle = \overline{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle \qquad \text{(conjugate-linear in the second argument)}.$$
Always check which convention a reference uses before borrowing a formula. In particular, the projection formula reads
$$\hat{\mathbf{x}} = \frac{\langle \mathbf{x}, \mathbf{u}\rangle}{\langle \mathbf{u}, \mathbf{u}\rangle}\,\mathbf{u}$$
(numerator: the vector being projected goes in the first slot).

A quick sanity check: $\langle \mathbf{x}, \mathbf{x} \rangle$ must be a real non-negative number. If your calculation yields a complex value, you have mixed up the convention.

Common Mistake: Classical Gram--Schmidt Loses Orthogonality

Mistake:

Implementing Gram--Schmidt exactly as written in the Classical Gram--Schmidt algorithm above in floating-point arithmetic and trusting that the output vectors are orthogonal to machine precision.

Correction:

Classical Gram--Schmidt (CGS) is numerically unstable: rounding errors cause the computed $\mathbf{u}_i$ to drift away from orthogonality, often severely when the input vectors are nearly dependent.

Modified Gram--Schmidt (MGS) reorders the computation: instead of projecting $\mathbf{v}_i$ onto all previous $\mathbf{u}_m$ simultaneously, MGS subtracts each projection sequentially, updating the working vector after each subtraction. This yields the same result in exact arithmetic but reduces error propagation in finite precision.

For high-reliability implementations (QR decomposition, MIMO detection), prefer Householder reflections or library routines (numpy.linalg.qr, MATLAB qr), which are backward stable.

Engineering Note

Orthogonalization in Production: QR Factorization

In production signal-processing code, never implement Gram--Schmidt manually. Use QR factorization from a numerical linear algebra library:

  • Python: Q, R = numpy.linalg.qr(A) (Householder-based, backward stable)
  • MATLAB: [Q, R] = qr(A) (same algorithm)
  • C/Fortran: LAPACK dgeqrf / zgeqrf

Householder QR costs $2mn^2 - 2n^3/3$ flops for an $m \times n$ matrix, the same order as Modified Gram--Schmidt but with guaranteed backward stability. In MIMO detection, the QR decomposition of the channel matrix $\mathbf{H} = \mathbf{Q}\mathbf{R}$ enables efficient successive interference cancellation (SIC) by back-substitution on the upper-triangular $\mathbf{R}$.
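A usage sketch (NumPy assumed; the channel matrix is randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
H = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))   # MIMO channel

Q, R = np.linalg.qr(H)        # Householder-based, backward stable

assert np.allclose(Q.conj().T @ Q, np.eye(4))   # Q has orthonormal columns
assert np.allclose(Q @ R, H)                    # exact factorisation
assert np.allclose(R, np.triu(R))               # R is upper-triangular

# y = H x + n  ->  Q^H y = R x + Q^H n: an upper-triangular system
# amenable to SIC by back-substitution, with unchanged noise statistics.
```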

Practical Constraints

  • Classical Gram--Schmidt: loss of orthogonality proportional to $\kappa^2(\mathbf{A})\,\varepsilon_{\text{mach}}$

  • Modified Gram--Schmidt: loss proportional to $\kappa(\mathbf{A})\,\varepsilon_{\text{mach}}$

  • Householder QR: backward stable; orthogonality loss bounded by $O(n\,\varepsilon_{\text{mach}})$ regardless of $\kappa(\mathbf{A})$

Why This Matters: From Inner Products to Beamforming

When a base station with $n_t$ antennas transmits signal $s$ using beamforming vector $\mathbf{w} \in \mathbb{C}^{n_t}$, the received signal at a single-antenna user is
$$y \;=\; \mathbf{h}^H \mathbf{w}\,s \;+\; n,$$
where $\mathbf{h} \in \mathbb{C}^{n_t}$ is the channel vector and $n$ is additive noise.

The effective channel gain is $|\mathbf{h}^H \mathbf{w}|$, which is the modulus of an inner product. Maximising this gain subject to the unit power constraint $\|\mathbf{w}\| = 1$ is a direct application of Cauchy--Schwarz:
$$|\mathbf{h}^H \mathbf{w}| \;\leq\; \|\mathbf{h}\|\,\|\mathbf{w}\| \;=\; \|\mathbf{h}\|,$$
with equality when $\mathbf{w} = e^{j\theta}\mathbf{h}/\|\mathbf{h}\|$ for any phase $\theta$. The optimal beamformer is therefore the matched-filter (or maximum-ratio transmission) beamformer
$$\mathbf{w}^\star = \frac{\mathbf{h}}{\|\mathbf{h}\|}.$$
This result generalises to multi-user settings (zero-forcing beamforming uses projections) and to receive combining (maximum-ratio combining).
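A numerical sketch of this result (NumPy assumed; the channel is randomly drawn for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
nt = 8
h = rng.normal(size=nt) + 1j * rng.normal(size=nt)   # channel vector

w_mrt = h / np.linalg.norm(h)                        # matched-filter / MRT beamformer
gain_mrt = abs(np.vdot(h, w_mrt))                    # |h^H w| = ||h||
assert np.isclose(gain_mrt, np.linalg.norm(h))

# Any other unit-norm beamformer does no better (Cauchy--Schwarz).
for _ in range(100):
    w = rng.normal(size=nt) + 1j * rng.normal(size=nt)
    w /= np.linalg.norm(w)
    assert abs(np.vdot(h, w)) <= gain_mrt + 1e-12
```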

See full treatment in Precoding with CSIT

Why This Matters: Orthogonal Projection as MMSE Estimation

The orthogonal projection theorem (stated above) is the geometric backbone of linear minimum-mean-square-error (LMMSE) estimation.

Suppose we observe $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ and wish to estimate $\mathbf{x}$ by a linear function $\hat{\mathbf{x}} = \mathbf{W}\mathbf{y}$. Minimising the MSE $E[\|\mathbf{x} - \hat{\mathbf{x}}\|^2]$ is equivalent to requiring the estimation error $\mathbf{x} - \hat{\mathbf{x}}$ to be orthogonal (in the stochastic inner product $\langle \mathbf{a}, \mathbf{b}\rangle = E[\mathbf{b}^H \mathbf{a}]$) to the observation space spanned by $\mathbf{y}$, which is precisely the orthogonality principle. The solution is
$$\mathbf{W}_{\mathrm{MMSE}} = \mathbf{R}_{xy}\mathbf{R}_{yy}^{-1},$$
where $\mathbf{R}_{xy} = E[\mathbf{x}\mathbf{y}^H]$ and $\mathbf{R}_{yy} = E[\mathbf{y}\mathbf{y}^H]$.

Every LMMSE channel estimator and equaliser in this book traces back to this projection.
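A Monte Carlo sketch of the orthogonality principle (NumPy assumed; Gaussian signal and noise with a randomly drawn $\mathbf{H}$, so that $\mathbf{R}_{xy} = \mathbf{H}^H$ and $\mathbf{R}_{yy} = \mathbf{H}\mathbf{H}^H + \sigma^2\mathbf{I}$):

```python
import numpy as np

rng = np.random.default_rng(5)
n_rx, n_tx, sigma2, N = 4, 2, 0.1, 100_000

H = rng.normal(size=(n_rx, n_tx)) + 1j * rng.normal(size=(n_rx, n_tx))

# x ~ CN(0, I), n ~ CN(0, sigma2*I)  =>  W = R_xy R_yy^{-1} = H^H (H H^H + sigma2 I)^{-1}
W = H.conj().T @ np.linalg.inv(H @ H.conj().T + sigma2 * np.eye(n_rx))

x = (rng.normal(size=(n_tx, N)) + 1j * rng.normal(size=(n_tx, N))) / np.sqrt(2)
n = (rng.normal(size=(n_rx, N)) + 1j * rng.normal(size=(n_rx, N))) * np.sqrt(sigma2 / 2)
y = H @ x + n
e = x - W @ y                                 # estimation error

# Orthogonality principle: the sample estimate of E[e y^H] vanishes.
assert np.allclose(e @ y.conj().T / N, 0, atol=1e-2)
```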

See full treatment in Estimation Theory Fundamentals

Quick Check

Let $\mathbf{x} = \begin{pmatrix}1\\j\end{pmatrix}$ and $\mathbf{y} = \begin{pmatrix}2\\2j\end{pmatrix}$ in $\mathbb{C}^2$. Does the Cauchy--Schwarz inequality hold with equality?

Yes, because $\mathbf{x} = \tfrac{1}{2}\mathbf{y}$.

No, strict inequality holds.

It depends on the choice of inner product convention.

Cannot be determined without computing.

Quick Check

Under the convention used in this book (linear in the first argument), what is $\langle j\mathbf{x}, \mathbf{x} \rangle$ for a nonzero $\mathbf{x} \in \mathbb{C}^n$?

$j\|\mathbf{x}\|^2$

$-j\|\mathbf{x}\|^2$

$\|\mathbf{x}\|^2$

$-\|\mathbf{x}\|^2$

Inner product

A function $\langle\cdot,\cdot\rangle: V \times V \to \mathbb{C}$ satisfying conjugate symmetry, linearity in (at least) one argument, and positive definiteness. Equips a vector space with geometric notions of length, angle, and orthogonality.

Related: Norm, Inner Product on $\mathbb{C}^n$, Orthogonality

Norm

A function $\|\cdot\|: V \to [0, \infty)$ satisfying positive definiteness, absolute homogeneity, and the triangle inequality. The Euclidean norm $\|\mathbf{x}\| = \sqrt{\langle\mathbf{x}, \mathbf{x}\rangle}$ is the norm induced by the standard inner product.

Related: Inner product, Norm Induced by an Inner Product, $\ell_p$ Norms

Orthogonal projection

The unique closest point in a subspace $\mathcal{S}$ to a given vector $\mathbf{x}$. Characterised by the condition that the residual $\mathbf{x} - \hat{\mathbf{x}}$ is orthogonal to every vector in $\mathcal{S}$. Computed via $\hat{\mathbf{x}} = \mathbf{U}\mathbf{U}^H\mathbf{x}$ when $\mathbf{U}$ has orthonormal columns spanning $\mathcal{S}$.

Related: Orthogonal Projection Theorem, Inner product, Orthogonal Projection as MMSE Estimation

Key Takeaway

  • The inner product $\langle \mathbf{x}, \mathbf{y}\rangle = \mathbf{y}^H \mathbf{x}$ is the fundamental tool for measuring similarity, length, and angle in $\mathbb{C}^n$. Our convention: linear in the first argument, conjugate-linear in the second.

  • Cauchy--Schwarz bounds the inner product by the product of norms and is the workhorse inequality of linear algebra. Its equality condition ($\mathbf{x} \propto \mathbf{y}$) directly gives the matched-filter / maximum-ratio-transmission beamformer.

  • Orthogonal projection onto a subspace is the unique best approximation in the least-squares sense. The orthogonality condition (residual $\perp$ subspace) underlies MMSE estimation, interference nulling, and subspace signal processing.

  • Gram--Schmidt converts any linearly independent set into an orthonormal basis. Use the modified variant or Householder reflections in numerical implementations.

  • The $\ell_p$ norm family ($p = 1, 2, \infty$) appears throughout communications: $\ell_2$ for energy, $\ell_1$ for sparsity promotion, and $\ell_\infty$ for per-antenna power constraints.