Inner Products and Norms
Why Inner Products Are Central to Wireless Communications
At its core, wireless communication is the art of extracting a desired signal from a noisy, fading environment. The inner product is the mathematical tool that quantifies similarity between two signals or vectors, and it appears almost everywhere in the physical layer:
- Matched filtering. The optimal detector for a known waveform in additive white Gaussian noise computes the inner product $\langle y, s \rangle$ between the received signal $y$ and the template $s$, projecting the received signal onto the transmitted waveform.
- Beamforming. A base station with $M$ antennas steers energy toward a user by choosing a weight vector $\mathbf{w}$ that maximises $|\langle \mathbf{h}, \mathbf{w} \rangle|$, an inner product between the channel vector and the beamformer.
- Orthogonal waveforms. OFDM, CDMA, and spatial multiplexing all rely on orthogonality ($\langle x, y \rangle = 0$) to separate co-existing signals without mutual interference.
- Projections and subspace methods. Minimum-mean-square-error (MMSE) estimation, interference cancellation, and subspace-based channel estimation each reduce to an orthogonal projection, the geometric consequence of the inner product.
This section builds the precise machinery: inner products, norms, the Cauchy--Schwarz inequality, orthogonal projections, and the Gram--Schmidt procedure. Every concept will reappear throughout the book.
Definition: Inner Product on $\mathbb{C}^n$
An inner product on $\mathbb{C}^n$ is a function $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ satisfying, for all $x, y, z \in \mathbb{C}^n$ and all $\alpha \in \mathbb{C}$:
- Conjugate symmetry. $\langle x, y \rangle = \overline{\langle y, x \rangle}$.
- Linearity in the first argument. $\langle \alpha x + z, y \rangle = \alpha \langle x, y \rangle + \langle z, y \rangle$.
- Positive definiteness. $\langle x, x \rangle \geq 0$, with equality if and only if $x = 0$.
The standard (Euclidean) inner product on $\mathbb{C}^n$ is $$\langle x, y \rangle = \sum_{i=1}^{n} x_i \overline{y_i} = y^H x.$$
Convention alert. Axiom 2 makes the inner product linear in the first slot and, by Axiom 1, conjugate-linear (antilinear) in the second: $\langle x, \alpha y \rangle = \overline{\alpha} \, \langle x, y \rangle$. Some references (especially in mathematics) adopt the opposite convention, linear in the second argument. Throughout this book we follow the physics/engineering convention stated above, so the standard inner product reads $\sum_i x_i \overline{y_i}$, not $\sum_i \overline{x_i} \, y_i$. See also the Common Mistake box: Which Argument Is Conjugate-Linear?.
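As a concrete check of the convention, the NumPy snippet below (a minimal sketch; the vectors are arbitrary) computes the book's inner product directly. Note that `np.vdot` conjugates its *first* argument, i.e. it implements the opposite (mathematics) convention, so its arguments must be swapped to match ours.

```python
import numpy as np

x = np.array([1 + 1j, 2 - 1j])
y = np.array([3 + 0j, 1j])

# Book convention: <x, y> = sum_i x_i * conj(y_i)  (linear in the FIRST argument).
ip_book = np.sum(x * np.conj(y))

# Careful: np.vdot(a, b) computes sum_i conj(a_i) * b_i -- the opposite
# convention. Swapping the arguments recovers the book's inner product.
assert np.isclose(np.vdot(y, x), ip_book)

# Sanity check from the text: <x, x> must be real and non-negative.
assert np.isclose(np.sum(x * np.conj(x)).imag, 0.0)
print(ip_book)
```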
Definition: Norm Induced by an Inner Product
Given an inner product space $(\mathbb{C}^n, \langle \cdot, \cdot \rangle)$, the induced norm (or Euclidean norm) of $x$ is $$\|x\| = \sqrt{\langle x, x \rangle}.$$ It satisfies the three norm axioms for all $x, y \in \mathbb{C}^n$ and $\alpha \in \mathbb{C}$:
- Positive definiteness. $\|x\| \geq 0$, with equality iff $x = 0$.
- Absolute homogeneity. $\|\alpha x\| = |\alpha| \, \|x\|$.
- Triangle inequality. $\|x + y\| \leq \|x\| + \|y\|$.
The triangle inequality is a consequence of the Cauchy--Schwarz inequality (the Cauchy--Schwarz Inequality theorem below). We prove this implication after establishing Cauchy--Schwarz.
Definition: $\ell_p$ Norms
For $p \geq 1$, the $\ell_p$ norm of $x \in \mathbb{C}^n$ is $$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$ The limiting case $p \to \infty$ gives the $\ell_\infty$ (max) norm: $$\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|.$$
Important special cases:

| Name | $p$ | Formula |
|---|---|---|
| Manhattan / taxicab norm | $p = 1$ | $\lVert x \rVert_1 = \sum_i \lvert x_i \rvert$ |
| Euclidean norm | $p = 2$ | $\lVert x \rVert_2 = \sqrt{\sum_i \lvert x_i \rvert^2}$ |
| Chebyshev / max norm | $p = \infty$ | $\lVert x \rVert_\infty = \max_i \lvert x_i \rvert$ |
For $0 < p < 1$ the expression above is still well-defined but is not a norm (it violates the triangle inequality); it is sometimes called a quasi-norm and appears in sparse-signal-recovery literature.
Only $p = 2$ yields a norm that is induced by an inner product. The $\ell_1$ norm is heavily used in compressed sensing and LASSO-type regularisation for sparse channel estimation, while $\ell_\infty$ appears in per-antenna power constraints for massive MIMO precoding.
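The special cases in the table map directly onto `numpy.linalg.norm`; a small sketch with an arbitrary example vector:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])

# The three workhorse norms from the table above.
l1   = np.linalg.norm(x, ord=1)       # Manhattan: |3| + |-4| + |1| = 8
l2   = np.linalg.norm(x, ord=2)       # Euclidean: sqrt(9 + 16 + 1) = sqrt(26)
linf = np.linalg.norm(x, ord=np.inf)  # Chebyshev: max(3, 4, 1) = 4

print(l1, l2, linf)

# Standard ordering of the p-norms: ||x||_inf <= ||x||_2 <= ||x||_1.
assert linf <= l2 <= l1
```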
The $\ell_p$ Norm Unit Ball in $\mathbb{R}^2$
[Interactive figure: explore how the $\ell_p$ unit ball in $\mathbb{R}^2$ changes shape as $p$ varies from $0.5$ to $\infty$.]
Definition: Orthogonality
Two vectors $x, y \in \mathbb{C}^n$ are orthogonal, written $x \perp y$, if $\langle x, y \rangle = 0$. A set $\{u_1, \dots, u_k\}$ is called an orthogonal set if $\langle u_i, u_j \rangle = 0$ for all $i \neq j$, and an orthonormal set if additionally $\|u_i\| = 1$ for every $i$. Compactly: $\langle u_i, u_j \rangle = \delta_{ij}$.
An orthogonal set of nonzero vectors is automatically linearly independent. Proof: suppose $\sum_j c_j u_j = 0$. Taking the inner product with $u_i$ gives $c_i \|u_i\|^2 = 0$, so $c_i = 0$ for every $i$.
Definition: Orthogonal Complement
Let $\mathcal{S}$ be a subspace of $\mathbb{C}^n$. The orthogonal complement of $\mathcal{S}$ is $$\mathcal{S}^\perp = \{ x \in \mathbb{C}^n : \langle x, s \rangle = 0 \text{ for all } s \in \mathcal{S} \}.$$ $\mathcal{S}^\perp$ is itself a subspace, and $\mathbb{C}^n = \mathcal{S} \oplus \mathcal{S}^\perp$ (direct sum), meaning every $x \in \mathbb{C}^n$ can be written uniquely as $x = s + s^\perp$ with $s \in \mathcal{S}$ and $s^\perp \in \mathcal{S}^\perp$.
Moreover, $(\mathcal{S}^\perp)^\perp = \mathcal{S}$ and $\dim \mathcal{S} + \dim \mathcal{S}^\perp = n$.
In MIMO communications the column space of the channel matrix $H$ carries the signal, and its orthogonal complement is the "interference-free" subspace used by zero-forcing receivers.
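A minimal numerical sketch of this decomposition, assuming a random $4 \times 2$ channel matrix `H`: the projector onto the column space and the projector onto its orthogonal complement split any received vector into a signal-subspace part and an interference-free part.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))

# Projector onto the column space of H (signal subspace); for full column
# rank this equals H (H^H H)^{-1} H^H.
P = H @ np.linalg.pinv(H)
# Projector onto the orthogonal complement (interference-free subspace).
P_perp = np.eye(4) - P

# Every y splits uniquely as P y + P_perp y, and the parts are orthogonal.
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
s, s_perp = P @ y, P_perp @ y
assert np.allclose(s + s_perp, y)
assert np.isclose(np.vdot(s_perp, s), 0.0, atol=1e-12)
```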
Theorem: Cauchy--Schwarz Inequality
For any $x, y \in \mathbb{C}^n$: $$|\langle x, y \rangle| \leq \|x\| \, \|y\|,$$ with equality if and only if $x$ and $y$ are linearly dependent, i.e. $x = \alpha y$ for some $\alpha \in \mathbb{C}$, or $y = 0$.
The inner product measures the "component" of $x$ along $y$. Cauchy--Schwarz says this component can never exceed the full length of $x$: you cannot project more of a vector onto a direction than the vector itself has. Equality holds exactly when $x$ already lies entirely along $y$ (or one of them is zero).
In signal-processing language: the magnitude of a correlator output $|\langle y, s \rangle|$ is bounded via the energies $\|y\|^2$ and $\|s\|^2$ by $\|y\| \, \|s\|$, and the bound is achieved when the received signal is a scaled copy of the template: the matched-filter condition.
Consider what happens when $y = 0$: both sides are zero.
For $y \neq 0$, subtract from $x$ its projection onto $y$ and examine what remains.
The key idea: the residual is orthogonal to $y$, and its squared norm must be non-negative.
Step 1 -- Handle the trivial case
If $y = 0$, then both sides of the inequality equal zero, so the statement holds with equality.
For the remainder of the proof we assume $y \neq 0$, which guarantees $\langle y, y \rangle = \|y\|^2 > 0$.
Step 2 -- Construct the optimal residual
Define the scalar $$\alpha = \frac{\langle x, y \rangle}{\langle y, y \rangle}$$ and the residual vector $$r = x - \alpha y.$$ Geometrically, $\alpha y$ is the orthogonal projection of $x$ onto the line spanned by $y$, and $r$ is the component of $x$ perpendicular to $y$.
Step 3 -- Verify orthogonality of the residual
We check that $r \perp y$: $$\langle r, y \rangle = \langle x - \alpha y, y \rangle = \langle x, y \rangle - \alpha \langle y, y \rangle = \langle x, y \rangle - \frac{\langle x, y \rangle}{\langle y, y \rangle} \langle y, y \rangle = 0.$$ (Here we used linearity in the first argument.)
Step 4 -- Expand the non-negativity condition
Since $r \in \mathbb{C}^n$, positive definiteness of the inner product gives $\|r\|^2 \geq 0$. We expand: $$\|r\|^2 = \langle x - \alpha y, x - \alpha y \rangle = \langle x, x \rangle - \overline{\alpha} \langle x, y \rangle - \alpha \langle y, x \rangle + |\alpha|^2 \langle y, y \rangle.$$ Now substitute $\alpha = \langle x, y \rangle / \langle y, y \rangle$ and note that $\langle y, x \rangle = \overline{\langle x, y \rangle}$. Then: $$0 \leq \|r\|^2 = \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}.$$ (We used $\overline{\alpha} \langle x, y \rangle = \alpha \langle y, x \rangle = |\alpha|^2 \langle y, y \rangle = |\langle x, y \rangle|^2 / \|y\|^2$.)
Step 5 -- Rearrange to obtain the inequality
Multiplying both sides of $0 \leq \|x\|^2 - |\langle x, y \rangle|^2 / \|y\|^2$ by $\|y\|^2 > 0$ and taking square roots yields the Cauchy--Schwarz inequality: $$|\langle x, y \rangle| \leq \|x\| \, \|y\|.$$
Equality condition
Equality holds if and only if $\|r\|^2 = 0$, i.e. $r = x - \alpha y = 0$, which means $x = \alpha y$, so $x$ is a scalar multiple of $y$.
Conversely, if $x = \alpha y$ for some $\alpha \in \mathbb{C}$, then $|\langle x, y \rangle| = |\alpha| \, \|y\|^2 = \|x\| \, \|y\|$, confirming equality.
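A quick numerical sanity check of both the inequality and its equality condition (random vectors in NumPy; a sketch, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

ip = np.sum(x * np.conj(y))            # <x, y> in the book's convention

# The inequality holds for any pair of vectors ...
assert abs(ip) <= np.linalg.norm(x) * np.linalg.norm(y)

# ... and equality holds when x is a scalar multiple of y.
alpha = 2.0 - 1.5j
x = alpha * y
ip = np.sum(x * np.conj(y))            # = alpha * ||y||^2
assert np.isclose(abs(ip), np.linalg.norm(x) * np.linalg.norm(y))
```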
Theorem: Triangle Inequality for the Euclidean Norm
For any $x, y \in \mathbb{C}^n$, $$\|x + y\| \leq \|x\| + \|y\|.$$
Proof via Cauchy--Schwarz
Expand the squared norm: $$\|x + y\|^2 = \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2.$$ Since $\operatorname{Re} \langle x, y \rangle \leq |\langle x, y \rangle|$ and $|\langle x, y \rangle| \leq \|x\| \, \|y\|$ by Cauchy--Schwarz, $$\|x + y\|^2 \leq \|x\|^2 + 2 \|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$ Taking square roots (both sides are non-negative) completes the proof.
Theorem: Orthogonal Projection Theorem
Let $\mathcal{S}$ be a closed subspace of $\mathbb{C}^n$ (every subspace of a finite-dimensional space is closed). For any $x \in \mathbb{C}^n$, there exists a unique $\hat{x} \in \mathcal{S}$ that minimises the distance from $x$ to $\mathcal{S}$: $$\hat{x} = \arg\min_{s \in \mathcal{S}} \|x - s\|.$$ This minimiser is characterised by the orthogonality condition $$\langle x - \hat{x}, s \rangle = 0 \quad \text{for all } s \in \mathcal{S}.$$ If $\{u_1, \dots, u_k\}$ is an orthonormal basis for $\mathcal{S}$, the projection is given explicitly by $$\hat{x} = \sum_{i=1}^{k} c_i u_i, \quad \text{where } c_i = \langle x, u_i \rangle.$$
Step 1 -- Existence and the orthogonality condition
Let $\hat{x}$ be any minimiser of $\|x - s\|$ over $s \in \mathcal{S}$. (Existence is guaranteed because $\mathcal{S}$ is a finite-dimensional, hence closed, subspace.)
For any $s \in \mathcal{S}$ and any $t > 0$, the vector $\hat{x} + t s \in \mathcal{S}$. By optimality of $\hat{x}$: $$\|x - \hat{x}\|^2 \leq \|x - \hat{x} - t s\|^2 = \|x - \hat{x}\|^2 - 2 t \operatorname{Re} \langle x - \hat{x}, s \rangle + t^2 \|s\|^2.$$ Dividing by $t$ and letting $t \downarrow 0$ gives $\operatorname{Re} \langle x - \hat{x}, s \rangle \leq 0$. Repeating with $-s$ gives the reverse inequality, so $\operatorname{Re} \langle x - \hat{x}, s \rangle = 0$. Replacing $s$ by $i s$ (still in $\mathcal{S}$) shows the imaginary part also vanishes, hence $\langle x - \hat{x}, s \rangle = 0$ for all $s \in \mathcal{S}$.
Step 2 -- Uniqueness
Suppose $\hat{x}_1$ and $\hat{x}_2$ both satisfy the orthogonality condition. Then $d = \hat{x}_1 - \hat{x}_2 \in \mathcal{S}$, and $$\langle x - \hat{x}_1, d \rangle = 0, \qquad \langle x - \hat{x}_2, d \rangle = 0.$$ Subtracting: $\langle \hat{x}_2 - \hat{x}_1, d \rangle = 0$, i.e. $\|d\|^2 = 0$, so $d = 0$ and $\hat{x}_1 = \hat{x}_2$.
Step 3 -- Explicit formula via an orthonormal basis
Let $\{u_1, \dots, u_k\}$ be an orthonormal basis for $\mathcal{S}$ and write $\hat{x} = \sum_{j=1}^{k} c_j u_j$. The orthogonality condition gives $$0 = \langle x - \hat{x}, u_i \rangle = \langle x, u_i \rangle - c_i.$$ Hence $c_i = \langle x, u_i \rangle$.
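A short NumPy sketch of the explicit formula, assuming an orthonormal basis obtained from a QR factorisation of a random matrix; the final assert checks the orthogonality condition.

```python
import numpy as np

rng = np.random.default_rng(2)

# Orthonormal basis for a 2-D subspace of C^4, obtained here from QR.
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
U, _ = np.linalg.qr(A)                 # columns u_1, u_2 are orthonormal

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# Explicit formula: xhat = sum_i c_i u_i with c_i = <x, u_i> = u_i^H x.
c = U.conj().T @ x                     # coefficients c_i
xhat = U @ c                           # projection of x onto span{u_1, u_2}

# Orthogonality condition: the residual is orthogonal to every basis vector.
assert np.allclose(U.conj().T @ (x - xhat), 0.0)
```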
Theorem: Pythagorean Theorem
If $x \perp y$ in $\mathbb{C}^n$, then $$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$ More generally, for mutually orthogonal vectors $v_1, \dots, v_k$: $$\Big\| \sum_{i=1}^{k} v_i \Big\|^2 = \sum_{i=1}^{k} \|v_i\|^2.$$
Proof
Expand: $$\|x + y\|^2 = \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2 = \|x\|^2 + \|y\|^2,$$ since $x \perp y$ implies $\langle x, y \rangle = 0$. The general case follows by induction.
Classical Gram--Schmidt Orthogonalization
Complexity: $O(mn^2)$ flops for $n$ vectors in $\mathbb{C}^m$. Numerical stability: Classical Gram--Schmidt (CGS) suffers from catastrophic cancellation in floating-point arithmetic: the computed vectors lose orthogonality as rounding errors accumulate. Modified Gram--Schmidt (MGS) is algebraically identical but numerically superior. In MGS, the projections in the orthogonalisation step are subtracted one at a time, updating the working vector after each subtraction rather than computing all projections from the original input vector. For production code (e.g. QR factorisation in MATLAB/NumPy), Householder reflections or Givens rotations are preferred.
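A minimal Python sketch of both variants (the helper `gram_schmidt` is illustrative, not a library routine); the only difference is whether each projection coefficient is computed from the original column or from the updated working vector.

```python
import numpy as np

def gram_schmidt(V, modified=True):
    """Orthonormalise the columns of V (illustrative sketch, not production code)."""
    V = np.asarray(V, dtype=complex)
    m, n = V.shape
    Q = np.zeros((m, n), dtype=complex)
    for k in range(n):
        w = V[:, k].copy()
        for i in range(k):
            if modified:
                # MGS: coefficient computed from the *updated* working vector w.
                w = w - (Q[:, i].conj() @ w) * Q[:, i]
            else:
                # CGS: all coefficients computed from the *original* column V[:, k].
                w = w - (Q[:, i].conj() @ V[:, k]) * Q[:, i]
        Q[:, k] = w / np.linalg.norm(w)
    return Q

# Usage: orthonormalise three random vectors in C^4 and verify Q^H Q = I.
rng = np.random.default_rng(0)
V = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
Q = gram_schmidt(V)
assert np.allclose(Q.conj().T @ Q, np.eye(3))
```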
Gram--Schmidt Orthogonalization Step by Step
Example: Gram--Schmidt on Three Vectors in $\mathbb{R}^3$
Apply the Gram--Schmidt procedure to three linearly independent vectors $v_1, v_2, v_3$ to obtain an orthonormal basis for $\mathbb{R}^3$.
Step 1 -- First orthonormal vector
Normalise $v_1$: $$q_1 = \frac{v_1}{\|v_1\|}.$$
Step 2 -- Second orthonormal vector
Compute the projection of $v_2$ onto $q_1$: $\langle v_2, q_1 \rangle q_1$. Subtract: $$w_2 = v_2 - \langle v_2, q_1 \rangle q_1.$$ Normalise: $$q_2 = \frac{w_2}{\|w_2\|}.$$
Step 3 -- Third orthonormal vector
Compute the projections of $v_3$ onto $q_1$ and $q_2$: $\langle v_3, q_1 \rangle q_1$ and $\langle v_3, q_2 \rangle q_2$. Subtract: $$w_3 = v_3 - \langle v_3, q_1 \rangle q_1 - \langle v_3, q_2 \rangle q_2.$$ Normalise: $$q_3 = \frac{w_3}{\|w_3\|}.$$
Step 4 -- Verification
We verify orthonormality by direct computation: $\langle q_i, q_j \rangle = \delta_{ij}$ for all $i, j$. Each vector has unit norm by construction. The orthonormal basis is $\{q_1, q_2, q_3\}$.
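The same four steps in NumPy, using hypothetical stand-ins for $v_1, v_2, v_3$ (any linearly independent triple works; these are not the example's original numbers):

```python
import numpy as np

# Hypothetical stand-ins for v1, v2, v3.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
v3 = np.array([0.0, 1.0, 1.0])

q1 = v1 / np.linalg.norm(v1)                  # Step 1: normalise v1
w2 = v2 - (v2 @ q1) * q1                      # Step 2: remove the q1-component
q2 = w2 / np.linalg.norm(w2)
w3 = v3 - (v3 @ q1) * q1 - (v3 @ q2) * q2     # Step 3: remove q1- and q2-components
q3 = w3 / np.linalg.norm(w3)

Q = np.column_stack([q1, q2, q3])
print(np.round(Q.T @ Q, 10))                  # Step 4: should print the 3x3 identity
```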
Historical Note: The Many Names of the Cauchy--Schwarz Inequality
Few results in mathematics have been independently discovered, and named, as often as this one.
Augustin-Louis Cauchy (1821) proved the discrete inequality for real sequences in his Cours d'analyse. Viktor Bunyakovsky (1859) extended it to integrals, leading some authors (particularly in the Russian tradition) to call it the Cauchy--Bunyakovsky inequality. Hermann Amandus Schwarz (1885) independently proved the integral version with full rigour.
The inequality is therefore variously known as:
- Cauchy--Schwarz (most common in Western engineering literature),
- Cauchy--Bunyakovsky--Schwarz (CBS) (common in mathematical analysis),
- Schwarz inequality (in some functional-analysis texts).
The proof technique we presented β subtracting the projection and exploiting non-negativity of the squared norm β is essentially the one Schwarz used, and it generalises verbatim to arbitrary inner product spaces, including spaces of square-integrable functions fundamental to signal processing.
Common Mistake: Which Argument Is Conjugate-Linear?
Mistake:
Writing $\langle x, \alpha y \rangle = \alpha \langle x, y \rangle$, i.e. treating the inner product as linear in both arguments. This leads to sign and phase errors in every derivation that involves complex scalars.
Correction:
Under our convention (linear in the first argument, following the notation table in Notation for This Chapter): $$\langle \alpha x, y \rangle = \alpha \langle x, y \rangle, \qquad \langle x, \alpha y \rangle = \overline{\alpha} \langle x, y \rangle.$$ Always check which convention a reference uses before borrowing a formula. In particular, the projection formula becomes $$\operatorname{proj}_y(x) = \frac{\langle x, y \rangle}{\langle y, y \rangle} \, y$$ (numerator: the vector being projected goes in the first slot).
A quick sanity check: $\langle x, x \rangle$ must be a real non-negative number. If your calculation yields a complex value, you have mixed up the convention.
Common Mistake: Classical Gram--Schmidt Loses Orthogonality
Mistake:
Implementing Gram--Schmidt exactly as written in the Classical Gram--Schmidt Orthogonalization box in floating-point arithmetic and trusting that the output vectors are orthogonal to machine precision.
Correction:
Classical Gram--Schmidt (CGS) is numerically unstable: rounding errors cause the computed $q_i$ to drift away from orthogonality, often severely when the input vectors are nearly dependent.
Modified Gram--Schmidt (MGS) reorders the computation: instead of projecting onto all previous $q_i$ simultaneously, MGS subtracts each projection sequentially, updating the working vector after each subtraction. This yields the same result in exact arithmetic but reduces error propagation in finite precision.
For high-reliability implementations (QR decomposition, MIMO detection), prefer Householder reflections or library routines (`numpy.linalg.qr`, MATLAB `qr`), which are backward stable.
Orthogonalization in Production: QR Factorization
In production signal-processing code, never implement Gram--Schmidt manually. Use QR factorization from a numerical linear algebra library:
- Python: `Q, R = numpy.linalg.qr(A)` (Householder-based, backward stable)
- MATLAB: `[Q, R] = qr(A)` (same algorithm)
- C/Fortran: LAPACK `dgeqrf` / `zgeqrf`
Householder QR costs $O(mn^2)$ flops for an $m \times n$ matrix, the same order as Modified Gram--Schmidt but with guaranteed backward stability. In MIMO detection, the QR decomposition of the channel matrix enables efficient successive interference cancellation (SIC) by back-substitution on the upper-triangular factor $R$.
- Classical Gram--Schmidt: loss of orthogonality proportional to $\kappa^2(A) \, u$
- Modified Gram--Schmidt: loss proportional to $\kappa(A) \, u$
- Householder QR: backward stable; orthogonality loss bounded by $O(u)$ regardless of $\kappa(A)$

Here $\kappa(A)$ denotes the condition number of the input matrix and $u$ the unit roundoff; the experiment sketched below illustrates these rates.
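A small experiment illustrating the loss rates above, assuming an $8 \times 8$ Hilbert matrix as the ill-conditioned input (condition number roughly $10^{10}$); the naive `cgs` and `mgs` helpers are illustrative sketches, not library code.

```python
import numpy as np

def cgs(V):
    """Classical Gram--Schmidt (deliberately naive, for illustration)."""
    Q = np.zeros_like(V)
    for k in range(V.shape[1]):
        # All projection coefficients come from the original column.
        w = V[:, k] - Q[:, :k] @ (Q[:, :k].T @ V[:, k])
        Q[:, k] = w / np.linalg.norm(w)
    return Q

def mgs(V):
    """Modified Gram--Schmidt: subtract projections one at a time."""
    Q = V.astype(float)
    n = Q.shape[1]
    for k in range(n):
        Q[:, k] /= np.linalg.norm(Q[:, k])
        for j in range(k + 1, n):
            Q[:, j] -= (Q[:, k] @ Q[:, j]) * Q[:, k]
    return Q

# Hilbert matrix: a classic ill-conditioned test case.
n = 8
i, j = np.indices((n, n))
A = 1.0 / (i + j + 1.0)

for name, Q in [("CGS", cgs(A)), ("MGS", mgs(A)),
                ("Householder QR", np.linalg.qr(A)[0])]:
    loss = np.linalg.norm(Q.T @ Q - np.eye(n))
    print(f"{name:>15}: ||Q^T Q - I|| = {loss:.1e}")
```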
Why This Matters: From Inner Products to Beamforming
When a base station with $M$ antennas transmits signal $s$ using beamforming vector $\mathbf{w} \in \mathbb{C}^M$, the received signal at a single-antenna user is $$y = \mathbf{h}^H \mathbf{w} \, s + n,$$ where $\mathbf{h} \in \mathbb{C}^M$ is the channel vector and $n$ is additive noise.
The effective channel gain is $|\mathbf{h}^H \mathbf{w}|$, which is the modulus of an inner product. Maximising this gain subject to the unit power constraint $\|\mathbf{w}\| = 1$ is a direct application of Cauchy--Schwarz: $$|\mathbf{h}^H \mathbf{w}| \leq \|\mathbf{h}\| \, \|\mathbf{w}\| = \|\mathbf{h}\|,$$ with equality when $\mathbf{w} = e^{j\phi} \, \mathbf{h} / \|\mathbf{h}\|$ for any phase $\phi$. The optimal beamformer is therefore the matched filter (or maximum-ratio transmission) beamformer $$\mathbf{w}^\star = \frac{\mathbf{h}}{\|\mathbf{h}\|}.$$ This result generalises to multi-user settings (zero-forcing beamforming uses projections) and to receive combining (maximum-ratio combining).
See full treatment in Precoding with CSIT
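A minimal sketch of this result, assuming a random Rayleigh channel with $M = 8$ antennas: the MRT beamformer attains the Cauchy--Schwarz bound $\|\mathbf{h}\|$, while a random unit-norm beamformer falls short.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 8                                              # transmit antennas
h = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

# Maximum-ratio transmission: align the beamformer with the channel.
w_mrt = h / np.linalg.norm(h)                      # ||w|| = 1

# A random unit-norm beamformer for comparison.
w_rand = rng.standard_normal(M) + 1j * rng.standard_normal(M)
w_rand /= np.linalg.norm(w_rand)

gain = lambda w: abs(np.vdot(w, h))                # |h^H w| (vdot conjugates its first argument)
print(f"MRT gain   : {gain(w_mrt):.3f}  (Cauchy--Schwarz bound ||h|| = {np.linalg.norm(h):.3f})")
print(f"random gain: {gain(w_rand):.3f}")
assert np.isclose(gain(w_mrt), np.linalg.norm(h))
```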
Why This Matters: Orthogonal Projection as MMSE Estimation
The orthogonal projection theorem (Theorem: Orthogonal Projection Theorem above) is the geometric backbone of linear minimum-mean-square-error (LMMSE) estimation.
Suppose we observe $\mathbf{y}$ and wish to estimate $x$ by a linear function $\hat{x} = \mathbf{w}^H \mathbf{y}$. Minimising the MSE $\mathbb{E}[|x - \hat{x}|^2]$ is equivalent to requiring the estimation error to be orthogonal (in the stochastic inner product $\langle u, v \rangle = \mathbb{E}[u \overline{v}]$) to the observation space spanned by $\mathbf{y}$: precisely the orthogonality principle. The solution is $$\mathbf{w} = \mathbf{R}_y^{-1} \mathbf{r}_{yx},$$ where $\mathbf{R}_y = \mathbb{E}[\mathbf{y} \mathbf{y}^H]$ and $\mathbf{r}_{yx} = \mathbb{E}[\mathbf{y} \overline{x}]$.
Every LMMSE channel estimator and equaliser in this book traces back to this projection.
See full treatment in Estimation Theory Fundamentals
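A minimal Monte Carlo sketch of the orthogonality principle, assuming a toy real-valued model $\mathbf{y} = \mathbf{h} x + \mathbf{n}$ with a hypothetical $\mathbf{h}$ and noise variance $0.25$: the LMMSE error comes out uncorrelated with every observation.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200_000                                  # Monte Carlo samples

# Toy model (assumed for illustration): y = h x + n with scalar x.
h = np.array([1.0, 0.5, -0.3])
x = rng.standard_normal(N)                   # unit-variance signal
n = 0.5 * rng.standard_normal((3, N))        # noise, variance 0.25 per entry
y = np.outer(h, x) + n

# LMMSE weights from the model covariances: w = R_y^{-1} r_yx.
R_y = np.outer(h, h) + 0.25 * np.eye(3)      # E[y y^T]
r_yx = h                                     # E[y x]
w = np.linalg.solve(R_y, r_yx)

err = x - w @ y                              # estimation error x - xhat
# Orthogonality principle: the error is uncorrelated with each observation
# (up to Monte Carlo error of order 1/sqrt(N)).
print((y @ err) / N)
```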
Quick Check
Let $x \in \mathbb{C}^2$ and $y = \alpha x$ for some nonzero scalar $\alpha$. Does the Cauchy--Schwarz inequality hold with equality?
Yes, because $y$ is a scalar multiple of $x$.
No, strict inequality holds.
It depends on the choice of inner product convention.
Cannot be determined without computing.
Since $y = \alpha x$, we have $|\langle x, y \rangle| = |\alpha| \, \|x\|^2$ and $\|x\| \, \|y\| = |\alpha| \, \|x\|^2$. The vectors are linearly dependent and the equality condition is satisfied.
Quick Check
Under the convention used in this book (linear in the first argument), what is $\langle \alpha x, y \rangle$ for a nonzero scalar $\alpha \in \mathbb{C}$?
By linearity in the first argument: $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$.
Inner product
A function satisfying conjugate symmetry, linearity in (at least) one argument, and positive definiteness. Equips a vector space with geometric notions of length, angle, and orthogonality.
Related: Norm, Inner Product on $\mathbb{C}^n$, Orthogonality
Norm
A function satisfying positive definiteness, absolute homogeneity, and the triangle inequality. The Euclidean norm is the norm induced by the standard inner product.
Related: Inner product, Norm Induced by an Inner Product, $\ell_p$ Norms
Orthogonal projection
The unique closest point $\hat{x}$ in a subspace $\mathcal{S}$ to a given vector $x$. Characterised by the condition that the residual $x - \hat{x}$ is orthogonal to every vector in $\mathcal{S}$. Computed via $\hat{x} = U U^H x$ when $U$ has orthonormal columns spanning $\mathcal{S}$.
Related: Orthogonal Projection Theorem, Inner product, Orthogonal Projection as MMSE Estimation
Key Takeaway
- The inner product is the fundamental tool for measuring similarity, length, and angle in $\mathbb{C}^n$. Our convention: linear in the first argument, conjugate-linear in the second.
- Cauchy--Schwarz, $|\langle x, y \rangle| \leq \|x\| \, \|y\|$, bounds the inner product by the product of norms and is the workhorse inequality of linear algebra. Its equality condition ($x = \alpha y$) directly gives the matched-filter / maximum-ratio-transmission beamformer.
- Orthogonal projection onto a subspace is the unique best approximation in the least-squares sense. The orthogonality condition (residual $\perp$ subspace) underlies MMSE estimation, interference nulling, and subspace signal processing.
- Gram--Schmidt converts any linearly independent set into an orthonormal basis. Use the modified variant or Householder reflections in numerical implementations.
- The $\ell_p$ norm family ($1 \leq p \leq \infty$) appears throughout communications: $\ell_2$ for energy, $\ell_1$ for sparsity promotion, and $\ell_\infty$ for per-antenna power constraints.