Singular Value Decomposition
Why SVD Is the Backbone of MIMO
If you remember only one matrix decomposition from this entire textbook, let it be the singular value decomposition (SVD). Every major result in multi-antenna wireless communications either uses SVD directly or is a consequence of it:
- Channel decomposition. The SVD of the channel matrix converts a coupled MIMO channel into parallel, independent scalar sub-channels. The singular values are the gains of these sub-channels.
- Capacity and water-filling. The MIMO capacity formula is expressed entirely in terms of singular values. Power allocation (water-filling) assigns power to sub-channels according to their singular values.
- Beamforming. The optimal transmit beamformer is the right singular vector $\mathbf{v}_1$ corresponding to the largest singular value $\sigma_1$; the optimal receive combiner is the left singular vector $\mathbf{u}_1$.
- Low-rank channel models. Real-world channels often have a small number of dominant scattering clusters, leading to a channel matrix with rapidly decaying singular values. The Eckart--Young theorem tells us the best rank-$k$ approximation in any unitarily invariant norm.
- Condition number. The ratio $\kappa = \sigma_1/\sigma_r$ determines how sensitive the channel inversion (zero-forcing) is to noise. An ill-conditioned channel ($\kappa \gg 1$) means some sub-channels are nearly unusable.
Unlike eigendecomposition, the SVD applies to any matrix — square or rectangular, Hermitian or not. It is the universal tool of linear algebra, and mastering it is prerequisite to everything that follows in this book.
Definition: Singular Values and Singular Vectors
Let $\mathbf{A} \in \mathbb{C}^{m \times n}$ with $r = \operatorname{rank}(\mathbf{A})$.
The singular values of $\mathbf{A}$ are the non-negative real numbers $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ defined as the positive square roots of the nonzero eigenvalues of the Hermitian positive semidefinite matrix $\mathbf{A}^H\mathbf{A} \in \mathbb{C}^{n \times n}$. That is, if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0$ are the nonzero eigenvalues of $\mathbf{A}^H\mathbf{A}$, then $\sigma_i = \sqrt{\lambda_i}$ for $i = 1, \dots, r$.
The right singular vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ of $\mathbf{A}$ are the orthonormal eigenvectors of $\mathbf{A}^H\mathbf{A}$, ordered so that $\mathbf{A}^H\mathbf{A}\mathbf{v}_i = \lambda_i\mathbf{v}_i$ with $\lambda_i > 0$ for $i \le r$, and $\lambda_i = 0$ for $i > r$.
The left singular vectors are defined by $\mathbf{u}_i = \frac{1}{\sigma_i}\mathbf{A}\mathbf{v}_i$ for $i = 1, \dots, r$. The remaining left singular vectors $\mathbf{u}_{r+1}, \dots, \mathbf{u}_m$ are any orthonormal basis of $\mathcal{R}(\mathbf{A})^\perp = \mathcal{N}(\mathbf{A}^H)$.
The singular values are intrinsic to $\mathbf{A}$: they do not depend on the choice of bases. The singular vectors, however, are not unique when singular values are repeated (one may choose any orthonormal basis within the corresponding subspace), and each singular vector pair is determined only up to a common unit-modulus scalar $e^{j\theta}$.
Theorem: SVD Existence Theorem
Every matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ can be decomposed as $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$, where $\mathbf{U} \in \mathbb{C}^{m \times m}$ is unitary, $\mathbf{V} \in \mathbb{C}^{n \times n}$ is unitary, and $\mathbf{\Sigma} \in \mathbb{R}^{m \times n}$ is a (possibly rectangular) diagonal matrix with non-negative entries on the main diagonal, ordered as $\Sigma_{11} \ge \Sigma_{22} \ge \cdots \ge \Sigma_{qq} \ge 0$ with $q = \min(m, n)$. The diagonal entries $\sigma_i = \Sigma_{ii}$ are the singular values of $\mathbf{A}$, the columns of $\mathbf{U}$ are the left singular vectors, and the columns of $\mathbf{V}$ are the right singular vectors.
The SVD says that every linear map, no matter how complicated, is secretly just a rotation, followed by axis-aligned scaling, followed by another rotation. The right singular vectors $\mathbf{v}_i$ are the "input directions" that the matrix treats independently; the left singular vectors $\mathbf{u}_i$ are the corresponding "output directions"; and the singular values $\sigma_i$ are the gains along each direction. For a MIMO channel, this means: transmit along $\mathbf{v}_i$, receive along $\mathbf{u}_i$, and the effective scalar gain is $\sigma_i$.
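To make the theorem concrete, here is a minimal numerical check, assuming NumPy; the random complex matrix and its dimensions are purely illustrative.

```python
import numpy as np

# A minimal sketch of the SVD existence theorem on a random complex,
# channel-like matrix: A = U @ Sigma @ V^H with unitary U, V.
rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)            # full SVD: U is m x m, Vh is n x n
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)             # rectangular "diagonal" matrix

assert np.allclose(A, U @ Sigma @ Vh)              # A = U Sigma V^H
assert np.allclose(U.conj().T @ U, np.eye(m))      # U unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(n))    # V unitary
print("singular values (sorted, non-negative):", s)
```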
Proof outline:
Start from the eigendecomposition of the Hermitian positive semidefinite matrix $\mathbf{A}^H\mathbf{A}$.
Construct $\mathbf{V}$ as the matrix of eigenvectors of $\mathbf{A}^H\mathbf{A}$.
Define $\sigma_i = \sqrt{\lambda_i}$ and construct the first $r$ columns of $\mathbf{U}$ via $\mathbf{u}_i = \frac{1}{\sigma_i}\mathbf{A}\mathbf{v}_i$.
Show that $\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$ is orthonormal, extend to a full basis, and verify the decomposition.
Step 1: Eigendecompose $\mathbf{A}^H\mathbf{A}$
The matrix $\mathbf{A}^H\mathbf{A} \in \mathbb{C}^{n \times n}$ is Hermitian and positive semidefinite (for any $\mathbf{x} \in \mathbb{C}^n$, $\mathbf{x}^H\mathbf{A}^H\mathbf{A}\mathbf{x} = \|\mathbf{A}\mathbf{x}\|_2^2 \ge 0$). By the spectral theorem (see the Spectral Theorem for Hermitian Matrices), there exists a unitary matrix $\mathbf{V} = [\mathbf{v}_1 \;\cdots\; \mathbf{v}_n] \in \mathbb{C}^{n \times n}$ and a real diagonal matrix $\mathbf{\Lambda} = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$ such that $\mathbf{A}^H\mathbf{A} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^H$. Let $r = \operatorname{rank}(\mathbf{A})$. Then exactly $r$ eigenvalues are strictly positive: $\lambda_1 \ge \cdots \ge \lambda_r > 0$ and $\lambda_{r+1} = \cdots = \lambda_n = 0$.
Why $\operatorname{rank}(\mathbf{A}^H\mathbf{A}) = \operatorname{rank}(\mathbf{A})$: Note that $\mathcal{N}(\mathbf{A}^H\mathbf{A}) = \mathcal{N}(\mathbf{A})$, since $\mathbf{A}^H\mathbf{A}\mathbf{x} = \mathbf{0}$ implies $\|\mathbf{A}\mathbf{x}\|_2^2 = \mathbf{x}^H\mathbf{A}^H\mathbf{A}\mathbf{x} = 0$, hence $\mathbf{A}\mathbf{x} = \mathbf{0}$. The converse is trivial. By the rank--nullity theorem, $\operatorname{rank}(\mathbf{A}^H\mathbf{A}) = n - \dim\mathcal{N}(\mathbf{A}) = \operatorname{rank}(\mathbf{A}) = r$.
Step 2: Define the singular values and construct $\mathbf{U}$ (first $r$ columns)
Define the singular values $\sigma_i := \sqrt{\lambda_i}$ for $i = 1, \dots, n$ (so $\sigma_i > 0$ for $i \le r$ and $\sigma_i = 0$ for $i > r$).
For each $i = 1, \dots, r$, define $\mathbf{u}_i := \frac{1}{\sigma_i}\mathbf{A}\mathbf{v}_i \in \mathbb{C}^m$.
Claim: The set $\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$ is orthonormal.
Proof of claim: For $1 \le i, j \le r$, $\mathbf{u}_i^H\mathbf{u}_j = \frac{1}{\sigma_i\sigma_j}\,\mathbf{v}_i^H\mathbf{A}^H\mathbf{A}\mathbf{v}_j = \frac{\lambda_j}{\sigma_i\sigma_j}\,\mathbf{v}_i^H\mathbf{v}_j = \frac{\sigma_j}{\sigma_i}\,\delta_{ij} = \delta_{ij}$, where we used $\mathbf{A}^H\mathbf{A}\mathbf{v}_j = \lambda_j\mathbf{v}_j$ and the orthonormality of the $\mathbf{v}_i$.
Step 3: Extend $\mathbf{U}$ to a full unitary matrix
The vectors $\mathbf{u}_1, \dots, \mathbf{u}_r$ form an orthonormal set in $\mathbb{C}^m$. Since $r \le m$, we can extend this set to an orthonormal basis $\{\mathbf{u}_1, \dots, \mathbf{u}_m\}$ of $\mathbb{C}^m$ (for instance by Gram--Schmidt applied to any completion).
Claim: $\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$ spans $\mathcal{R}(\mathbf{A})$.
Proof: Every $\mathbf{y} \in \mathcal{R}(\mathbf{A})$ has the form $\mathbf{y} = \mathbf{A}\mathbf{x}$ for some $\mathbf{x} \in \mathbb{C}^n$. Expand $\mathbf{x} = \sum_{j=1}^{n} c_j\mathbf{v}_j$ (since the $\mathbf{v}_j$ form a basis). Then $\mathbf{y} = \mathbf{A}\mathbf{x} = \sum_{j=1}^{n} c_j\mathbf{A}\mathbf{v}_j = \sum_{j=1}^{r} c_j\sigma_j\mathbf{u}_j$, where we used $\mathbf{A}\mathbf{v}_j = \sigma_j\mathbf{u}_j$ for $j \le r$ and $\mathbf{A}\mathbf{v}_j = \mathbf{0}$ for $j > r$ (since $\|\mathbf{A}\mathbf{v}_j\|_2^2 = \lambda_j = 0$ for $j > r$). Therefore $\mathbf{y} \in \operatorname{span}\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$.
Conversely, each $\mathbf{u}_i = \frac{1}{\sigma_i}\mathbf{A}\mathbf{v}_i \in \mathcal{R}(\mathbf{A})$. Hence $\operatorname{span}\{\mathbf{u}_1, \dots, \mathbf{u}_r\} = \mathcal{R}(\mathbf{A})$.
Define the unitary matrix $\mathbf{U} := [\mathbf{u}_1 \;\cdots\; \mathbf{u}_m] \in \mathbb{C}^{m \times m}$.
Step 4: Assemble and verify the decomposition
Define the matrix $\mathbf{\Sigma} \in \mathbb{R}^{m \times n}$ with $\Sigma_{ii} = \sigma_i$ for $i = 1, \dots, \min(m, n)$ and all other entries zero, where $\sigma_i := 0$ for $i > r$.
We must verify $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$. Since $\mathbf{V}$ is unitary, it suffices to show $\mathbf{A}\mathbf{V} = \mathbf{U}\mathbf{\Sigma}$, i.e., to verify $\mathbf{A}\mathbf{v}_j = \sigma_j\mathbf{u}_j$ for every $j = 1, \dots, n$.
Case $j \le r$: By construction, $\mathbf{u}_j = \frac{1}{\sigma_j}\mathbf{A}\mathbf{v}_j$, so $\mathbf{A}\mathbf{v}_j = \sigma_j\mathbf{u}_j$.
Case $j > r$: Since $\lambda_j = 0$, we have $\|\mathbf{A}\mathbf{v}_j\|_2^2 = \mathbf{v}_j^H\mathbf{A}^H\mathbf{A}\mathbf{v}_j = \lambda_j = 0$, so $\mathbf{A}\mathbf{v}_j = \mathbf{0}$. Also $\sigma_j = 0$, so $\sigma_j\mathbf{u}_j = \mathbf{0}$. Hence $\mathbf{A}\mathbf{v}_j = \sigma_j\mathbf{u}_j$.
Therefore $\mathbf{A}\mathbf{V} = \mathbf{U}\mathbf{\Sigma}$, which gives $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$.
Outer-product form. Equivalently, $\mathbf{A} = \sum_{i=1}^{r} \sigma_i\,\mathbf{u}_i\mathbf{v}_i^H$, which expresses $\mathbf{A}$ as a sum of $r$ rank-one matrices, each weighted by the corresponding singular value.
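The construction in Steps 1--4 can be mirrored numerically. The sketch below (NumPy assumed; the dimensions and the rank tolerance are illustrative) builds the SVD from the eigendecomposition of $\mathbf{A}^H\mathbf{A}$ exactly as in the proof.

```python
import numpy as np

# Sketch of the existence proof's construction: build the SVD of A from
# the eigendecomposition of the Hermitian PSD matrix A^H A.
rng = np.random.default_rng(1)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Step 1: eigendecompose A^H A (eigh returns ascending eigenvalues).
lam, V = np.linalg.eigh(A.conj().T @ A)
idx = np.argsort(lam)[::-1]            # reorder to lambda_1 >= ... >= lambda_n
lam, V = lam[idx], V[:, idx]

# Step 2: singular values and left singular vectors u_i = A v_i / sigma_i.
sigma = np.sqrt(np.clip(lam, 0.0, None))
r = int(np.sum(sigma > 1e-12))
U_r = A @ V[:, :r] / sigma[:r]         # divide column i by sigma_i

# Steps 3-4: a random complex matrix has full rank r = n almost surely,
# so the economy-size factors already verify A = U_r diag(sigma) V^H
# (no extension of U is needed for this check).
assert np.allclose(A, U_r @ np.diag(sigma[:r]) @ V[:, :r].conj().T)
assert np.allclose(U_r.conj().T @ U_r, np.eye(r))   # {u_i} orthonormal
```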
Step 5: Uniqueness of singular values (and non-uniqueness of singular vectors)
The singular values are uniquely determined by $\mathbf{A}$, because they are the positive square roots of the eigenvalues of $\mathbf{A}^H\mathbf{A}$, which are uniquely determined.
The singular vectors, however, are generally not unique:
- Phase ambiguity. If $(\mathbf{u}_i, \mathbf{v}_i)$ is a singular vector pair, then so is $(e^{j\theta}\mathbf{u}_i, e^{j\theta}\mathbf{v}_i)$ for any $\theta \in [0, 2\pi)$, since $(e^{j\theta}\mathbf{u}_i)(e^{j\theta}\mathbf{v}_i)^H = \mathbf{u}_i\mathbf{v}_i^H$ (see the sketch following this list).
- Repeated singular values. If $\sigma_i = \sigma_{i+1}$, then any orthonormal basis of the corresponding right (resp. left) singular subspace yields a valid SVD. This is analogous to eigenspaces of repeated eigenvalues in the spectral theorem.
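A short sketch of the phase ambiguity (NumPy assumed; the phase $\theta = 0.7$ is arbitrary):

```python
import numpy as np

# Rotating every singular pair (u_i, v_i) by a common unit-modulus scalar
# leaves each outer product u_i v_i^H, and hence A, unchanged.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, s, Vh = np.linalg.svd(A)

theta = 0.7
phase = np.exp(1j * theta)
U2 = U * phase                  # multiply every u_i by e^{j theta}
Vh2 = phase.conj() * Vh         # rows are v_i^H, so scale by e^{-j theta}

# Same matrix, different (equally valid) singular vectors:
assert np.allclose(U @ np.diag(s) @ Vh, U2 @ np.diag(s) @ Vh2)
```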
[Interactive figure: SVD in 3D — the unit sphere mapped to an ellipsoid via rotation, scaling, rotation.]
Geometric Interpretation: Rotation–Scaling–Rotation
The SVD reveals that every linear map decomposes into three elementary operations:
- Rotation (and reflection) in the domain: $\mathbf{V}^H$ is a unitary transformation that rotates the input space $\mathbb{C}^n$, aligning the coordinate axes with the right singular vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$.
- Axis-aligned scaling: $\mathbf{\Sigma}$ stretches axis $i$ by $\sigma_i$, and also changes the dimension of the space (from $n$ to $m$) by padding with zeros or truncating.
- Rotation (and reflection) in the codomain: $\mathbf{U}$ rotates the output space $\mathbb{C}^m$, mapping the scaled standard basis vectors to the left singular vectors $\mathbf{u}_1, \dots, \mathbf{u}_m$.
Unit sphere to ellipsoid. Consider the unit sphere $\mathcal{S} = \{\mathbf{x} \in \mathbb{C}^n : \|\mathbf{x}\|_2 = 1\}$. Under $\mathbf{A}$, this sphere maps to an ellipsoid (possibly degenerate) whose semi-axes have lengths $\sigma_1, \dots, \sigma_r$ and point along the directions $\mathbf{u}_1, \dots, \mathbf{u}_r$.
This geometric picture is the reason SVD is so natural for MIMO: the channel maps the transmit signal sphere into an ellipsoid at the receiver, and the semi-axes of that ellipsoid are precisely the sub-channel gains.
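A quick 2D sanity check of this picture (NumPy assumed; the matrix is an arbitrary example): the image of the unit circle is an ellipse whose semi-axes are $\sigma_1, \sigma_2$ along $\mathbf{u}_1, \mathbf{u}_2$.

```python
import numpy as np

# A real 2x2 map sends the unit circle to an ellipse whose semi-axes
# have lengths sigma_1, sigma_2 along the left singular vectors.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(A)

t = np.linspace(0, 2 * np.pi, 400)
circle = np.vstack([np.cos(t), np.sin(t)])   # points on the unit circle
ellipse = A @ circle                          # image under A

# Farthest image point from the origin: distance sigma_1; nearest: sigma_2.
dists = np.linalg.norm(ellipse, axis=0)
assert np.isclose(dists.max(), s[0], atol=1e-3)
assert np.isclose(dists.min(), s[1], atol=1e-3)

# A maps v_1 to sigma_1 u_1: the direction of maximal stretch.
assert np.allclose(A @ Vh[0], s[0] * U[:, 0])
```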
Theorem: Eckart--Young Low-Rank Approximation Theorem
Let $\mathbf{A} \in \mathbb{C}^{m \times n}$ have SVD $\mathbf{A} = \sum_{i=1}^{r} \sigma_i\mathbf{u}_i\mathbf{v}_i^H$ with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ and $r = \operatorname{rank}(\mathbf{A})$. For $k < r$, define the rank-$k$ truncated SVD $\mathbf{A}_k = \sum_{i=1}^{k} \sigma_i\mathbf{u}_i\mathbf{v}_i^H.$ Then $\mathbf{A}_k$ is a best rank-$k$ approximation of $\mathbf{A}$ in the Frobenius norm: $\|\mathbf{A} - \mathbf{A}_k\|_F = \min_{\operatorname{rank}(\mathbf{B}) \le k} \|\mathbf{A} - \mathbf{B}\|_F,$ and the approximation error is $\|\mathbf{A} - \mathbf{A}_k\|_F = \sqrt{\sum_{i=k+1}^{r} \sigma_i^2}.$ Moreover, the same result holds for the spectral (operator) norm: $\|\mathbf{A} - \mathbf{A}_k\|_2 = \min_{\operatorname{rank}(\mathbf{B}) \le k} \|\mathbf{A} - \mathbf{B}\|_2 = \sigma_{k+1}.$
The SVD sorts the "energy" of a matrix by importance. The first singular vector pair captures the single rank-one matrix $\sigma_1\mathbf{u}_1\mathbf{v}_1^H$ that best approximates $\mathbf{A}$. Adding more terms improves the approximation greedily, and the Eckart--Young theorem says this greedy approach is globally optimal. In wireless communications, if a channel has rapidly decaying singular values, only a few dominant modes carry significant energy, and the channel is effectively low-rank.
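A numerical spot check of the theorem (NumPy assumed; the matrix, dimensions, and the random rank-$k$ competitor are illustrative):

```python
import numpy as np

# The truncated SVD achieves the Eckart-Young error
# sqrt(sigma_{k+1}^2 + ... + sigma_r^2) and beats any rank-k competitor.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k]      # rank-k truncated SVD

err_fro = np.linalg.norm(A - A_k, "fro")
assert np.isclose(err_fro, np.sqrt(np.sum(s[k:] ** 2)))   # Frobenius error
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])       # spectral error

# Any other rank-k matrix does no better (spot check with a random one):
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 4))
assert np.linalg.norm(A - B, "fro") >= err_fro
```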
Proof outline:
Express the Frobenius norm of the difference $\mathbf{A} - \mathbf{A}_k$ in terms of singular values.
Use the unitary invariance of the Frobenius norm.
Show that any rank-$k$ matrix must have a null space of dimension at least $n - k$.
Exploit the Courant--Fischer min-max principle for the spectral norm case.
Step 1: Express $\|\mathbf{A} - \mathbf{A}_k\|_F^2$ in terms of singular values
Since $\mathbf{A} = \sum_{i=1}^{r} \sigma_i\mathbf{u}_i\mathbf{v}_i^H$ and $\mathbf{A}_k = \sum_{i=1}^{k} \sigma_i\mathbf{u}_i\mathbf{v}_i^H$, we have $\mathbf{A} - \mathbf{A}_k = \sum_{i=k+1}^{r} \sigma_i\mathbf{u}_i\mathbf{v}_i^H.$ The Frobenius norm satisfies $\|\mathbf{M}\|_F^2 = \operatorname{tr}(\mathbf{M}^H\mathbf{M})$. Computing: $\|\mathbf{A} - \mathbf{A}_k\|_F^2 = \operatorname{tr}\!\left(\sum_{i=k+1}^{r}\sum_{j=k+1}^{r} \sigma_i\sigma_j\,\mathbf{v}_i\mathbf{u}_i^H\mathbf{u}_j\mathbf{v}_j^H\right) = \sum_{i=k+1}^{r} \sigma_i^2,$ where we used $\mathbf{u}_i^H\mathbf{u}_j = \delta_{ij}$ and $\operatorname{tr}(\mathbf{v}_i\mathbf{v}_i^H) = \|\mathbf{v}_i\|_2^2 = 1$.
Step 2: Unitary invariance of the Frobenius norm
Recall that the Frobenius norm is unitarily invariant: $\|\mathbf{Q}\mathbf{M}\mathbf{W}\|_F = \|\mathbf{M}\|_F$ for any unitary $\mathbf{Q}$ and $\mathbf{W}$. This is because $\|\mathbf{Q}\mathbf{M}\mathbf{W}\|_F^2 = \operatorname{tr}(\mathbf{W}^H\mathbf{M}^H\mathbf{Q}^H\mathbf{Q}\mathbf{M}\mathbf{W}) = \operatorname{tr}(\mathbf{M}^H\mathbf{M}) = \|\mathbf{M}\|_F^2$ (using the cyclic property of the trace).
Consequently, $\|\mathbf{A} - \mathbf{B}\|_F = \|\mathbf{U}^H(\mathbf{A} - \mathbf{B})\mathbf{V}\|_F = \|\mathbf{\Sigma} - \tilde{\mathbf{B}}\|_F$ for any $\mathbf{B}$, where $\tilde{\mathbf{B}} = \mathbf{U}^H\mathbf{B}\mathbf{V}$ has $\operatorname{rank}(\tilde{\mathbf{B}}) = \operatorname{rank}(\mathbf{B})$.
The problem thus reduces to: among all matrices $\tilde{\mathbf{B}}$ of rank at most $k$, which one minimizes $\|\mathbf{\Sigma} - \tilde{\mathbf{B}}\|_F$?
Step 3: Lower bound via the subspace intersection argument
Let $\mathbf{B} \in \mathbb{C}^{m \times n}$ be any matrix with $\operatorname{rank}(\mathbf{B}) \le k$. Then $\dim\mathcal{N}(\mathbf{B}) \ge n - k$.
Consider the $(k+1)$-dimensional subspace $\mathcal{V}_{k+1} = \operatorname{span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k+1}\}$. By a dimension argument: $\dim\!\left(\mathcal{N}(\mathbf{B}) \cap \mathcal{V}_{k+1}\right) \ge (n - k) + (k + 1) - n = 1.$ So there exists a unit vector $\mathbf{w} \in \mathcal{N}(\mathbf{B}) \cap \mathcal{V}_{k+1}$. Write $\mathbf{w} = \sum_{i=1}^{k+1} c_i\mathbf{v}_i$ with $\sum_{i=1}^{k+1} |c_i|^2 = 1$.
Since $\mathbf{B}\mathbf{w} = \mathbf{0}$: $\|\mathbf{A} - \mathbf{B}\|_2^2 \ge \|(\mathbf{A} - \mathbf{B})\mathbf{w}\|_2^2 = \|\mathbf{A}\mathbf{w}\|_2^2 = \sum_{i=1}^{k+1} \sigma_i^2 |c_i|^2 \ge \sigma_{k+1}^2.$
This establishes $\|\mathbf{A} - \mathbf{B}\|_2 \ge \sigma_{k+1}$. For the Frobenius norm result, we need a tighter bound. Since the inequality above uses only a single vector, we apply the argument to $n - k$ orthonormal vectors in $\mathcal{N}(\mathbf{B})$.
Let $\mathbf{w}_1, \dots, \mathbf{w}_{n-k}$ be an orthonormal basis of $\mathcal{N}(\mathbf{B})$ and form $\mathbf{W} = [\mathbf{w}_1 \;\cdots\; \mathbf{w}_{n-k}]$. Then $\|\mathbf{A} - \mathbf{B}\|_F^2 \ge \|(\mathbf{A} - \mathbf{B})\mathbf{W}\|_F^2 = \|\mathbf{A}\mathbf{W}\|_F^2 = \operatorname{tr}\!\left(\mathbf{W}^H\mathbf{A}^H\mathbf{A}\mathbf{W}\right).$ By the Cauchy interlacing inequalities (a consequence of Courant--Fischer), the eigenvalues $\mu_1 \ge \cdots \ge \mu_{n-k}$ of the matrix $\mathbf{W}^H\mathbf{A}^H\mathbf{A}\mathbf{W}$ satisfy $\mu_i \ge \lambda_{k+i} = \sigma_{k+i}^2$ for each $i$. Therefore $\|\mathbf{A} - \mathbf{B}\|_F^2 \ge \sum_{i=1}^{n-k} \sigma_{k+i}^2 = \sum_{i=k+1}^{n} \sigma_i^2 \ge \sum_{i=k+1}^{r} \sigma_i^2.$
Step 4: $\mathbf{A}_k$ achieves the lower bound
From Step 1, $\|\mathbf{A} - \mathbf{A}_k\|_F^2 = \sum_{i=k+1}^{r} \sigma_i^2$. From Step 3, for any rank-$k$ matrix $\mathbf{B}$, $\|\mathbf{A} - \mathbf{B}\|_F^2 \ge \sum_{i=k+1}^{r} \sigma_i^2$.
Since $\mathbf{A}_k$ has rank at most $k$ (it is a sum of $k$ rank-one matrices) and achieves the lower bound, it is a minimizer: $\|\mathbf{A} - \mathbf{A}_k\|_F = \min_{\operatorname{rank}(\mathbf{B}) \le k} \|\mathbf{A} - \mathbf{B}\|_F = \sqrt{\sum_{i=k+1}^{r} \sigma_i^2}.$
Spectral norm case. The spectral (operator) norm is also unitarily invariant. The subspace intersection argument from Step 3 (applied with a single vector $\mathbf{w}$) gives $\|\mathbf{A} - \mathbf{B}\|_2 \ge \sigma_{k+1}$. Since $\|\mathbf{A} - \mathbf{A}_k\|_2 = \sigma_{k+1}$ (the largest singular value of $\sum_{i=k+1}^{r} \sigma_i\mathbf{u}_i\mathbf{v}_i^H$ is $\sigma_{k+1}$), the bound is tight.
SVD Geometry: Rotation–Scaling–Rotation
Visualize how the SVD transforms the unit sphere through $\mathbf{V}^H$ (rotate), $\mathbf{\Sigma}$ (scale to ellipsoid), and $\mathbf{U}$ (rotate again). At each step the transformed surface is shown, along with the singular vector directions and their corresponding singular values as axis labels.
Singular Values of a Parameterized Channel Matrix
Explore how the singular values change as the channel matrix varies with the antenna correlation $\rho$. The condition number $\kappa = \sigma_1/\sigma_{\min}$ indicates how well-conditioned the channel is. A fully uncorrelated channel ($\rho = 0$) has nearly equal singular values, while a highly correlated channel ($\rho \to 1$) has one dominant singular value and the rest collapse toward zero — resulting in a near-rank-one channel with dramatically reduced spatial multiplexing gain.
Parameter: $\rho$, the correlation between adjacent antenna elements.
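The following sketch reproduces the qualitative behavior of this demo. It assumes a Kronecker channel model with exponential correlation $R_{ij} = \rho^{|i-j|}$ at both ends — a modeling choice of ours; the interactive demo may use a different parameterization.

```python
import numpy as np

def correlated_channel_svals(n: int, rho: float, rng) -> np.ndarray:
    """Singular values of an n x n Rayleigh channel with correlation rho."""
    # Exponential correlation matrix R_ij = rho^|i-j| (assumed model).
    R = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    R_sqrt = np.linalg.cholesky(R + 1e-12 * np.eye(n))
    H_w = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    H = R_sqrt @ H_w @ R_sqrt.conj().T     # correlation at both link ends
    return np.linalg.svd(H, compute_uv=False)

rng = np.random.default_rng(4)
for rho in (0.0, 0.5, 0.95):
    s = correlated_channel_svals(4, rho, rng)
    print(f"rho={rho:4.2f}  sigmas={np.round(s, 2)}  cond={s[0] / s[-1]:8.1f}")
# As rho -> 1 the trailing singular values collapse: near-rank-one channel.
```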
Progressive Rank-$k$ Approximation
Watch how adding successive rank-1 terms progressively reconstructs the original matrix. At each frame $k$, the approximation $\mathbf{A}_k = \sum_{i=1}^{k} \sigma_i\mathbf{u}_i\mathbf{v}_i^H$ is displayed alongside the Frobenius-norm error $\|\mathbf{A} - \mathbf{A}_k\|_F$. The Eckart--Young theorem guarantees this is the optimal rank-$k$ approximation.
Eigendecomposition vs. Singular Value Decomposition
| Property | Eigendecomposition | Singular Value Decomposition (SVD) |
|---|---|---|
| Applicable to | Square matrices ($n \times n$) only | Any matrix ($m \times n$), square or rectangular |
| Decomposition form | $\mathbf{A} = \mathbf{X}\mathbf{\Lambda}\mathbf{X}^{-1}$ (if diagonalizable) | $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ (always exists) |
| Factor structure | $\mathbf{X}$ generally not unitary; $\mathbf{\Lambda}$ complex diagonal | $\mathbf{U}, \mathbf{V}$ both unitary; $\mathbf{\Sigma}$ real non-negative diagonal |
| Existence | Not guaranteed (defective matrices have no eigendecomposition) | Always exists for every matrix |
| Spectrum | Eigenvalues $\lambda_i$ (complex in general) | Singular values $\sigma_i$ (always real, non-negative) |
| Geometric meaning | Directions scaled without rotation (eigenvectors) | Rotation $\to$ scaling $\to$ rotation (unit sphere $\to$ ellipsoid) |
| Relation | Eigenvalues of $\mathbf{A}^H\mathbf{A}$ are $\sigma_i^2$ | For Hermitian $\mathbf{A}$: $\sigma_i = |\lambda_i|$ (eigenvalue magnitudes = singular values) |
| Numerical stability | Can be ill-conditioned for non-normal matrices | Always numerically stable (backward-stable algorithms exist) |
| Low-rank approximation | No direct optimality guarantee | Truncated SVD is optimal (Eckart--Young theorem) |
| Telecom application | Covariance eigenanalysis, PCA, stability analysis | MIMO channel decomposition, beamforming, channel estimation |
Example: SVD of a $2 \times 3$ Matrix
Compute the full SVD of $\mathbf{A} = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}.$
Step 1: Compute $\mathbf{A}^T \mathbf{A}$
Since $\mathbf{A}$ is real, $\mathbf{A}^H = \mathbf{A}^T$. Compute: $\mathbf{A}^T\mathbf{A} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix}.$ This is a real symmetric matrix.
Step 2: Find eigenvalues of $\mathbf{A}^T \mathbf{A}$
The characteristic polynomial is: $\det(\mathbf{A}^T\mathbf{A} - \lambda\mathbf{I}) = \det\begin{pmatrix} 1-\lambda & 1 & 0 \\ 1 & 2-\lambda & 1 \\ 0 & 1 & 1-\lambda \end{pmatrix}.$ Expanding along the first row: $(1-\lambda)\big[(2-\lambda)(1-\lambda) - 1\big] - \big[(1-\lambda) - 0\big] = (1-\lambda)(\lambda^2 - 3\lambda) = -\lambda(\lambda - 1)(\lambda - 3).$
The eigenvalues of $\mathbf{A}^T\mathbf{A}$ are $\lambda_1 = 3$, $\lambda_2 = 1$, $\lambda_3 = 0$.
Sanity check: $\lambda_1 + \lambda_2 + \lambda_3 = 4 = \operatorname{tr}(\mathbf{A}^T\mathbf{A}) = 1 + 2 + 1$.
The singular values are $\sigma_1 = \sqrt{3}$ and $\sigma_2 = 1$. The rank is $r = 2$.
Step 3: Find right singular vectors (eigenvectors of $\mathbf{A}^T \mathbf{A}$)
For $\lambda_1 = 3$: Solve $(\mathbf{A}^T\mathbf{A} - 3\mathbf{I})\mathbf{v} = \mathbf{0}$: $\begin{pmatrix} -2 & 1 & 0 \\ 1 & -1 & 1 \\ 0 & 1 & -2 \end{pmatrix}\mathbf{v} = \mathbf{0}.$ From rows 1 and 3: $v_2 = 2v_1$ and $v_2 = 2v_3$, so $v_1 = v_3$. Row 2: $v_1 - v_2 + v_3 = v_1 - 2v_1 + v_1 = 0$ is automatically satisfied. Normalizing: $\mathbf{v}_1 = \frac{1}{\sqrt{6}}(1, 2, 1)^T$.
For $\lambda_2 = 1$: Solve $(\mathbf{A}^T\mathbf{A} - \mathbf{I})\mathbf{v} = \mathbf{0}$: $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}\mathbf{v} = \mathbf{0}.$ This gives $v_2 = 0$ and $v_1 = -v_3$. Normalizing: $\mathbf{v}_2 = \frac{1}{\sqrt{2}}(1, 0, -1)^T$.
For $\lambda_3 = 0$: Solve $\mathbf{A}^T\mathbf{A}\,\mathbf{v} = \mathbf{0}$: rows 1 and 3 give $v_1 + v_2 = 0$ and $v_2 + v_3 = 0$, so $v_2 = -v_1$ and $v_3 = v_1$. Normalizing: $\mathbf{v}_3 = \frac{1}{\sqrt{3}}(1, -1, 1)^T$.
Verify orthogonality: $\mathbf{v}_1^T\mathbf{v}_2 = 0$, $\mathbf{v}_1^T\mathbf{v}_3 = 0$, $\mathbf{v}_2^T\mathbf{v}_3 = 0$.
Step 4: Compute left singular vectors
Using $\mathbf{u}_i = \frac{1}{\sigma_i}\mathbf{A}\mathbf{v}_i$: $\mathbf{u}_1 = \frac{1}{\sqrt{3}}\mathbf{A}\mathbf{v}_1 = \frac{1}{\sqrt{3}} \cdot \frac{1}{\sqrt{6}}\begin{pmatrix} 3 \\ 3 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \mathbf{u}_2 = \frac{1}{1}\mathbf{A}\mathbf{v}_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}.$ Orthogonality check: $\mathbf{u}_1^T \mathbf{u}_2 = \frac{1}{2}(1 - 1) = 0$. Since $m = 2 = r$, $\mathbf{U}$ is already complete: no extension needed.
Step 5: Assemble the SVD
$\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^T$ with $\mathbf{U} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad \mathbf{\Sigma} = \begin{pmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \mathbf{V} = \begin{pmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{2}{\sqrt{6}} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \end{pmatrix}.$ By Eckart--Young, the best rank-1 approximation is $\mathbf{A}_1 = \sigma_1\mathbf{u}_1\mathbf{v}_1^T$ with error $\|\mathbf{A} - \mathbf{A}_1\|_F = \sigma_2 = 1$.
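A numerical verification of this worked example (NumPy assumed):

```python
import numpy as np

# Check the worked example: singular values, eigenvalues of A^T A, and
# the Eckart-Young rank-1 approximation error.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

s = np.linalg.svd(A, compute_uv=False)
print(np.round(s, 6))                    # [1.732051 1.      ] = [sqrt(3), 1]

# Eigenvalues of A^T A are the squared singular values (plus a zero):
lam = np.linalg.eigvalsh(A.T @ A)[::-1]
print(np.round(lam, 6))                  # [3. 1. 0.]

# Best rank-1 approximation error equals sigma_2 = 1:
U, s, Vh = np.linalg.svd(A)
A1 = s[0] * np.outer(U[:, 0], Vh[0])
print(np.linalg.norm(A - A1, "fro"))     # 1.0
```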
Historical Note: From Beltrami to Eckart--Young: The History of SVD
1873–1965. The singular value decomposition has a remarkably long and multi-threaded history, with key ideas discovered independently by several mathematicians:
Beltrami (1873). Eugenio Beltrami, an Italian mathematician best known for his work in differential geometry, was the first to consider the decomposition of a bilinear form into canonical form. In his 1873 paper "Sulle funzioni bilineari," he showed that a real bilinear form can be reduced to a sum of products of orthogonal linear forms — essentially the real SVD for square matrices. He identified the singular values as the positive square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$.
Jordan (1874). Just one year later, Camille Jordan independently obtained the same canonical decomposition. Jordan, already famous for the Jordan normal form (1870), approached the problem from the theory of bilinear and quadratic forms. His proof, published in the Journal de mathématiques pures et appliquées, used different techniques from Beltrami's but arrived at the same result.
Schmidt (1907). Erhard Schmidt extended the SVD to integral operators (compact operators on function spaces), presaging the modern functional-analytic viewpoint. His work introduced the "Schmidt pairs" (singular vector pairs) and established the convergence of the singular-value expansion.
Eckart and Young (1936). Carl Eckart and Gale Young proved the optimality of the truncated SVD for low-rank approximation. Their 1936 paper "The Approximation of One Matrix by Another of Lower Rank" in Psychometrika showed that the best rank- approximation (in the Frobenius norm) is obtained by keeping the largest singular values. This result is now fundamental in data compression, signal processing, and machine learning.
Golub and Kahan (1965). Gene Golub and William Kahan developed the first numerically stable algorithm for computing the SVD, based on bidiagonalization followed by QR-type iterations. This algorithmic breakthrough made the SVD practical for large-scale computation and is the ancestor of the SVD routines in LAPACK and MATLAB.
SVD Computation: Cost and Algorithm Choice
The SVD of an $m \times n$ matrix (with $m \ge n$) costs $O(mn^2)$ flops using the Golub–Kahan bidiagonalization algorithm (LAPACK's dgesvd). For typical MIMO dimensions:
- $4 \times 4$ MIMO: ~256 flops (negligible)
- $64 \times 64$ massive MIMO: ~$10^6$ flops (fast)
- $1024 \times 1024$ XL-MIMO: ~$4 \times 10^9$ flops (requires optimization)
When only the top $k$ singular values/vectors are needed (e.g., for rank-$k$ channel approximation), use truncated SVD (scipy.sparse.linalg.svds), which costs $O(mnk)$ — a major savings when $k \ll \min(m, n)$; see the sketch after the list below. In real-time MIMO receivers, the full SVD is computed once per coherence interval (1 ms in 5G NR at 30 kHz SCS). At these dimensions and a 1 ms update rate, this is well within the computational budget of a modern baseband processor.
- LAPACK dgesvd: $O(mn^2)$ flops for the full SVD
- 5G NR slot duration at 30 kHz SCS: 0.5 ms (14 OFDM symbols)
- For real-time use: prefer the economy-size SVD ($\mathbf{U}$ is $m \times n$, not $m \times m$) to save memory
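As promised above, a minimal sketch of the truncated-SVD workflow (SciPy and NumPy assumed; the matrix size and $k = 8$ are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import svds

# Compute only the k dominant singular triplets of a tall matrix
# instead of the full SVD.
rng = np.random.default_rng(5)
A = rng.standard_normal((2000, 200))

k = 8
U_k, s_k, Vh_k = svds(A, k=k)                # singular values come back ascending
order = np.argsort(s_k)[::-1]
U_k, s_k, Vh_k = U_k[:, order], s_k[order], Vh_k[order]

# Agrees with the top-k values of the full (much more expensive) SVD:
s_full = np.linalg.svd(A, compute_uv=False)
assert np.allclose(s_k, s_full[:k])
```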
Key Takeaway
The SVD is the decomposition for wireless communications. For any channel matrix $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$: (1) the singular values $\sigma_i$ are the sub-channel gains; (2) the right singular vectors $\mathbf{v}_i$ are the optimal transmit directions (beamformers); (3) the left singular vectors $\mathbf{u}_i$ are the optimal receive combiners; (4) the number of nonzero singular values equals the spatial multiplexing rank; and (5) the condition number $\kappa = \sigma_1/\sigma_r$ governs sensitivity to noise. Unlike eigendecomposition, SVD works for any matrix — square or rectangular, Hermitian or not — making it the universally applicable tool. Every capacity formula, every beamforming design, and every channel estimation algorithm in MIMO communications is, at its core, an application of SVD.
Why This Matters: MIMO Channel SVD: Parallel Sub-channels
Consider a narrowband MIMO system with $N_t$ transmit and $N_r$ receive antennas: $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n},$ where $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2\mathbf{I})$.
Let $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ be the SVD with singular values $\sigma_1 \ge \cdots \ge \sigma_r > 0$ and $r = \operatorname{rank}(\mathbf{H}) \le \min(N_t, N_r)$.
Precoding and combining. If the transmitter precodes with $\mathbf{V}$ (i.e., transmits $\mathbf{x} = \mathbf{V}\tilde{\mathbf{x}}$) and the receiver applies $\mathbf{U}^H$ (i.e., forms $\tilde{\mathbf{y}} = \mathbf{U}^H\mathbf{y}$), then: $\tilde{\mathbf{y}} = \mathbf{U}^H(\mathbf{H}\mathbf{V}\tilde{\mathbf{x}} + \mathbf{n}) = \mathbf{\Sigma}\tilde{\mathbf{x}} + \tilde{\mathbf{n}},$ where $\tilde{\mathbf{n}} = \mathbf{U}^H\mathbf{n}$ has the same distribution as $\mathbf{n}$ (since $\mathbf{U}$ is unitary and preserves the i.i.d. Gaussian distribution).
This decouples the channel into $r$ independent scalar sub-channels: $\tilde{y}_i = \sigma_i\tilde{x}_i + \tilde{n}_i, \qquad i = 1, \dots, r.$
The capacity with full CSI at both ends is therefore: $C = \sum_{i=1}^{r} \log_2\!\left(1 + \frac{p_i\sigma_i^2}{\sigma_n^2}\right),$ where the power allocation is determined by water-filling: $p_i = \left(\mu - \frac{\sigma_n^2}{\sigma_i^2}\right)^{\!+}$ subject to $\sum_{i=1}^{r} p_i = P$.
Key insight: The SVD simultaneously diagonalizes the channel, orthogonalizes the noise, and reveals the optimal signaling directions. No other decomposition achieves all three.
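A compact end-to-end sketch of this pipeline (NumPy assumed; the Rayleigh channel, total power $P = 10$, and unit noise variance are illustrative choices, and the water level $\mu$ is found here by simple bisection rather than any particular standard algorithm):

```python
import numpy as np

# SVD precoding/combining diagonalizes the channel; water-filling then
# allocates power across the resulting scalar sub-channels.
rng = np.random.default_rng(6)
Nr, Nt = 4, 4
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
U, s, Vh = np.linalg.svd(H)

# Precode with V, combine with U^H: the effective channel is diagonal.
H_eff = U.conj().T @ H @ Vh.conj().T
assert np.allclose(H_eff, np.diag(s))

# Water-filling: p_i = max(mu - noise_var / sigma_i^2, 0), sum p_i = P.
P_total, noise_var = 10.0, 1.0
inv_gain = noise_var / s**2
mu_lo, mu_hi = 0.0, inv_gain.max() + P_total
for _ in range(100):                      # bisection on the water level mu
    mu = 0.5 * (mu_lo + mu_hi)
    p = np.maximum(mu - inv_gain, 0.0)
    mu_lo, mu_hi = (mu, mu_hi) if p.sum() < P_total else (mu_lo, mu)

C = np.sum(np.log2(1.0 + p * s**2 / noise_var))
print(f"powers={np.round(p, 2)}, capacity={C:.2f} bit/s/Hz")
```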
See full treatment in MIMO Capacity: Deterministic Channels
Common Mistake: Singular Values Are Not Eigenvalues
Mistake:
A common error is to conflate the singular values of a matrix with its eigenvalues. Students often write $\sigma_i = \lambda_i$ or assume that the SVD and eigendecomposition produce the same factors.
Correction:
Singular values and eigenvalues are distinct concepts that coincide only in special cases:
1. Different definitions. Eigenvalues satisfy $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$ (requires $\mathbf{A}$ square). Singular values are $\sigma_i = \sqrt{\lambda_i(\mathbf{A}^H\mathbf{A})}$ (work for any matrix).
2. $\sigma_i \ne |\lambda_i|$ in general. For the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, both eigenvalues are $0$, yet $\sigma_1 = 1$.
3. When they do agree.
- Hermitian PSD matrices: $\mathbf{A} \succeq \mathbf{0}$ implies $\sigma_i = \lambda_i$ (eigenvalues are the singular values).
- Hermitian matrices (general): $\sigma_i = |\lambda_i|$.
- Normal matrices: $\sigma_i = |\lambda_i|$ (with the $|\lambda_i|$ ordered in decreasing order).
- Non-normal matrices: No simple relation. The singular values depend on the interaction between $\mathbf{A}$ and $\mathbf{A}^H$, not just on the eigenvalues.
4. Practical consequence. In MIMO, the channel gains are the singular values of $\mathbf{H}$, not its eigenvalues (which may not even exist if $\mathbf{H}$ is rectangular). The eigenvalues of $\mathbf{H}^H\mathbf{H}$ are $\sigma_i^2$, which is a related but different quantity.
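A short numerical demonstration of the pitfall (NumPy assumed; both matrices are illustrative):

```python
import numpy as np

# For a non-normal matrix, singular values and eigenvalue magnitudes differ.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])                  # nilpotent: both eigenvalues are 0

print(np.linalg.eigvals(A))                 # [0. 0.]
print(np.linalg.svd(A, compute_uv=False))   # [1. 0.]  -- sigma_1 = 1 != 0

# For a Hermitian matrix they agree up to sign: sigma_i = |lambda_i|.
B = np.array([[2.0, 1.0],
              [1.0, -1.0]])
lam = np.linalg.eigvals(B)
svals = np.linalg.svd(B, compute_uv=False)
assert np.allclose(np.sort(np.abs(lam))[::-1], svals)
```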
Singular Value
For a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$, the $i$-th singular value $\sigma_i$ is the positive square root of the $i$-th largest eigenvalue of $\mathbf{A}^H\mathbf{A}$. Equivalently, $\sigma_i$ is the $i$-th semi-axis length of the ellipsoid obtained by mapping the unit sphere through $\mathbf{A}$. Singular values are always real and non-negative, ordered $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$.
Related: singular vector, SVD Existence Theorem, Eigenvalue and Eigenvector, Frobenius Norm
Frobenius Norm
The Frobenius norm of a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ is $\|\mathbf{A}\|_F = \sqrt{\sum_{i,j} |a_{ij}|^2} = \sqrt{\operatorname{tr}(\mathbf{A}^H\mathbf{A})} = \sqrt{\sum_i \sigma_i^2},$ where $\sigma_i$ are the singular values. The Frobenius norm is unitarily invariant: $\|\mathbf{Q}\mathbf{A}\mathbf{W}\|_F = \|\mathbf{A}\|_F$ for any unitary $\mathbf{Q}$ and $\mathbf{W}$.
Related: Singular Value, spectral norm, Trace of a Matrix
Condition Number
The condition number of a matrix $\mathbf{A}$ with respect to the 2-norm is $\kappa_2(\mathbf{A}) = \frac{\sigma_{\max}}{\sigma_{\min}},$ where $\sigma_{\max}$ is the largest singular value and $\sigma_{\min}$ is the smallest nonzero singular value. A matrix with $\kappa_2 \approx 1$ is well-conditioned; $\kappa_2 \gg 1$ indicates ill-conditioning. In MIMO, the condition number of the channel matrix determines how sensitive zero-forcing receivers are to noise amplification.
Related: Singular Value, Why SVD Is the Backbone of MIMO, Null Space and Zero-Forcing in Multiuser MIMO
Quick Check
Let $\mathbf{A}$ have singular values $\sigma_1$ and $\sigma_2$ (all others zero). What is $\|\mathbf{A}\|_F$?
$\|\mathbf{A}\|_F = \sqrt{\sigma_1^2 + \sigma_2^2}$. The Frobenius norm equals the $\ell_2$-norm of the singular value vector.
Quick Check
A MIMO channel matrix $\mathbf{H}$ has singular values $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \sigma_4 > 0$. What is the Frobenius-norm error of the best rank-2 approximation $\mathbf{H}_2$?
By the Eckart--Young theorem, $\|\mathbf{H} - \mathbf{H}_2\|_F = \sqrt{\sigma_3^2 + \sigma_4^2}$.