Singular Value Decomposition

Why SVD Is the Backbone of MIMO

If you remember only one matrix decomposition from this entire textbook, let it be the singular value decomposition (SVD). Every major result in multi-antenna wireless communications either uses SVD directly or is a consequence of it:

  • Channel decomposition. The SVD of the channel matrix $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ converts a coupled MIMO channel into parallel, independent scalar sub-channels. The singular values $\sigma_i$ are the gains of these sub-channels.
  • Capacity and water-filling. The MIMO capacity formula $C = \sum_i \log_2(1 + p_i \sigma_i^2 / \sigma_n^2)$ is expressed entirely in terms of singular values. Power allocation (water-filling) assigns power to sub-channels according to their singular values.
  • Beamforming. The optimal transmit beamformer is the right singular vector $\mathbf{v}_1$ corresponding to the largest singular value $\sigma_1$; the optimal receive combiner is the left singular vector $\mathbf{u}_1$.
  • Low-rank channel models. Real-world channels often have a small number of dominant scattering clusters, leading to a channel matrix with rapidly decaying singular values. The Eckart–Young theorem tells us that the truncated SVD is the best rank-$k$ approximation in any unitarily invariant norm.
  • Condition number. The ratio $\kappa = \sigma_1 / \sigma_r$ determines how sensitive channel inversion (zero-forcing) is to noise. An ill-conditioned channel ($\kappa \gg 1$) means some sub-channels are nearly unusable.

Unlike eigendecomposition, the SVD applies to any matrix — square or rectangular, Hermitian or not. It is the universal tool of linear algebra, and mastering it is prerequisite to everything that follows in this book.

Definition: Singular Values and Singular Vectors

Let $\mathbf{A} \in \mathbb{C}^{m \times n}$ with $r = \operatorname{rank}(\mathbf{A})$.

The singular values of $\mathbf{A}$ are the non-negative real numbers $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$, defined as the positive square roots of the nonzero eigenvalues of the Hermitian positive semidefinite matrix $\mathbf{A}^H \mathbf{A} \in \mathbb{C}^{n \times n}$. That is, if $\lambda_1 \geq \cdots \geq \lambda_r > 0$ are the nonzero eigenvalues of $\mathbf{A}^H \mathbf{A}$, then $\sigma_i = \sqrt{\lambda_i}$ for $i = 1, \ldots, r$.

The right singular vectors of $\mathbf{A}$ are the orthonormal eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n \in \mathbb{C}^n$ of $\mathbf{A}^H \mathbf{A}$, ordered so that $\mathbf{A}^H \mathbf{A} \mathbf{v}_i = \sigma_i^2 \mathbf{v}_i$ for $i \leq r$, and $\mathbf{A}^H \mathbf{A} \mathbf{v}_i = \mathbf{0}$ for $i > r$.

The left singular vectors $\mathbf{u}_1, \ldots, \mathbf{u}_m \in \mathbb{C}^m$ are defined by $\mathbf{u}_i = \frac{1}{\sigma_i} \mathbf{A} \mathbf{v}_i$ for $i = 1, \ldots, r$. The remaining $m - r$ left singular vectors are any orthonormal basis of $\mathcal{N}(\mathbf{A}^H) = \mathcal{R}(\mathbf{A})^{\perp}$.

The singular values are intrinsic to $\mathbf{A}$: they do not depend on the choice of bases. The singular vectors, however, are not unique when singular values are repeated (one may choose any orthonormal basis within the corresponding subspace), and each singular vector is determined only up to a unit-modulus scalar $e^{j\theta}$.

Theorem: SVD Existence Theorem

Every matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ can be decomposed as
$$\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H,$$
where $\mathbf{U} \in \mathbb{C}^{m \times m}$ is unitary, $\mathbf{V} \in \mathbb{C}^{n \times n}$ is unitary, and $\mathbf{\Sigma} \in \mathbb{R}^{m \times n}$ is a (possibly rectangular) diagonal matrix with non-negative entries on the main diagonal, ordered as $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{\min(m,n)} \geq 0$. The diagonal entries $\sigma_i$ are the singular values of $\mathbf{A}$, the columns of $\mathbf{U}$ are the left singular vectors, and the columns of $\mathbf{V}$ are the right singular vectors.

The SVD says that every linear map, no matter how complicated, is secretly just a rotation, followed by axis-aligned scaling, followed by another rotation. The right singular vectors $\mathbf{v}_i$ are the "input directions" that the matrix treats independently; the left singular vectors $\mathbf{u}_i$ are the corresponding "output directions"; and the singular values $\sigma_i$ are the gains along each direction. For a MIMO channel, this means: transmit along $\mathbf{v}_i$, receive along $\mathbf{u}_i$, and the effective scalar gain is $\sigma_i$.
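The following minimal numpy sketch illustrates this interpretation (the matrix and its dimensions are illustrative, not taken from the text): transmitting along $\mathbf{v}_1$ and combining with $\mathbf{u}_1$ yields exactly the scalar gain $\sigma_1$.

```python
import numpy as np

rng = np.random.default_rng(0)
# A random 4x4 complex matrix standing in for a MIMO channel H
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)      # H = U @ diag(s) @ Vh, with s sorted descending
v1 = Vh[0].conj()                # top right singular vector (optimal transmit direction)
u1 = U[:, 0]                     # top left singular vector (optimal receive combiner)

# u1^H H v1 = sigma_1: the strongest sub-channel gain
gain = u1.conj() @ (H @ v1)
print(np.allclose(gain, s[0]))   # True
```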

SVD Geometry: Rotation–Scaling–Rotation

For any matrix $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$, the action on the unit circle reveals the geometric essence of the SVD. The transformation proceeds in three stages: $\mathbf{V}^H$ rotates, $\mathbf{\Sigma}$ scales the circle to an ellipse, and $\mathbf{U}$ rotates to the final image. The singular values are the semi-axis lengths of the ellipse.

SVD in 3D: Unit Sphere to Ellipsoid

The same SVD transformation in three dimensions: the unit sphere is stretched into an ellipsoid along the principal axes. The three singular values are the semi-axis lengths. The camera rotates to reveal the full 3D structure.
For a $3 \times 3$ matrix, the SVD maps the unit sphere to an ellipsoid whose semi-axes have lengths $\sigma_1, \sigma_2, \sigma_3$.

Geometric Interpretation: Rotation–Scaling–Rotation

The SVD $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ reveals that every linear map decomposes into three elementary operations:

  1. Rotation (and reflection) in the domain: $\mathbf{V}^H$ is a unitary transformation that rotates the input space $\mathbb{C}^n$, aligning the coordinate axes with the right singular vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$.

  2. Axis-aligned scaling: $\mathbf{\Sigma}$ stretches each axis by $\sigma_i$, and also changes the dimension of the space (from $n$ to $m$) by padding with zeros or truncating.

  3. Rotation (and reflection) in the codomain: $\mathbf{U}$ rotates the output space $\mathbb{C}^m$, mapping the scaled standard basis vectors to the left singular vectors $\mathbf{u}_1, \ldots, \mathbf{u}_m$.

Unit sphere to ellipsoid. Consider the unit sphere $\mathcal{S} = \{\mathbf{x} \in \mathbb{C}^n : \|\mathbf{x}\| = 1\}$. Under $\mathbf{A}$, this sphere maps to an ellipsoid (possibly degenerate) $\mathbf{A}(\mathcal{S}) = \{\mathbf{A}\mathbf{x} : \|\mathbf{x}\| = 1\}$ whose semi-axes have lengths $\sigma_1, \sigma_2, \ldots, \sigma_r$ and point along the directions $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r$.
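This picture is easy to check numerically. The sketch below (with an arbitrary random real matrix) pushes many unit vectors through $\mathbf{A}$ and confirms that the largest and smallest stretches match $\sigma_1$ and $\sigma_r$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
sigma = np.linalg.svd(A, compute_uv=False)   # singular values, descending

# Push random unit vectors through A and measure the stretch ||Ax||
X = rng.standard_normal((3, 20000))
X /= np.linalg.norm(X, axis=0)
stretch = np.linalg.norm(A @ X, axis=0)

print(stretch.max(), sigma[0])    # max stretch approaches sigma_1 (longest semi-axis)
print(stretch.min(), sigma[-1])   # min stretch approaches sigma_3 (shortest semi-axis)
```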

This geometric picture is the reason SVD is so natural for MIMO: the channel $\mathbf{H}$ maps the transmit signal sphere into an ellipsoid at the receiver, and the semi-axes of that ellipsoid are precisely the sub-channel gains.

Theorem: Eckart–Young Low-Rank Approximation Theorem

Let $\mathbf{A} \in \mathbb{C}^{m \times n}$ have SVD $\mathbf{A} = \sum_{i=1}^{r} \sigma_i \, \mathbf{u}_i \mathbf{v}_i^H$ with $\sigma_1 \geq \cdots \geq \sigma_r > 0$ and $r = \operatorname{rank}(\mathbf{A})$. For $1 \leq k \leq r$, define the rank-$k$ truncated SVD
$$\mathbf{A}_k = \sum_{i=1}^{k} \sigma_i \, \mathbf{u}_i \mathbf{v}_i^H.$$
Then $\mathbf{A}_k$ is a best rank-$k$ approximation of $\mathbf{A}$ in the Frobenius norm:
$$\mathbf{A}_k = \arg\min_{\substack{\mathbf{B} \in \mathbb{C}^{m \times n} \\ \operatorname{rank}(\mathbf{B}) \leq k}} \|\mathbf{A} - \mathbf{B}\|_F,$$
and the approximation error is
$$\|\mathbf{A} - \mathbf{A}_k\|_F = \sqrt{\sigma_{k+1}^2 + \sigma_{k+2}^2 + \cdots + \sigma_r^2}.$$
Moreover, the same result holds for the spectral (operator) norm:
$$\min_{\substack{\mathbf{B} \\ \operatorname{rank}(\mathbf{B}) \leq k}} \|\mathbf{A} - \mathbf{B}\|_2 = \sigma_{k+1}.$$

The SVD sorts the "energy" of a matrix by importance. The first singular vector pair $(\mathbf{u}_1, \mathbf{v}_1)$ captures the single rank-one matrix that best approximates $\mathbf{A}$. Adding more terms improves the approximation greedily, and the Eckart–Young theorem says this greedy approach is globally optimal. In wireless communications, if a channel has rapidly decaying singular values, only a few dominant modes carry significant energy, and the channel is effectively low-rank.
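Both error formulas are straightforward to verify with numpy; the matrix below is an arbitrary random example:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

k = 2
Ak = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]             # rank-k truncated SVD

fro_err = np.linalg.norm(A - Ak, 'fro')
print(np.isclose(fro_err, np.sqrt(np.sum(s[k:]**2))))  # True: Frobenius error formula
print(np.isclose(np.linalg.norm(A - Ak, 2), s[k]))     # True: spectral error = sigma_{k+1}
```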

SVD Geometry: Rotation–Scaling–Rotation

Visualize how the SVD transforms the unit sphere through $\mathbf{V}^H$ (rotate), $\mathbf{\Sigma}$ (scale to ellipsoid), and $\mathbf{U}$ (rotate again). At each step the transformed surface is shown, along with the singular vector directions and their corresponding singular values as axis labels.


Singular Values of a Parameterized Channel Matrix

Explore how singular values change as the channel matrix varies. The condition number $\sigma_1/\sigma_r$ indicates how well-conditioned the channel is. A fully uncorrelated channel ($\rho = 0$) has nearly equal singular values, while a highly correlated channel ($\rho \to 1$) has one dominant singular value and the rest collapse toward zero, resulting in a near-rank-one channel with dramatically reduced spatial multiplexing gain.

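The experiment behind this widget can be reproduced in a few lines. The sketch below assumes the common exponential (Kronecker) correlation model $R_{ij} = \rho^{|i-j|}$; the widget's exact parameterization is not specified, so treat this as one plausible choice:

```python
import numpy as np

def corr_matrix(n, rho):
    """Exponential correlation model: R[i, j] = rho^|i-j| (an assumed model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def correlated_channel(n, rho, rng):
    """Kronecker model: H = R^(1/2) Hw (R^(1/2))^H with i.i.d. Rayleigh Hw."""
    L = np.linalg.cholesky(corr_matrix(n, rho) + 1e-12 * np.eye(n))
    Hw = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return L @ Hw @ L.conj().T

rng = np.random.default_rng(3)
for rho in [0.0, 0.5, 0.9, 0.99]:
    s = np.linalg.svd(correlated_channel(4, rho, rng), compute_uv=False)
    print(f"rho={rho:4.2f}  sigma={np.round(s, 2)}  kappa={s[0] / s[-1]:7.1f}")
```

As $\rho \to 1$ the trailing singular values collapse and the condition number blows up, exactly the near-rank-one behavior described above.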

Progressive Rank-$k$ Approximation

Watch how adding successive rank-1 terms $\sigma_i \mathbf{u}_i \mathbf{v}_i^H$ progressively reconstructs the original matrix. At each frame $k$, the approximation $\mathbf{A}_k = \sum_{i=1}^{k} \sigma_i \mathbf{u}_i \mathbf{v}_i^H$ is displayed alongside the Frobenius-norm error $\|\mathbf{A} - \mathbf{A}_k\|_F$. The Eckart–Young theorem guarantees this is the optimal rank-$k$ approximation.


Eigendecomposition vs. Singular Value Decomposition

| Property | Eigendecomposition | Singular Value Decomposition (SVD) |
|---|---|---|
| Applicable to | Square matrices ($n \times n$) only | Any matrix ($m \times n$), square or rectangular |
| Decomposition form | $\mathbf{A} = \mathbf{P}\mathbf{\Lambda}\mathbf{P}^{-1}$ (if diagonalizable) | $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ (always exists) |
| Factor structure | $\mathbf{P}$ generally not unitary; $\mathbf{\Lambda}$ complex diagonal | $\mathbf{U}, \mathbf{V}$ both unitary; $\mathbf{\Sigma}$ real non-negative diagonal |
| Existence | Not guaranteed (defective matrices have no eigendecomposition) | Always exists for every matrix |
| Spectrum | Eigenvalues $\lambda_i \in \mathbb{C}$ (complex in general) | Singular values $\sigma_i \in \mathbb{R}_{\geq 0}$ (always real, non-negative) |
| Geometric meaning | Directions scaled without rotation (eigenvectors) | Rotation $\to$ scaling $\to$ rotation (unit sphere $\to$ ellipsoid) |
| Relation | Eigenvalues of $\mathbf{A}^H\mathbf{A}$ are $\sigma_i^2$ | For Hermitian $\mathbf{A} \succeq 0$: $\sigma_i = \lambda_i$ (eigenvalues = singular values) |
| Numerical stability | Can be ill-conditioned for non-normal matrices | Always numerically stable (backward-stable algorithms exist) |
| Low-rank approximation | No direct optimality guarantee | Truncated SVD is optimal (Eckart–Young theorem) |
| Telecom application | Covariance eigenanalysis, PCA, stability analysis | MIMO channel decomposition, beamforming, channel estimation |

Example: SVD of a $2 \times 3$ Matrix

Compute the full SVD of
$$\mathbf{A} = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \in \mathbb{R}^{2 \times 3}.$$
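Working this by hand, $\mathbf{A}\mathbf{A}^T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ has eigenvalues $3$ and $1$, so $\sigma_1 = \sqrt{3}$ and $\sigma_2 = 1$. A short numpy check of this answer (a verification sketch, not part of the original exercise):

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [0., 1., 1.]])

U, s, Vh = np.linalg.svd(A)        # full SVD: U is 2x2, Vh is 3x3
print(s)                           # [1.7320... 1.0] = [sqrt(3), 1]

# The third right singular vector spans the null space of A,
# proportional to (1, -1, 1):
print(np.allclose(A @ Vh[2], 0))   # True
```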

Historical Note: From Beltrami to Eckart–Young: The History of SVD

1873–1965

The singular value decomposition has a remarkably long and multi-threaded history, with key ideas discovered independently by several mathematicians:

Beltrami (1873). Eugenio Beltrami, an Italian mathematician best known for his work in differential geometry, was the first to consider the decomposition of a bilinear form into canonical form. In his 1873 paper "Sulle funzioni bilineari," he showed that a real bilinear form can be reduced to a sum of products of orthogonal linear forms, essentially the real SVD for square matrices. He identified the singular values as the positive square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$.

Jordan (1874). Just one year later, Camille Jordan independently obtained the same canonical decomposition. Jordan, already famous for the Jordan normal form (1870), approached the problem from the theory of bilinear and quadratic forms. His proof, published in the Journal de mathématiques pures et appliquées, used different techniques from Beltrami's but arrived at the same result.

Schmidt (1907). Erhard Schmidt extended the SVD to integral operators (compact operators on function spaces), presaging the modern functional-analytic viewpoint. His work introduced the "Schmidt pairs" (singular vector pairs) and established the convergence of the singular-value expansion.

Eckart and Young (1936). Carl Eckart and Gale Young proved the optimality of the truncated SVD for low-rank approximation. Their 1936 paper "The Approximation of One Matrix by Another of Lower Rank" in Psychometrika showed that the best rank-$k$ approximation (in the Frobenius norm) is obtained by keeping the $k$ largest singular values. This result is now fundamental in data compression, signal processing, and machine learning.

Golub and Kahan (1965). Gene Golub and William Kahan developed the first numerically stable algorithm for computing the SVD, based on bidiagonalization followed by QR-type iterations. This algorithmic breakthrough made the SVD practical for large-scale computation and is the ancestor of the SVD routines in LAPACK and MATLAB.

⚠️ Engineering Note

SVD Computation: Cost and Algorithm Choice

The SVD of an $m \times n$ matrix (with $m \geq n$) costs $O(mn^2)$ flops using the Golub–Kahan bidiagonalization algorithm (LAPACK's dgesvd). For typical MIMO dimensions:

  • $4 \times 4$ MIMO: ~256 flops (negligible)
  • $64 \times 16$ massive MIMO: ~$10^5$ flops (fast)
  • $256 \times 64$ XL-MIMO: ~$10^8$ flops (requires optimization)

When only the top $k$ singular values/vectors are needed (e.g., for rank-$k$ channel approximation), use a truncated SVD (scipy.sparse.linalg.svds), which costs $O(mnk)$, a major savings when $k \ll n$; a usage sketch follows below. In real-time MIMO receivers, the full SVD is computed once per coherence interval ($\sim$1 ms in 5G NR at 30 kHz SCS). At $64 \times 16$ dimensions and a 1 ms update rate, this is well within the computational budget of a modern baseband processor.
Practical Constraints
  • LAPACK dgesvd: $\sim 4mn^2 + 8n^3$ flops for the full SVD

  • 5G NR slot duration at 30 kHz SCS: 0.5 ms (14 OFDM symbols)

  • For real-time: prefer the thin (economy-size) SVD ($\mathbf{U}$ is $m \times n$, not $m \times m$) to save memory
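A usage sketch of the truncated SVD mentioned above (SciPy's svds; the matrix size is illustrative). Note that svds returns the singular values in ascending order:

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(4)
H = rng.standard_normal((256, 64))             # XL-MIMO-sized example

k = 4                                          # only the top-k triplets: O(mnk)
U, s, Vt = svds(H, k=k)
s = s[::-1]                                    # ascending -> descending

print(s)
print(np.linalg.svd(H, compute_uv=False)[:k])  # matches the top-k of the full SVD
```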

Key Takeaway

The SVD is the decomposition for wireless communications. For any channel matrix $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$: (1) the singular values $\sigma_i$ are the sub-channel gains; (2) the right singular vectors $\mathbf{v}_i$ are the optimal transmit directions (beamformers); (3) the left singular vectors $\mathbf{u}_i$ are the optimal receive combiners; (4) the number of nonzero singular values equals the spatial multiplexing rank; and (5) the condition number $\sigma_1/\sigma_r$ governs sensitivity to noise. Unlike eigendecomposition, SVD works for any matrix, square or rectangular, Hermitian or not, making it the universally applicable tool. Every capacity formula, every beamforming design, and every channel estimation algorithm in MIMO communications is, at its core, an application of SVD.

Why This Matters: MIMO Channel SVD: Parallel Sub-channels

Consider a narrowband MIMO system with $n_t$ transmit and $n_r$ receive antennas:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n},$$
where $\mathbf{H} \in \mathbb{C}^{n_r \times n_t}$ is the channel matrix and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I})$.

Let $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$ be the SVD with singular values $\sigma_1 \geq \cdots \geq \sigma_r > 0$ and $r = \operatorname{rank}(\mathbf{H}) \leq \min(n_t, n_r)$.

Precoding and combining. If the transmitter precodes with $\mathbf{V}$ (i.e., transmits $\mathbf{x} = \mathbf{V}\tilde{\mathbf{x}}$) and the receiver applies $\mathbf{U}^H$ (i.e., forms $\tilde{\mathbf{y}} = \mathbf{U}^H \mathbf{y}$), then:
$$\tilde{\mathbf{y}} = \mathbf{U}^H \mathbf{H} \mathbf{V} \tilde{\mathbf{x}} + \mathbf{U}^H \mathbf{n} = \mathbf{\Sigma} \tilde{\mathbf{x}} + \tilde{\mathbf{n}},$$
where $\tilde{\mathbf{n}} = \mathbf{U}^H \mathbf{n}$ has the same distribution as $\mathbf{n}$ (since $\mathbf{U}$ is unitary and preserves the i.i.d. Gaussian distribution).

This decouples into $r$ independent scalar sub-channels:
$$\tilde{y}_i = \sigma_i \tilde{x}_i + \tilde{n}_i, \qquad i = 1, \ldots, r.$$

The capacity with full CSI at both ends is therefore:
$$C = \sum_{i=1}^{r} \log_2\!\left(1 + \frac{p_i \sigma_i^2}{\sigma_n^2}\right) \quad \text{bits/s/Hz},$$
where the power allocation $\{p_i\}$ is determined by water-filling: $p_i = (\mu - \sigma_n^2/\sigma_i^2)^+$ subject to $\sum_i p_i \leq P$.
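A compact end-to-end sketch of this pipeline: SVD precoding/combining, then water-filling computed by bisection on the water level $\mu$. The dimensions, power budget, and noise variance are illustrative assumptions:

```python
import numpy as np

def waterfill(sigma, P, noise_var=1.0):
    """Water-filling: p_i = (mu - noise_var / sigma_i^2)^+ with sum(p) = P."""
    floor = noise_var / sigma**2
    lo, hi = floor.min(), floor.max() + P
    for _ in range(60):                              # bisection on mu
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - floor, 0.0).sum() > P:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.5 * (lo + hi) - floor, 0.0)

rng = np.random.default_rng(5)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
U, s, Vh = np.linalg.svd(H)

# Precoding with V and combining with U^H diagonalizes the channel:
print(np.allclose(U.conj().T @ H @ Vh.conj().T, np.diag(s)))  # True

p = waterfill(s, P=10.0)
C = np.log2(1.0 + p * s**2).sum()                    # capacity at noise_var = 1
print(np.round(p, 3), round(C, 2))
```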

Key insight: The SVD simultaneously diagonalizes the channel, orthogonalizes the noise, and reveals the optimal signaling directions. No other decomposition achieves all three.

See full treatment in MIMO Capacity: Deterministic Channels

Common Mistake: Singular Values Are Not Eigenvalues

Mistake:

A common error is to conflate the singular values of a matrix $\mathbf{A}$ with its eigenvalues. Students often write $\sigma_i(\mathbf{A}) = |\lambda_i(\mathbf{A})|$ or assume that the SVD and eigendecomposition produce the same factors.

Correction:

Singular values and eigenvalues are distinct concepts that coincide only in special cases:

1. Different definitions. Eigenvalues satisfy $\mathbf{A}\mathbf{v} = \lambda \mathbf{v}$ (require $\mathbf{A}$ square). Singular values are $\sigma_i = \sqrt{\lambda_i(\mathbf{A}^H\mathbf{A})}$ (work for any matrix).

2. $\sigma_i \neq |\lambda_i|$ in general. For the matrix $\mathbf{A} = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}$, both eigenvalues are $0$, yet $\sigma_1 = 2$, $\sigma_2 = 0$.

3. When they do agree.

  • Hermitian PSD matrices: $\mathbf{A} = \mathbf{A}^H \succeq 0$ implies $\sigma_i = \lambda_i$ (eigenvalues are the singular values).
  • Hermitian matrices (general): $\sigma_i = |\lambda_i|$.
  • Normal matrices: $\sigma_i = |\lambda_i|$ (with the $|\lambda_i|$ sorted in decreasing order).
  • Non-normal matrices: No simple relation. The singular values depend on the interaction between $\mathbf{A}$ and $\mathbf{A}^H$, not just on the eigenvalues.

4. Practical consequence. In MIMO, the channel gains are the singular values of $\mathbf{H}$, not its eigenvalues (which may not even exist if $\mathbf{H}$ is rectangular). The eigenvalues of $\mathbf{H}^H\mathbf{H}$ are $\sigma_i^2$, which is a related but different quantity.
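The nilpotent matrix from point 2 makes a quick sanity check in numpy:

```python
import numpy as np

A = np.array([[0., 2.],
              [0., 0.]])
print(np.linalg.eigvals(A))                # [0. 0.]  both eigenvalues vanish
print(np.linalg.svd(A, compute_uv=False))  # [2. 0.]  yet sigma_1 = 2
```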

Singular Value

For a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$, the $i$-th singular value $\sigma_i$ is the positive square root of the $i$-th largest eigenvalue of $\mathbf{A}^H \mathbf{A}$. Equivalently, $\sigma_i$ is the $i$-th semi-axis length of the ellipsoid obtained by mapping the unit sphere through $\mathbf{A}$. Singular values are always real and non-negative, ordered $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$.

Related: singular vector, SVD Existence Theorem, Eigenvalue and Eigenvector, Frobenius Norm

Frobenius Norm

The Frobenius norm of a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ is
$$\|\mathbf{A}\|_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2} = \sqrt{\operatorname{tr}(\mathbf{A}^H \mathbf{A})} = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2},$$
where $\sigma_1, \ldots, \sigma_r$ are the singular values. The Frobenius norm is unitarily invariant: $\|\mathbf{U}\mathbf{A}\mathbf{V}\|_F = \|\mathbf{A}\|_F$ for any unitary $\mathbf{U}$ and $\mathbf{V}$.

Related: Singular Value, spectral norm, Trace of a Matrix

Condition Number

The condition number of a matrix $\mathbf{A}$ with respect to the 2-norm is
$$\kappa(\mathbf{A}) = \frac{\sigma_1}{\sigma_r},$$
where $\sigma_1$ is the largest singular value and $\sigma_r$ is the smallest nonzero singular value. A matrix with $\kappa \approx 1$ is well-conditioned; $\kappa \gg 1$ indicates ill-conditioning. In MIMO, the condition number of the channel matrix determines how sensitive zero-forcing receivers are to noise amplification.

Related: Singular Value, Why SVD Is the Backbone of MIMO, Null Space and Zero-Forcing in Multiuser MIMO

Quick Check

Let $\mathbf{A} \in \mathbb{C}^{3 \times 2}$ have singular values $\sigma_1 = 5$ and $\sigma_2 = 3$. What is $\|\mathbf{A}\|_F$?

$8$

$\sqrt{34}$

$5$

$15$

Quick Check

A $4 \times 4$ MIMO channel matrix $\mathbf{H}$ has singular values $\sigma_1 = 3$, $\sigma_2 = 2$, $\sigma_3 = 1$, $\sigma_4 = 0.5$. What is the Frobenius-norm error of the best rank-2 approximation $\mathbf{H}_2$?

$1.5$

$\sqrt{1.25}$

$1$

$\sqrt{14.25}$