Eigenvalue Decomposition

Why Eigenvalues Matter

Eigenvalues and eigenvectors expose the intrinsic geometry of a linear transformation. When you decompose a matrix into its eigenstructure you discover the directions along which the map acts as simple scaling — the natural modes of the system.

In telecommunications this idea is pervasive:

  • Channel covariance. The eigenvalues of the spatial covariance matrix $\mathbf{R} = \mathbb{E}[\mathbf{h}\mathbf{h}^H]$ quantify how much energy the channel concentrates along each angular direction.
  • MIMO capacity. The capacity-achieving input covariance is diagonal in the eigenbasis of $\mathbf{H}^H \mathbf{H}$; the water-filling solution allocates power to eigenmodes whose eigenvalues (channel gains) exceed a threshold.
  • Vibration and stability. Eigenvalues of a system matrix determine whether feedback loops oscillate, converge, or diverge — a concept inherited directly from the theory of differential equations.
  • Principal Component Analysis (PCA). The dominant eigenvectors of a data covariance matrix are the directions of maximum variance, used for dimensionality reduction in signal processing and machine learning.

Understanding eigendecomposition is therefore not a detour; it is the very language in which channel capacity, beamforming optimality, and adaptive filtering are expressed.

Definition: Eigenvalue and Eigenvector

Let $\mathbf{A} \in \mathbb{C}^{n \times n}$. A scalar $\lambda \in \mathbb{C}$ is an eigenvalue of $\mathbf{A}$ if there exists a nonzero vector $\mathbf{v} \in \mathbb{C}^n$ such that $\mathbf{A}\mathbf{v} = \lambda \mathbf{v}$. The vector $\mathbf{v}$ is called an eigenvector associated with $\lambda$. The set of all eigenvalues is the spectrum of $\mathbf{A}$, denoted $\sigma(\mathbf{A})$.

Eigenvalues are the roots of the characteristic polynomial $p(\lambda) = \det(\mathbf{A} - \lambda \mathbf{I}) = 0$, which is a degree-$n$ polynomial in $\lambda$. By the Fundamental Theorem of Algebra, $p$ has exactly $n$ roots (counted with multiplicity) in $\mathbb{C}$.
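For a concrete instance, here is a tiny numerical check (a sketch assuming numpy; the matrix is made up for illustration, and np.poly applied to a square matrix returns its characteristic-polynomial coefficients):

```python
import numpy as np

# Illustrative 2x2 matrix: trace 5, determinant 4, so p(lambda) = lambda^2 - 5*lambda + 4.
A = np.array([[2.0, 2.0],
              [1.0, 3.0]])

# np.poly on a square matrix returns the (monic) characteristic polynomial
# coefficients, highest degree first: here [1, -5, 4].
coeffs = np.poly(A)
print(coeffs)                 # [ 1. -5.  4.]

# Its roots coincide with the eigenvalues. (Fine for a 2x2 demo; see the
# engineering note later in this section on why root-finding is unsafe at scale.)
print(np.roots(coeffs))       # [4. 1.]
print(np.linalg.eigvals(A))   # [1. 4.] (ordering may differ)
```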

Definition: Characteristic Polynomial

The characteristic polynomial of $\mathbf{A} \in \mathbb{C}^{n \times n}$ is $p(\lambda) = \det(\mathbf{A} - \lambda \mathbf{I})$. It is a polynomial of degree $n$ with leading coefficient $(-1)^n$ (equivalently, $\det(\lambda \mathbf{I} - \mathbf{A})$ is monic). Its roots are precisely the eigenvalues of $\mathbf{A}$.

The algebraic multiplicity of an eigenvalue $\lambda_0$ is its multiplicity as a root of $p(\lambda)$. The geometric multiplicity is $\dim \mathcal{N}(\mathbf{A} - \lambda_0 \mathbf{I})$, i.e., the dimension of the corresponding eigenspace. One always has $1 \leq \text{geometric multiplicity} \leq \text{algebraic multiplicity}$.

When the algebraic and geometric multiplicities coincide for every eigenvalue, the matrix is diagonalizable.

Definition: Eigendecomposition (Diagonalization)

A matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ is diagonalizable if there exists an invertible matrix $\mathbf{P} \in \mathbb{C}^{n \times n}$ and a diagonal matrix $\mathbf{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ such that $\mathbf{A} = \mathbf{P} \mathbf{\Lambda} \mathbf{P}^{-1}$. The columns of $\mathbf{P}$ are eigenvectors of $\mathbf{A}$, and the diagonal entries of $\mathbf{\Lambda}$ are the corresponding eigenvalues.

A matrix is diagonalizable if and only if it possesses nn linearly independent eigenvectors.

Not every matrix is diagonalizable. For example, the matrix $\mathbf{A} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has eigenvalue $0$ with algebraic multiplicity $2$ but geometric multiplicity $1$. However, every Hermitian (or more generally, normal) matrix is diagonalizable, and moreover can be diagonalized by a unitary matrix.
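A quick numerical illustration of the difference (a minimal numpy sketch; the Hermitian example is an arbitrary real symmetric matrix chosen for contrast):

```python
import numpy as np

# The defective matrix from the text: eigenvalue 0 with algebraic
# multiplicity 2 but only one independent eigenvector.
A_def = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
w, V = np.linalg.eig(A_def)
print(w)                            # [0. 0.]
print(np.linalg.matrix_rank(V))     # 1 (up to tolerance): eigenvectors don't span C^2

# A real symmetric (hence Hermitian) matrix always diagonalizes: A = P Lambda P^{-1}.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, P = np.linalg.eig(A)
print(np.allclose(A, P @ np.diag(w) @ np.linalg.inv(P)))   # True
```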

Definition: Hermitian Matrix

A matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ is Hermitian (or self-adjoint) if $\mathbf{A}^H = \mathbf{A}$, where $\mathbf{A}^H = \overline{\mathbf{A}}^T$ denotes the conjugate transpose.

In particular, the diagonal entries of a Hermitian matrix are real, and the off-diagonal entries satisfy $a_{ij} = \overline{a_{ji}}$.

A Hermitian matrix with real entries is called symmetric.

In wireless communications, covariance matrices $\mathbf{R} = \mathbb{E}[\mathbf{x}\mathbf{x}^H]$ and Gram matrices $\mathbf{H}^H \mathbf{H}$ are always Hermitian (and positive semidefinite).
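This is easy to verify empirically. A minimal sketch (assuming numpy; the dimensions, sample count, and seed are arbitrary):

```python
import numpy as np

# Sample covariance R_hat = (1/N) * sum_k x_k x_k^H from N complex snapshots.
rng = np.random.default_rng(0)
n, N = 4, 1000
X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
R_hat = X @ X.conj().T / N

print(np.allclose(R_hat, R_hat.conj().T))             # Hermitian: R = R^H
print(np.all(np.linalg.eigvalsh(R_hat) >= -1e-12))    # PSD: eigenvalues >= 0
```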

Theorem: Eigenvalues of Hermitian Matrices Are Real

Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ be Hermitian. Then every eigenvalue of $\mathbf{A}$ is real.

Theorem: Spectral Theorem for Hermitian Matrices

Every Hermitian matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ can be decomposed as $\mathbf{A} = \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^H$, where $\mathbf{Q}$ is unitary ($\mathbf{Q}^H \mathbf{Q} = \mathbf{I}$) and $\mathbf{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ with all $\lambda_i \in \mathbb{R}$. Equivalently, $\mathbf{A} = \sum_{i=1}^{n} \lambda_i \, \mathbf{q}_i \mathbf{q}_i^H$, where $\{\mathbf{q}_1, \ldots, \mathbf{q}_n\}$ is an orthonormal basis of eigenvectors.

A Hermitian matrix acts by stretching space along its eigenvectors (which are orthogonal) by real scaling factors (the eigenvalues). It has no "rotational" component. The spectral decomposition separates the geometry (the orthonormal eigenvectors) from the magnitude of the action (the real eigenvalues).
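The theorem is directly checkable in code. A minimal sketch (assuming numpy; the Hermitian test matrix is random):

```python
import numpy as np

# Build a random Hermitian matrix by symmetrizing a complex matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2

# eigh is specialized to Hermitian input: real eigenvalues, unitary Q.
lam, Q = np.linalg.eigh(A)
print(lam.dtype)                                # real spectrum (float, not complex)
print(np.allclose(Q.conj().T @ Q, np.eye(4)))   # True: Q^H Q = I

# Rank-one form of the spectral theorem: A = sum_i lam_i q_i q_i^H.
A_rebuilt = sum(l * np.outer(q, q.conj()) for l, q in zip(lam, Q.T))
print(np.allclose(A, A_rebuilt))                # True
```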

Definition: Rayleigh Quotient

For a Hermitian matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ and nonzero $\mathbf{x} \in \mathbb{C}^n$, the Rayleigh quotient is $$R(\mathbf{A}, \mathbf{x}) = \frac{\mathbf{x}^H \mathbf{A} \mathbf{x}}{\mathbf{x}^H \mathbf{x}}.$$ The Rayleigh quotient is scale-invariant: $R(\mathbf{A}, \alpha \mathbf{x}) = R(\mathbf{A}, \mathbf{x})$ for any $\alpha \neq 0$.

The Rayleigh quotient is bounded between the smallest and largest eigenvalues of $\mathbf{A}$ (see the theorem Extremal Properties of the Rayleigh Quotient below). In wireless communications, SNR expressions such as $\text{SNR} = \mathbf{w}^H \mathbf{R}_s \mathbf{w} / \mathbf{w}^H \mathbf{R}_n \mathbf{w}$ (signal-to-noise ratio after beamforming with weight vector $\mathbf{w}$) are generalized Rayleigh quotients. Maximizing the SNR thus reduces to an eigenvalue problem.
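To make that concrete, the sketch below solves the max-SNR beamforming problem as a generalized eigenvalue problem with scipy.linalg.eigh; the array size, steering vector, and covariance models are invented for illustration:

```python
import numpy as np
from scipy.linalg import eigh

n = 4                                                  # hypothetical array size
rng = np.random.default_rng(0)
a = np.exp(1j * np.pi * np.arange(n) * np.sin(0.3))    # hypothetical steering vector
Rs = np.outer(a, a.conj())                             # rank-one signal covariance
Rn = np.eye(n) + 0.5 * np.diag(rng.random(n))          # noise covariance (Hermitian PD)

# Maximizing w^H Rs w / w^H Rn w is the generalized eigenproblem Rs w = lambda Rn w.
# eigh returns eigenvalues in ascending order, so the last pair is the maximizer.
lam, W = eigh(Rs, Rn)
w_opt, snr_max = W[:, -1], lam[-1]

snr = (w_opt.conj() @ Rs @ w_opt) / (w_opt.conj() @ Rn @ w_opt)
print(np.isclose(snr.real, snr_max))                   # True: achieved SNR = top eigenvalue
```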

Theorem: Extremal Properties of the Rayleigh Quotient

Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ be Hermitian with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ and corresponding orthonormal eigenvectors $\mathbf{q}_1, \ldots, \mathbf{q}_n$. Then for all nonzero $\mathbf{x} \in \mathbb{C}^n$: $$\lambda_n \leq R(\mathbf{A}, \mathbf{x}) \leq \lambda_1.$$ Moreover, $$\max_{\mathbf{x} \neq \mathbf{0}} R(\mathbf{A}, \mathbf{x}) = \lambda_1, \quad \text{achieved at } \mathbf{x} = \mathbf{q}_1, \qquad \min_{\mathbf{x} \neq \mathbf{0}} R(\mathbf{A}, \mathbf{x}) = \lambda_n, \quad \text{achieved at } \mathbf{x} = \mathbf{q}_n.$$

Since eigenvectors form an orthonormal basis, any vector x\mathbf{x} is a weighted combination of eigenvectors. The Rayleigh quotient computes a weighted average of the eigenvalues, where the weights are the squared magnitudes of the expansion coefficients. A weighted average always lies between the minimum and maximum values being averaged.

Theorem: Courant–Fischer Min-Max Theorem

Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ be Hermitian with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. Then for each $k = 1, \ldots, n$: $$\lambda_k = \max_{\substack{\mathcal{V} \subseteq \mathbb{C}^n \\ \dim \mathcal{V} = k}} \; \min_{\substack{\mathbf{x} \in \mathcal{V} \\ \mathbf{x} \neq \mathbf{0}}} R(\mathbf{A}, \mathbf{x}) = \min_{\substack{\mathcal{W} \subseteq \mathbb{C}^n \\ \dim \mathcal{W} = n - k + 1}} \; \max_{\substack{\mathbf{x} \in \mathcal{W} \\ \mathbf{x} \neq \mathbf{0}}} R(\mathbf{A}, \mathbf{x}).$$

The kk-th eigenvalue is the "best worst case" of the Rayleigh quotient over all kk-dimensional subspaces. This characterization is extremely powerful because it does not reference eigenvectors — it describes eigenvalues purely through optimization.

Eigenvectors as Fixed Directions of a Linear Map

Visualize how a 2×2 Hermitian matrix transforms vectors. Eigenvectors (shown in red) maintain their direction, only getting scaled by their eigenvalues. The unit circle is mapped to an ellipse whose semi-axes align with the eigenvectors; the lengths of the semi-axes equal the absolute values of the eigenvalues.

Parameters: top-left entry (default 2), real part of off-diagonal (default 1), imaginary part of off-diagonal (default 0), bottom-right entry (default 1).

Power Iteration Converging to Dominant Eigenvector

Watch how the power iteration $\mathbf{x}_{k+1} = \mathbf{A}\mathbf{x}_k / \|\mathbf{A}\mathbf{x}_k\|$ converges to the dominant eigenvector. At each step, the iterate is shown as a blue arrow; the true dominant eigenvector is shown in red. A bar chart tracks the Rayleigh quotient $R(\mathbf{A}, \mathbf{x}_k)$ converging to $\lambda_1$.

Parameters: number of iterations (default 20); ratio of dominant to subdominant eigenvalue (default 2, controls convergence speed).

Example: Eigendecomposition of a 2×2 Hermitian Matrix

Find the eigendecomposition of $\mathbf{A} = \begin{pmatrix} 3 & 1 - j \\ 1 + j & 1 \end{pmatrix}$.

Example: Eigendecomposition of a 3×3 Real Symmetric Matrix

Find the eigenvalues of $\mathbf{A} = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}$.
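Since the solutions are not worked out here, a quick numerical check of both examples may help (a sketch using numpy.linalg.eigh, which applies because both matrices are Hermitian):

```python
import numpy as np

# 2x2 Hermitian example: characteristic polynomial lambda^2 - 4*lambda + 1,
# so the eigenvalues are 2 +/- sqrt(3).
A2 = np.array([[3, 1 - 1j],
               [1 + 1j, 1]])
lam2, Q2 = np.linalg.eigh(A2)
print(lam2)     # [0.2679... 3.7320...] = [2 - sqrt(3), 2 + sqrt(3)]

# 3x3 real symmetric example: eigenvalues 1, 2, 4.
A3 = np.array([[2.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 2.0]])
lam3, Q3 = np.linalg.eigh(A3)
print(lam3)                                         # [1. 2. 4.]
print(np.allclose(A3, Q3 @ np.diag(lam3) @ Q3.T))   # True: A = Q Lambda Q^T
```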

Historical Note: From Vibrating Strings to Abstract Algebra: The History of Eigenvalues

1740s–1910

The concept of eigenvalues emerged gradually over two centuries of mathematics:

Euler and d'Alembert (1740s–1750s). The study of coupled oscillations and the rotation of rigid bodies led Euler to consider equations of the form $\mathbf{A}\mathbf{x} = \lambda \mathbf{x}$. The term "secular equation" (from the Latin saecularis, relating to long-period astronomical perturbations) was used for the characteristic polynomial.

Cauchy (1829). Augustin-Louis Cauchy proved that the eigenvalues of a real symmetric matrix are real, establishing the first rigorous version of what would become the spectral theorem. He also showed that symmetric matrices have orthogonal eigenvectors.

Sylvester (1852). James Joseph Sylvester introduced the term "matrix" (from the Latin for "womb") and developed much of the algebraic machinery for determinants and characteristic polynomials.

Hilbert (1904–1910). David Hilbert extended the spectral theorem to infinite-dimensional spaces (integral operators), founding the field of functional analysis. His spectral theory is the mathematical backbone of quantum mechanics and modern signal processing.

The word "Eigenwert" (own value) was coined by Hilbert in 1904. The English hybrid "eigenvalue" preserves the German prefix, a testament to the concept's Germanic origins.

Common Mistake: Common Misconceptions About Eigendecomposition

Mistake:

Several common misconceptions about eigendecomposition can lead to errors in analysis and computation.

Correction:

1. Eigendecomposition requires a square matrix. Only square matrices ($n \times n$) have eigenvalues. For rectangular matrices, use the singular value decomposition (SVD) instead (see §Singular Value Decomposition).

2. Eigenvectors are not always orthogonal. For a general (non-Hermitian) matrix, eigenvectors corresponding to distinct eigenvalues need not be orthogonal. Orthogonality is guaranteed only for normal matrices ($\mathbf{A}^H \mathbf{A} = \mathbf{A}\mathbf{A}^H$), of which Hermitian matrices are a special case.

3. Not every matrix is diagonalizable. Defective matrices (where geometric multiplicity < algebraic multiplicity for some eigenvalue) cannot be diagonalized. The Jordan normal form handles such matrices but is numerically unstable and rarely used in practice.

4. Eigenvalues of a product $\neq$ products of eigenvalues (in general). $\sigma(\mathbf{A}\mathbf{B}) \neq \{\lambda_i(\mathbf{A})\lambda_i(\mathbf{B})\}$ unless $\mathbf{A}$ and $\mathbf{B}$ share a common eigenbasis (e.g., when they commute and are both diagonalizable).

5. Numerical eigenvalue computation never uses $\det(\mathbf{A} - \lambda \mathbf{I}) = 0$. Finding roots of the characteristic polynomial is numerically unstable. Practical algorithms (QR iteration, divide-and-conquer) work directly with the matrix and achieve backward-stable results.

🔧Engineering Note

Practical Eigenvalue Algorithms and Their Costs

Never compute eigenvalues via the characteristic polynomial — it is numerically catastrophic even for $4 \times 4$ matrices. Production algorithms:

  • QR iteration (general matrices): $O(n^3)$ per step, converges in $O(n)$ steps. Total: $O(n^4)$ worst case, $O(n^3)$ typical.
  • Divide-and-conquer (symmetric/Hermitian): $O(n^{2.3})$ average, fastest for moderate $n$. Used by numpy.linalg.eigh.
  • Lanczos/Arnoldi (large sparse): $O(k \cdot \text{nnz})$ for the top $k$ eigenvalues, where $\text{nnz}$ is the number of nonzero entries. Used by scipy.sparse.linalg.eigsh.

For MIMO covariance matrices ($n \leq 256$), numpy.linalg.eigh (LAPACK's divide-and-conquer) is optimal. For massive MIMO spatial correlation matrices ($n > 1000$), use Lanczos; a sketch of both calls follows this note.
Practical Constraints
  • LAPACK dsyevd (divide-and-conquer): $\sim 4n^3$ flops for all eigenvalues
  • For channel covariance estimation in 5G NR: $n_t = 32$ → eigendecomposition costs ~130K flops (negligible)
  • For XL-MIMO with $n_t = 1024$: eigendecomposition costs ~4G flops (requires optimization)
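A minimal sketch of both regimes (assuming numpy and scipy; sizes, density, and seeds are arbitrary):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Moderate n, dense Hermitian: LAPACK divide-and-conquer via eigh/eigvalsh.
rng = np.random.default_rng(0)
B = rng.standard_normal((256, 256))
A = (B + B.T) / 2
lam_all = np.linalg.eigvalsh(A)          # all eigenvalues, ascending
print(lam_all[-1])                        # largest eigenvalue

# Large sparse symmetric: Lanczos via eigsh for only the top k eigenpairs.
S = sp.random(5000, 5000, density=1e-3, random_state=0)
S = (S + S.T) / 2                         # symmetrize
lam_top, _ = eigsh(S, k=6, which='LA')    # 6 largest (algebraic) eigenvalues
print(lam_top)
```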

🔧Engineering Note

Power Iteration: When Simple Beats Sophisticated

The power iteration converges to the dominant eigenvector with error shrinking by a factor $|\lambda_2/\lambda_1|$ per iteration (i.e., as $|\lambda_2/\lambda_1|^k$ after $k$ steps). Each iteration costs $O(n^2)$ (one matrix–vector product). Despite its simplicity:

  • It is the method of choice when only the largest eigenvalue is needed (e.g., spectral radius for stability analysis, dominant singular value for condition estimation).
  • Google's PageRank is essentially power iteration on a $10^{10} \times 10^{10}$ sparse matrix.
  • In wireless: the dominant eigenvector of $\mathbf{H}^H\mathbf{H}$ gives the optimal beamforming direction. Power iteration computes it with $O(n_t^2)$ per step — much cheaper than a full eigendecomposition (a runnable sketch follows this note).

Convergence is slow when $|\lambda_2/\lambda_1| \approx 1$. In that case, use inverse iteration (which converges to the eigenvector of the smallest eigenvalue at rate $|\lambda_n/\lambda_{n-1}|^k$) or Rayleigh quotient iteration (cubic convergence).
Practical Constraints
  • Power iteration per step: one matrix–vector product ($O(n^2)$ dense, $O(\text{nnz})$ sparse)
  • Typical convergence: 10–50 iterations for $|\lambda_2/\lambda_1| < 0.9$
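The runnable sketch referenced above: a bare-bones power iteration for a Hermitian matrix (here a real PSD Gram matrix standing in for $\mathbf{H}^H\mathbf{H}$; sizes, seed, and iteration count are arbitrary):

```python
import numpy as np

def power_iteration(A, num_iters=50, seed=0):
    """Return (Rayleigh quotient, unit vector) approximating the dominant
    eigenpair of a Hermitian matrix A. A sketch, not production code."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x = x / np.linalg.norm(x)
    for _ in range(num_iters):
        x = A @ x                       # one matrix-vector product per step
        x = x / np.linalg.norm(x)       # renormalize to avoid overflow
    return (x.conj() @ A @ x).real, x   # Rayleigh quotient -> lambda_1

# Gram matrix H^T H (real case): PSD, so the dominant |eigenvalue| is lambda_max.
rng = np.random.default_rng(1)
H = rng.standard_normal((64, 50))
A = H.T @ H
lam1_pi, v = power_iteration(A)
print(lam1_pi, np.linalg.eigvalsh(A)[-1])   # the two values should agree closely
```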

Key Takeaway

A Hermitian matrix is nothing more than scaling along orthogonal directions: $\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^H$ says that $\mathbf{A}$ stretches each eigenvector $\mathbf{q}_i$ by the real factor $\lambda_i$ and does nothing else.

Implications for computation: matrix powers become trivial ($\mathbf{A}^k = \mathbf{Q}\mathbf{\Lambda}^k \mathbf{Q}^H$), the inverse is $\mathbf{A}^{-1} = \mathbf{Q}\mathbf{\Lambda}^{-1}\mathbf{Q}^H$ (when it exists), and any analytic function $f(\mathbf{A})$ is simply $\mathbf{Q}\operatorname{diag}(f(\lambda_1), \ldots, f(\lambda_n))\mathbf{Q}^H$.
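A short check of these identities (a sketch assuming numpy and scipy; the test matrix is a random real symmetric one):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                  # Hermitian (real symmetric) test matrix
lam, Q = np.linalg.eigh(A)

# f(A) = Q diag(f(lambda_i)) Q^H, demonstrated here with f = exp.
expA = Q @ np.diag(np.exp(lam)) @ Q.T
print(np.allclose(expA, expm(A)))             # True

# The same pattern yields the inverse when no eigenvalue is zero.
A_inv = Q @ np.diag(1.0 / lam) @ Q.T
print(np.allclose(A_inv @ A, np.eye(4)))      # True
```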

Why This Matters: Eigenvalues and MIMO Channel Capacity

Consider a narrowband MIMO channel with $n_t$ transmit and $n_r$ receive antennas, modeled as $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$, where $\mathbf{H} \in \mathbb{C}^{n_r \times n_t}$ is the channel matrix and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$.

When channel state information is available at the transmitter, the capacity is $$C = \max_{\substack{\mathbf{R}_x \succeq 0 \\ \operatorname{tr}(\mathbf{R}_x) \leq P}} \log_2 \det\!\left(\mathbf{I} + \frac{1}{\sigma^2}\mathbf{H}\mathbf{R}_x \mathbf{H}^H\right).$$ The Hermitian matrix $\mathbf{H}^H \mathbf{H}$ has eigendecomposition $\mathbf{H}^H \mathbf{H} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^H$, where $\lambda_1 \geq \cdots \geq \lambda_{n_t} \geq 0$. The optimal input covariance is $\mathbf{R}_x^{\star} = \mathbf{V} \operatorname{diag}(p_1, \ldots, p_{n_t}) \mathbf{V}^H$, and the capacity simplifies to $$C = \sum_{i=1}^{r} \log_2\!\left(1 + \frac{p_i \lambda_i}{\sigma^2}\right),$$ where $r = \operatorname{rank}(\mathbf{H})$ and the powers $\{p_i\}$ are chosen by water-filling: $p_i = \left(\mu - \frac{\sigma^2}{\lambda_i}\right)^+$ with $\mu$ set to satisfy the power constraint $\sum_i p_i = P$.

Key insight: The eigenvalues $\lambda_i$ of $\mathbf{H}^H\mathbf{H}$ determine the channel gains of the independent spatial modes. Large eigenvalues correspond to strong modes that receive more power; weak modes (small $\lambda_i$) may be shut off entirely. The eigendecomposition transforms the coupled MIMO channel into a set of parallel, independent scalar channels — the conceptual simplification that makes MIMO tractable.
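A minimal water-filling sketch implementing the formulas above; the function name, the bisection search for the water level $\mu$, and the example eigenvalues are illustrative assumptions rather than a standard API:

```python
import numpy as np

def water_filling(gains, P, sigma2=1.0):
    """Allocate total power P over eigenmode gains lambda_i via
    p_i = (mu - sigma2/lambda_i)^+ with sum_i p_i = P. A sketch
    using bisection on mu; not a production routine."""
    gains = np.asarray(gains, dtype=float)
    lo, hi = 0.0, P + sigma2 / gains.min()     # mu is bracketed in [lo, hi]
    for _ in range(100):
        mu = (lo + hi) / 2
        p = np.maximum(mu - sigma2 / gains, 0.0)
        if p.sum() > P:
            hi = mu
        else:
            lo = mu
    return p

# Hypothetical eigenmode gains of H^H H and a power budget of P = 10.
gains = np.array([4.0, 1.0, 0.25, 0.01])
p = water_filling(gains, P=10.0)
C = np.sum(np.log2(1 + p * gains))    # capacity with sigma2 = 1
print(p, p.sum(), C)                  # the weakest mode receives zero power
```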

Quick Check

Let $\mathbf{A} = \begin{pmatrix} 4 & 2j \\ -2j & 4 \end{pmatrix}$. What are the eigenvalues of $\mathbf{A}$?

$\lambda_1 = 6, \; \lambda_2 = 2$

$\lambda_1 = 4 + 2j, \; \lambda_2 = 4 - 2j$

$\lambda_1 = 4, \; \lambda_2 = 4$

$\lambda_1 = 8, \; \lambda_2 = 0$

Quick Check

A $3 \times 3$ Hermitian matrix has eigenvalues $5, 3, 1$. What is the maximum value of $R(\mathbf{A}, \mathbf{x})$ over all nonzero $\mathbf{x}$?

1

3

5

9

Quick Check

Which of the following matrices is guaranteed to be diagonalizable by a unitary matrix?

Any invertible matrix

Any upper triangular matrix

Any Hermitian matrix

Any matrix with distinct eigenvalues

Quick Check

If $\mathbf{A} \in \mathbb{C}^{4 \times 4}$ has eigenvalues $3, 1, -1, -2$, what is $\operatorname{tr}(\mathbf{A}^2)$?

3

5

15

19

Eigenvalue

A scalar $\lambda$ such that $\mathbf{A}\mathbf{v} = \lambda\mathbf{v}$ for some nonzero vector $\mathbf{v}$. Equivalently, a root of the characteristic polynomial $\det(\mathbf{A} - \lambda \mathbf{I}) = 0$. For Hermitian matrices, all eigenvalues are real.

Spectral Theorem

The result stating that every Hermitian matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ admits a decomposition $\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^H$, where $\mathbf{Q}$ is unitary and $\mathbf{\Lambda}$ is real diagonal. This guarantees the existence of an orthonormal eigenbasis with real eigenvalues.

Rayleigh Quotient

For a Hermitian matrix $\mathbf{A}$ and nonzero vector $\mathbf{x}$, the ratio $R(\mathbf{A}, \mathbf{x}) = \mathbf{x}^H \mathbf{A} \mathbf{x} / \mathbf{x}^H \mathbf{x}$. It is bounded by the extreme eigenvalues of $\mathbf{A}$ and equals an eigenvalue when $\mathbf{x}$ is a corresponding eigenvector.