Matrix Functions and Special Matrices

Beyond Scalar Functions: Applying Functions to Matrices

A matrix function f(A)f(\mathbf{A}) is not simply applying ff element-wise. The matrix exponential eAe^{\mathbf{A}}, for example, solves the ODE xห™=Ax\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} and arises in control theory, quantum mechanics, and the analysis of Markov chains. Special structured matrices (Toeplitz, circulant, Hankel) appear in signal processing because convolution is a Toeplitz matrix-vector product and can be diagonalized by the DFT.

Definition:

Matrix Exponential

The matrix exponential of AโˆˆCnร—n\mathbf{A} \in \mathbb{C}^{n \times n} is:

eA=โˆ‘k=0โˆžAkk!=I+A+A22!+A33!+โ‹ฏe^{\mathbf{A}} = \sum_{k=0}^{\infty} \frac{\mathbf{A}^k}{k!} = \mathbf{I} + \mathbf{A} + \frac{\mathbf{A}^2}{2!} + \frac{\mathbf{A}^3}{3!} + \cdots

from scipy.linalg import expm
E = expm(A)

SciPy uses the Pade approximation with scaling and squaring (the Al-Mohy & Higham algorithm), which is numerically stable.

Key property: eAeB=eA+Be^{\mathbf{A}} e^{\mathbf{B}} = e^{\mathbf{A}+\mathbf{B}} only when AB=BA\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} (i.e., A\mathbf{A} and B\mathbf{B} commute).

Do not use np.exp(A) for the matrix exponential โ€” that computes the element-wise exponential [expโก(Aij)][\exp(A_{ij})], which is completely different.

Definition:

Matrix Logarithm and Square Root

The matrix logarithm logโก(A)\log(\mathbf{A}) satisfies elogโก(A)=Ae^{\log(\mathbf{A})} = \mathbf{A} (when A\mathbf{A} has no nonpositive real eigenvalues).

The matrix square root A1/2\mathbf{A}^{1/2} satisfies A1/2A1/2=A\mathbf{A}^{1/2}\mathbf{A}^{1/2} = \mathbf{A} (for positive definite A\mathbf{A}).

from scipy.linalg import logm, sqrtm

L = logm(A)           # matrix logarithm
S = sqrtm(A)          # principal matrix square root

The matrix square root of a covariance matrix R\mathbf{R} is used to generate correlated random vectors: x=R1/2z\mathbf{x} = \mathbf{R}^{1/2}\mathbf{z} where zโˆผCN(0,I)\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}).

Definition:

Toeplitz Matrix

A Toeplitz matrix has constant diagonals: Tij=tiโˆ’jT_{ij} = t_{i-j}

T=[t0tโˆ’1tโˆ’2โ‹ฏt1t0tโˆ’1โ‹ฏt2t1t0โ‹ฏโ‹ฎโ‹ฎโ‹ฎโ‹ฑ]\mathbf{T} = \begin{bmatrix} t_0 & t_{-1} & t_{-2} & \cdots \\ t_1 & t_0 & t_{-1} & \cdots \\ t_2 & t_1 & t_0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}

from scipy.linalg import toeplitz

c = [2, 1, 0, 0]     # first column
r = [2, -1, 0, 0]    # first row
T = toeplitz(c, r)

Toeplitz matrices represent linear time-invariant (LTI) filters: the output of a discrete convolution y[n]=โˆ‘kh[k]x[nโˆ’k]y[n] = \sum_k h[k] x[n-k] can be written as y=Thx\mathbf{y} = \mathbf{T}_h \mathbf{x} where Th\mathbf{T}_h is Toeplitz with the filter coefficients.

Definition:

Circulant Matrix and the DFT

A circulant matrix is a Toeplitz matrix where each row is a cyclic shift of the previous row:

C=[c0cnโˆ’1cnโˆ’2โ‹ฏc1c1c0cnโˆ’1โ‹ฏc2c2c1c0โ‹ฏc3โ‹ฎโ‹ฑโ‹ฎcnโˆ’1cnโˆ’2cnโˆ’3โ‹ฏc0]\mathbf{C} = \begin{bmatrix} c_0 & c_{n-1} & c_{n-2} & \cdots & c_1 \\ c_1 & c_0 & c_{n-1} & \cdots & c_2 \\ c_2 & c_1 & c_0 & \cdots & c_3 \\ \vdots & & & \ddots & \vdots \\ c_{n-1} & c_{n-2} & c_{n-3} & \cdots & c_0 \end{bmatrix}

Every circulant matrix is diagonalized by the DFT matrix: C=FHdiag(Fc)F\mathbf{C} = \mathbf{F}^H \mathrm{diag}(\mathbf{F}\mathbf{c}) \mathbf{F}

This is why circular convolution can be computed via FFT in O(nlogโกn)O(n \log n) instead of O(n2)O(n^2).

from scipy.linalg import circulant
C = circulant([1, 2, 3, 4])

Definition:

DFT Matrix

The nร—nn \times n DFT matrix F\mathbf{F} has entries:

Fkl=1neโˆ’j2ฯ€kl/n,k,l=0,โ€ฆ,nโˆ’1F_{kl} = \frac{1}{\sqrt{n}} e^{-j 2\pi kl/n}, \quad k, l = 0, \ldots, n-1

The DFT matrix is unitary: FFH=I\mathbf{F}\mathbf{F}^H = \mathbf{I}.

from scipy.linalg import dft

F = dft(n, scale='sqrtn')   # unitary DFT matrix
# Or construct manually:
k = np.arange(n)
F_manual = np.exp(-1j * 2 * np.pi * np.outer(k, k) / n) / np.sqrt(n)

In practice, you never form F\mathbf{F} explicitly โ€” the FFT computes Fx\mathbf{F}\mathbf{x} in O(nlogโกn)O(n \log n) without the matrix:

X = np.fft.fft(x) / np.sqrt(n)   # equivalent to F @ x

OFDM modulation is literally the inverse DFT: xtime=FHxfreq\mathbf{x}_{\mathrm{time}} = \mathbf{F}^H \mathbf{x}_{\mathrm{freq}}, computed via IFFT.

Theorem: Circulant Matrices Are Diagonalized by the DFT

Any circulant matrix C\mathbf{C} with first column c\mathbf{c} can be written as: C=FHdiag(nโ€‰Fc)โ€‰F\mathbf{C} = \mathbf{F}^H \mathrm{diag}(\sqrt{n}\,\mathbf{F}\mathbf{c})\,\mathbf{F}

Equivalently, the eigenvalues of C\mathbf{C} are ฮปk=nโ€‰(Fc)k\lambda_k = \sqrt{n}\,(\mathbf{F}\mathbf{c})_k and the eigenvectors are the columns of FH\mathbf{F}^H.

Circular convolution in the time domain equals pointwise multiplication in the frequency domain. The DFT diagonalizes circular convolution. This is the mathematical foundation of OFDM: the cyclic prefix makes the channel convolution circular, so the channel can be inverted by per-subcarrier scalar division.

Theorem: Properties of the Matrix Exponential

For A,BโˆˆCnร—n\mathbf{A}, \mathbf{B} \in \mathbb{C}^{n \times n}:

  1. e0=Ie^{\mathbf{0}} = \mathbf{I}
  2. (eA)โˆ’1=eโˆ’A(e^{\mathbf{A}})^{-1} = e^{-\mathbf{A}}
  3. e(ฮฑ+ฮฒ)A=eฮฑAโ€‰eฮฒAe^{(\alpha + \beta)\mathbf{A}} = e^{\alpha\mathbf{A}}\,e^{\beta\mathbf{A}}
  4. If A=Vฮ›Vโˆ’1\mathbf{A} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}, then eA=Vโ€‰eฮ›โ€‰Vโˆ’1e^{\mathbf{A}} = \mathbf{V}\,e^{\boldsymbol{\Lambda}}\,\mathbf{V}^{-1}
  5. detโก(eA)=etr(A)\det(e^{\mathbf{A}}) = e^{\mathrm{tr}(\mathbf{A})}
  6. eAeB=eA+Be^{\mathbf{A}} e^{\mathbf{B}} = e^{\mathbf{A}+\mathbf{B}} only if AB=BA\mathbf{AB} = \mathbf{BA}

Example: State Transition via Matrix Exponential

A continuous-time linear system has state equation xห™=Ax\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} with A=[01โˆ’2โˆ’3]\mathbf{A} = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}. Compute the state transition matrix ฮฆ(t)=eAt\Phi(t) = e^{\mathbf{A}t} at t=0,0.5,1.0,2.0t = 0, 0.5, 1.0, 2.0 and verify that ฮฆ(0)=I\Phi(0) = \mathbf{I}.

Example: Convolution as Toeplitz Matrix-Vector Product

Show that filtering a signal x\mathbf{x} with filter h\mathbf{h} is equivalent to multiplying by a Toeplitz matrix Th\mathbf{T}_h.

Example: Circular Convolution via DFT Diagonalization

Verify that multiplying by a circulant matrix is equivalent to circular convolution, which can be computed via FFT.

Toeplitz matrix

A matrix with constant diagonals: Tij=tiโˆ’jT_{ij} = t_{i-j}. Represents linear time-invariant filtering (convolution).

Related: circulant matrix

circulant matrix

A Toeplitz matrix where each row is a cyclic shift of the previous. Diagonalized by the DFT matrix. Represents circular convolution.

Related: Toeplitz matrix

matrix exponential

The function eA=โˆ‘k=0โˆžAk/k!e^{\mathbf{A}} = \sum_{k=0}^\infty \mathbf{A}^k/k!. Solves the linear ODE xห™=Ax\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} and is computed via Pade approximation in scipy.linalg.expm.

Common Mistake: np.exp vs scipy.linalg.expm

Mistake:

Using np.exp(A) thinking it computes the matrix exponential eAe^{\mathbf{A}}. np.exp applies the exponential element-wise, which is mathematically different.

Correction:

Use scipy.linalg.expm(A) for the matrix exponential. The two are equal only for diagonal matrices.

Quick Check

What diagonalizes every circulant matrix?

The DFT matrix

The identity matrix

The Hadamard matrix

Its own eigenvector matrix (varies per matrix)

Key Takeaway

Exploit matrix structure. Toeplitz โ†’\to FFT-based fast multiply. Circulant โ†’\to DFT diagonalization. Knowing the structure of your matrix is often more valuable than knowing the latest algorithm. In OFDM, the cyclic prefix makes the channel Toeplitz structure effectively circulant, enabling per-subcarrier equalization.

Matrix Functions and Special Matrices

python
Matrix exponential, logarithm, square root, Toeplitz, circulant, Hankel, and DFT matrix construction with practical examples.
# Code from: ch06/python/matrix_functions.py
# Load from backend supplements endpoint