Chapter Summary
Key Points
- 1. Always solve, never invert. Use `np.linalg.solve(A, b)` instead of `np.linalg.inv(A) @ b`. The solve approach is 3x faster, uses half the memory, and produces smaller numerical errors. Use `slogdet` for determinants of large matrices to avoid overflow.
- 2. Match the decomposition to the matrix structure. Use `eigh` for Hermitian matrices (covariance, correlation, Gram matrices); it is 3x faster than `eig` and guarantees real eigenvalues. Use the SVD for any matrix when you need rank, condition number, pseudoinverse, or low-rank approximation. The Eckart-Young theorem guarantees that the truncated SVD gives the optimal low-rank approximation.
- 3. Build sparse in COO, compute in CSR. For matrices where most entries are zero, sparse formats reduce memory from O(n²) to O(nnz). Use `scipy.sparse.diags` for banded matrices, `spsolve` for sparse linear systems, and `eigsh`/`svds` for finding a few eigenvalues of large sparse matrices. Never convert a sparse matrix to dense just to use NumPy functions.
- 4. Never form the full Kronecker product. The identity (Bᵀ ⊗ A) vec(X) = vec(AXB) converts an O(n⁴) operation into two ordinary O(n³) matrix multiplies. This pattern appears in the Kronecker MIMO channel model, 2D filtering, and multidimensional transforms.
- 5. Exploit matrix structure for speed. Toeplitz matrices (convolution) can be applied in O(n log n) via the FFT. Circulant matrices are diagonalized by the DFT, the mathematical foundation of OFDM. Use `scipy.linalg.expm` (not `np.exp`) for the matrix exponential; distinguish element-wise operations from matrix functions.
- 6. Least squares is the bridge between linear algebra and estimation theory. Use `np.linalg.lstsq` as the default for overdetermined systems. Add Tikhonov regularization (a λ‖x‖² penalty) for noisy, ill-conditioned problems; this is exactly the MMSE estimator. Use QR factorization instead of the normal equations for numerical stability. Use total least squares when both sides of the equation are noisy.

Short code sketches illustrating each of these points follow the list.
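A minimal sketch of point 1, using an arbitrary random system (the size, seed, and conditioning trick are illustrative choices, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)   # arbitrary, well-conditioned test matrix
b = rng.standard_normal(n)

x_solve = np.linalg.solve(A, b)      # preferred: factorize once, back-substitute
x_inv = np.linalg.inv(A) @ b         # avoid: explicit inverse is slower and less accurate
print(np.allclose(x_solve, x_inv))   # same answer here, but solve has the smaller error in general

# slogdet returns (sign, log|det|), avoiding the overflow that det hits for large matrices
sign, logabsdet = np.linalg.slogdet(A)
print(sign, logabsdet)
```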
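For point 2, a short sketch contrasting `eigh` on a covariance-style matrix with the SVD used for rank, conditioning, and Eckart-Young truncation (the data is synthetic and the rank-k cutoff is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
C = (X.T @ X) / X.shape[0]                    # Hermitian (Gram/covariance-style) matrix

w, V = np.linalg.eigh(C)                      # real eigenvalues in ascending order; exploits symmetry

U, s, Vt = np.linalg.svd(X, full_matrices=False)
rank = int(np.sum(s > s[0] * max(X.shape) * np.finfo(float).eps))   # numerical rank
cond = s[0] / s[-1]                                                  # condition number

k = 5
X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]          # best rank-k approximation (Eckart-Young)
```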
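For point 3, a sketch of the COO-to-CSR workflow and the sparse solvers named above; the tridiagonal matrix and its size are arbitrary examples:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve, eigsh

# Build from (row, col, value) triplets in COO, then convert to CSR for computation
rows = np.array([0, 1, 2, 2])
cols = np.array([0, 1, 0, 2])
vals = np.array([1.0, 2.0, 3.0, 4.0])
M = sp.coo_matrix((vals, (rows, cols)), shape=(3, 3)).tocsr()

# Banded matrices go straight to diags; here a tridiagonal system of size 10_000
n = 10_000
A = sp.diags([-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1)],
             offsets=[-1, 0, 1], format="csr")
x = spsolve(A, np.ones(n))                      # sparse direct solve, no dense conversion

# A few extreme eigenvalues of a large symmetric sparse matrix
w = eigsh(A, k=4, which="LA", return_eigenvectors=False)
```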
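For point 4, a sketch that checks the vec/Kronecker identity numerically; the matrix dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p, q = 4, 5, 5, 3
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, q))

def vec(M):
    """Column-stacking vectorization, the convention used in the Kronecker identity."""
    return M.reshape(-1, order="F")

y_kron = np.kron(B.T, A) @ vec(X)   # naive: forms the full Kronecker product
y_fast = vec(A @ X @ B)             # structured: two ordinary matrix multiplies

print(np.allclose(y_kron, y_fast))  # True
```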
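For point 5, a sketch showing that a circulant matrix-vector product equals an FFT-domain multiplication, and that `expm` differs from element-wise `np.exp` (sizes and seeds are arbitrary):

```python
import numpy as np
from scipy.linalg import circulant, expm

rng = np.random.default_rng(3)
n = 8
c = rng.standard_normal(n)                 # first column defines the circulant matrix
x = rng.standard_normal(n)

y_dense = circulant(c) @ x                                 # O(n^2) dense product
y_fft = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real   # O(n log n): the DFT diagonalizes circulants
print(np.allclose(y_dense, y_fft))                         # True

A = rng.standard_normal((3, 3))
print(np.allclose(expm(A), np.exp(A)))     # False: matrix exponential != element-wise exponential
```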
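For point 6, a sketch of `lstsq`, a Tikhonov/ridge solution posed as an augmented system (so the normal equations are never formed), and the QR route; the noise level and λ are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 100, 10
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.1 * rng.standard_normal(m)

# Default overdetermined solve
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Tikhonov / ridge: minimize ||Ax - b||^2 + lam * ||x||^2 via an augmented least-squares problem
lam = 0.5
A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
b_aug = np.concatenate([b, np.zeros(n)])
x_ridge, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

# QR factorization: solve R x = Q^T b instead of the normal equations A^T A x = A^T b
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)
print(np.allclose(x_ls, x_qr))             # same unregularized solution, computed more stably
```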
Looking Ahead
Chapter 7 applies these linear algebra tools to optimization: gradient descent, Newton's method, and convex optimization with SciPy's `minimize` interface. The connection is direct: the Hessian matrix in Newton's method requires solving a linear system at every step, and the condition number of the Hessian determines convergence speed. Regularization from Section 6.6 reappears as penalty terms in regularized optimization.