Least Squares and Pseudoinverse
When There Is No Exact Solution
Most real-world systems are overdetermined ($m > n$): more equations than unknowns. The channel estimation equation $y = Xh + n$ typically has more pilot observations than channel taps. Least squares finds the $x$ that minimizes $\|Ax - b\|_2$, the foundation of estimation theory in signal processing.
Definition: Ordinary Least Squares (OLS)
Given an overdetermined system $Ax = b$ with $A \in \mathbb{C}^{m \times n}$, $m > n$, the least-squares solution minimizes the residual:

$$\hat{x} = \arg\min_{x} \|Ax - b\|_2^2$$

The solution satisfies the normal equations:

$$A^H A \hat{x} = A^H b$$
x_hat, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
lstsq returns:
- x_hat: the least-squares solution
- residuals: $\|b - A\hat{x}\|_2^2$ (if $m > n$ and $\operatorname{rank}(A) = n$)
- rank: effective rank of $A$
- sv: singular values of $A$
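A minimal sketch of these return values on a small random system (the matrix and data here are assumed purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # overdetermined: m = 8 > n = 3
b = rng.standard_normal(8)

x_hat, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# residuals holds the squared residual norm when m > n and rank == n
print(np.allclose(residuals, np.linalg.norm(b - A @ x_hat)**2))  # True
print(rank)  # 3: A has full column rank
print(sv)    # singular values of A, in descending order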
Definition: Moore-Penrose Pseudoinverse
The Moore-Penrose pseudoinverse $A^+$ generalizes the inverse to rectangular and rank-deficient matrices. If $A = U \Sigma V^H$, then:

$$A^+ = V \Sigma^+ U^H$$

where $\sigma_i^+ = 1/\sigma_i$ for $\sigma_i > 0$ and $\sigma_i^+ = 0$ otherwise.
A_pinv = np.linalg.pinv(A)
x_hat = A_pinv @ b # equivalent to lstsq
For full-rank $A$ (both identities are verified numerically in the sketch below):
- $m > n$: $A^+ = (A^H A)^{-1} A^H$ (left inverse)
- $m < n$: $A^+ = A^H (A A^H)^{-1}$ (right inverse)
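A quick numerical check of the left-inverse identity, using an assumed random tall matrix (full column rank with probability one):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 4))           # tall: m = 10 > n = 4

left_inv = np.linalg.inv(A.T @ A) @ A.T    # (A^H A)^{-1} A^H (real case)
print(np.allclose(np.linalg.pinv(A), left_inv))  # True
print(np.allclose(left_inv @ A, np.eye(4)))      # A+ A = I when m > n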
Definition: Tikhonov (Ridge) Regularization
When $A$ is ill-conditioned, OLS amplifies noise. Tikhonov regularization adds a penalty:

$$\hat{x}_\alpha = \arg\min_{x} \left( \|Ax - b\|_2^2 + \alpha \|x\|_2^2 \right)$$

The closed-form solution is:

$$\hat{x}_\alpha = (A^H A + \alpha I)^{-1} A^H b$$
alpha = 0.1
x_tik = np.linalg.solve(A.conj().T @ A + alpha * np.eye(n),
                        A.conj().T @ b)
This is exactly the MMSE estimator in wireless: with $\alpha = \sigma_n^2$ (the noise variance), Tikhonov regularization gives the minimum mean-squared error (MMSE) estimate.
In machine learning, this is called ridge regression. The parameter $\alpha$ trades bias for variance.
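One way to see how $\alpha$ trades bias for variance is through the SVD: Tikhonov shrinks the component along each singular direction by the filter factor $\sigma_i/(\sigma_i^2 + \alpha)$, instead of OLS's $1/\sigma_i$. A sketch (random test problem assumed) verifying the closed form against this spectral view:

import numpy as np

rng = np.random.default_rng(2)
m, n, alpha = 20, 5, 0.1
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Closed form: (A^H A + alpha I)^{-1} A^H b
x_tik = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)

# Spectral form: sum_i  sigma_i / (sigma_i^2 + alpha) * (u_i^H b) v_i
U, s, Vh = np.linalg.svd(A, full_matrices=False)
x_svd = Vh.T @ ((s / (s**2 + alpha)) * (U.T @ b))
print(np.allclose(x_tik, x_svd))  # True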
Definition: QR Factorization for Least Squares
The QR factorization $A = QR$, where $Q$ is unitary and $R$ is upper triangular, provides a numerically stable way to solve least squares:

$$R \hat{x} = Q^H b$$
from scipy.linalg import qr, solve_triangular
Q, R = qr(A, mode='economic') # reduced QR
x_qr = solve_triangular(R, Q.conj().T @ b)
QR is more numerically stable than the normal equations because it avoids forming $A^H A$ (which squares the condition number).
Definition: Total Least Squares (TLS)
When both $A$ and $b$ are noisy, ordinary least squares is biased. Total least squares minimizes:

$$\min_{\Delta A,\,\Delta b} \big\| [\Delta A \;\; \Delta b] \big\|_F \quad \text{subject to} \quad (A + \Delta A)\,x = b + \Delta b$$

The solution uses the SVD of the augmented matrix $[A \;\; b]$:
C = np.column_stack([A, b])          # augmented matrix [A b]
U, s, Vh = np.linalg.svd(C, full_matrices=False)
V = Vh.conj().T
x_tls = -V[:n, -1] / V[-1, -1]       # rescale the last right-singular vector
Theorem: Normal Equations and the Least-Squares Solution
The least-squares solution of $Ax = b$ satisfies:

$$A^H A \hat{x} = A^H b$$

When $A$ has full column rank ($\operatorname{rank}(A) = n$), the solution is unique:

$$\hat{x} = (A^H A)^{-1} A^H b$$
Derivation
Expand the objective $\|Ax - b\|_2^2 = x^H A^H A x - 2\operatorname{Re}(x^H A^H b) + \|b\|_2^2$. Setting the gradient to zero: $2A^H A x - 2A^H b = 0$, which gives the normal equations.
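A finite-difference sanity check of this derivation on an assumed random problem: the gradient of the objective should vanish at the lstsq solution.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((12, 4))
b = rng.standard_normal(12)
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]

f = lambda x: np.linalg.norm(A @ x - b)**2
eps = 1e-6
grad = np.array([(f(x_hat + eps*e) - f(x_hat - eps*e)) / (2*eps)
                 for e in np.eye(4)])
print(np.max(np.abs(grad)))  # ~1e-9: the gradient vanishes at the minimizer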
Theorem: QR Is More Stable Than Normal Equations
Solving least squares via the normal equations squares the condition number: $\kappa(A^H A) = \kappa(A)^2$. The QR-based approach operates with $\kappa(A)$, preserving roughly twice as many digits of accuracy.
Forming $A^H A$ explicitly loses information through cancellation. QR factorization avoids this by working directly with the orthogonal factor $Q$.
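The squaring is easy to observe directly. A sketch using an assumed test matrix with condition number about $10^5$ (the same construction as the QR example later in this section):

import numpy as np

rng = np.random.default_rng(4)
U, _, Vh = np.linalg.svd(rng.standard_normal((50, 5)), full_matrices=False)
A = (U * np.logspace(0, -5, 5)) @ Vh        # kappa(A) ~ 1e5 by construction

print(f"cond(A)     = {np.linalg.cond(A):.2e}")        # ~1e5
print(f"cond(A^T A) = {np.linalg.cond(A.T @ A):.2e}")  # ~1e10: squared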
Example: Least-Squares Channel Estimation
Estimate a wireless channel with $L = 5$ taps from $M = 20$ pilot observations using least squares. Compare OLS with Tikhonov regularization at different noise levels.
Setup and estimate
import numpy as np

L = 5   # channel taps
M = 20  # pilot symbols
np.random.seed(42)

# True channel
h_true = np.array([0.8, 0.5+0.3j, -0.2+0.1j, 0.1, -0.05])

# Pilot matrix (random QPSK)
X = (np.random.choice([-1, 1], (M, L))
     + 1j * np.random.choice([-1, 1], (M, L))) / np.sqrt(2)

for snr_db in [5, 15, 30]:
    snr_lin = 10 ** (snr_db / 10)
    noise_var = np.linalg.norm(h_true)**2 / snr_lin
    n = np.sqrt(noise_var/2) * (np.random.randn(M) + 1j*np.random.randn(M))
    y = X @ h_true + n

    # OLS
    h_ols = np.linalg.lstsq(X, y, rcond=None)[0]

    # Tikhonov (MMSE)
    alpha = noise_var
    h_tik = np.linalg.solve(X.conj().T @ X + alpha * np.eye(L),
                            X.conj().T @ y)

    print(f"SNR={snr_db:2d} dB: "
          f"OLS MSE={np.mean(np.abs(h_ols-h_true)**2):.2e}, "
          f"Tik MSE={np.mean(np.abs(h_tik-h_true)**2):.2e}")
Interpretation
At low SNR, Tikhonov (MMSE) significantly outperforms OLS because it regularizes against noise amplification. At high SNR, the two estimates converge: the regularization $\alpha = \sigma_n^2$ shrinks toward zero, so its bias becomes negligible.
Example: Least Squares via QR Factorization
Solve an overdetermined system using QR factorization and compare the accuracy with the normal equations approach for an ill-conditioned matrix.
QR vs Normal Equations
import numpy as np
from scipy.linalg import qr, solve_triangular
m, n = 100, 10
np.random.seed(42)
# Create ill-conditioned A (condition number ~1e8)
U, _, Vh = np.linalg.svd(np.random.randn(m, n), full_matrices=False)
s = np.logspace(0, -8, n) # singular values from 1 to 1e-8
A = U * s @ Vh
x_true = np.random.randn(n)
b = A @ x_true
# Method 1: Normal equations
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
# Method 2: QR factorization
Q, R = qr(A, mode='economic')
x_qr = solve_triangular(R, Q.T @ b)
# Method 3: lstsq (uses SVD internally)
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(f"Condition number: {np.linalg.cond(A):.2e}")
print(f"Normal eq error: {np.linalg.norm(x_normal - x_true):.2e}")
print(f"QR error: {np.linalg.norm(x_qr - x_true):.2e}")
print(f"lstsq error: {np.linalg.norm(x_lstsq - x_true):.2e}")
Example: Total Least Squares for Errors-in-Variables
Compare OLS and TLS when both the matrix $A$ and the observation $b$ are noisy.
OLS vs TLS
import numpy as np
m, n = 50, 3
np.random.seed(42)
A_true = np.random.randn(m, n)
x_true = np.array([1.0, -2.0, 0.5])
b_true = A_true @ x_true
noise_level = 0.1
A_noisy = A_true + noise_level * np.random.randn(m, n)
b_noisy = b_true + noise_level * np.random.randn(m)
# OLS (ignores noise in A)
x_ols = np.linalg.lstsq(A_noisy, b_noisy, rcond=None)[0]
# TLS (accounts for noise in both A and b)
C = np.column_stack([A_noisy, b_noisy])
_, _, Vh = np.linalg.svd(C, full_matrices=False)
V = Vh.conj().T
x_tls = -V[:n, -1] / V[-1, -1]
print(f"True x: {x_true}")
print(f"OLS x: {x_ols}")
print(f"TLS x: {x_tls}")
print(f"OLS error: {np.linalg.norm(x_ols - x_true):.4f}")
print(f"TLS error: {np.linalg.norm(x_tls - x_true):.4f}")
Interactive demo: Least Squares, OLS vs Tikhonov Regularization. Compare ordinary least squares with Tikhonov regularization, adjusting the noise level and regularization parameter to see how they affect the estimation error.
Least Squares Methods Compared
| Method | Solves | Stability | Cost | When to Use |
|---|---|---|---|---|
| Normal Equations | $A^H A x = A^H b$ | Squares $\kappa(A)$ | $\approx mn^2 + n^3/3$ | Well-conditioned, small $n$ |
| QR Factorization | $R x = Q^H b$ | Preserves $\kappa(A)$ | $\approx 2mn^2 - 2n^3/3$ | Default choice for dense $A$ |
| SVD (lstsq) | $x = A^+ b$ | Best stability | Several $\times\, mn^2$ | Rank-deficient systems |
| Tikhonov | $(A^H A + \alpha I)x = A^H b$ | Regularized | $\approx mn^2 + n^3/3$ | Ill-conditioned, noisy |
| Total LS | SVD of $[A \;\; b]$ | Best when both noisy | One SVD of $m \times (n+1)$ | Errors in variables |
least squares
The problem of finding $x$ that minimizes $\|Ax - b\|_2$. The solution is $\hat{x} = A^+ b$.
Related: Moore-Penrose pseudoinverse
Moore-Penrose pseudoinverse
The unique matrix $A^+$ satisfying the four Penrose conditions (checked numerically below). Generalizes the inverse to rectangular and rank-deficient matrices. Computed via SVD.
Related: least squares
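For reference, the four Penrose conditions are $AA^+A = A$, $A^+AA^+ = A^+$, $(AA^+)^H = AA^+$, and $(A^+A)^H = A^+A$. A sketch checking them on an assumed random rectangular matrix:

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))
P = np.linalg.pinv(A)

print(np.allclose(A @ P @ A, A))      # 1) A A+ A = A
print(np.allclose(P @ A @ P, P))      # 2) A+ A A+ = A+
print(np.allclose((A @ P).T, A @ P))  # 3) A A+ is Hermitian (symmetric here)
print(np.allclose((P @ A).T, P @ A))  # 4) A+ A is Hermitian (symmetric here)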
Common Mistake: Normal Equations for Ill-Conditioned Systems
Mistake:
Solving least squares via x = np.linalg.solve(A.T @ A, A.T @ b) for an ill-conditioned matrix, which squares the condition number.
Correction:
Use np.linalg.lstsq(A, b) or QR factorization. The lstsq function uses SVD internally and handles rank deficiency gracefully.
Quick Check
In wireless communications, Tikhonov regularization with $\alpha = \sigma_n^2$ corresponds to which estimator?
MMSE (Minimum Mean Squared Error)
Zero-Forcing (ZF)
Maximum Likelihood (ML)
Matched Filter
The MMSE estimator adds noise variance as regularization, trading bias for variance reduction.
Key Takeaway
Use np.linalg.lstsq as your default solver for overdetermined systems. For ill-conditioned problems, add Tikhonov regularization. For errors-in-variables, use total least squares. Never form the normal equations directly unless the system is well-conditioned and speed is critical.
Why This Matters: MMSE Channel Estimation Is Tikhonov Regularization
The MMSE channel estimate is:

$$\hat{h}_{\text{MMSE}} = \left( X^H X + \sigma_n^2 I \right)^{-1} X^H y$$

This is exactly Tikhonov regularization with $\alpha = \sigma_n^2$. At high SNR ($\sigma_n^2 \to 0$), MMSE $\to$ ZF (OLS). At low SNR, the regularization prevents noise amplification, giving superior estimation accuracy.
Least Squares Methods
# Code from: ch06/python/least_squares.py