Root Finding and Fixed-Point Iteration

Root Finding: Where Optimization Meets Equations

Finding where $f(\mathbf{x}) = \mathbf{0}$ is closely related to optimization: the first-order optimality condition $\nabla f = \mathbf{0}$ is itself a root-finding problem. SciPy provides fsolve, root, and brentq for different scenarios. Fixed-point iteration $\mathbf{x}_{k+1} = g(\mathbf{x}_k)$ is a complementary framework that appears in iterative algorithms throughout scientific computing.

Definition:

Root-Finding Problem

Given $f : \mathbb{R}^n \to \mathbb{R}^n$, find $\mathbf{x}^\star$ such that:

$$f(\mathbf{x}^\star) = \mathbf{0}$$

For $n = 1$, this is finding the zeros of a scalar function. For $n > 1$, this is solving a system of nonlinear equations.

from scipy.optimize import fsolve, root, brentq

# Scalar: brentq (bracketing, guaranteed for continuous f)
x_star = brentq(f, a, b)  # f(a) and f(b) must have opposite signs

# System: fsolve (wraps MINPACK's modified Powell hybrid method)
x_star = fsolve(func, x0)

# System: root (multiple methods)
result = root(func, x0, method='hybr')

Definition:

Bisection Method

For a continuous $f : \mathbb{R} \to \mathbb{R}$ with $f(a) \cdot f(b) < 0$ (sign change), the bisection method repeatedly halves the interval:

  1. Compute $c = (a + b) / 2$
  2. If $f(a) \cdot f(c) < 0$, set $b = c$; else set $a = c$
  3. Repeat until $|b - a| < \epsilon$

Convergence is linear with rate $1/2$: the error bound halves each iteration. After $k$ iterations, $|x^\star - c_k| \le (b - a)/2^{k+1}$.
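
For reference, a minimal implementation of the loop above (the tolerance, iteration cap, and test function are illustrative choices, not part of the definition):

def bisect_sketch(f, a, b, tol=1e-12, max_iter=200):
    """Find a root of f in [a, b] by repeated halving; assumes f(a)*f(b) < 0 and a < b."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = 0.5 * (a + b)
        if f(a) * f(c) < 0:
            b = c          # root lies in [a, c]
        else:
            a = c          # root lies in [c, b]
        if b - a < tol:
            break
    return 0.5 * (a + b)

# Example: root of x^3 - 2 on [0, 2] is 2**(1/3) ~ 1.259921
print(bisect_sketch(lambda x: x**3 - 2, 0.0, 2.0))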

from scipy.optimize import brentq

# brentq combines bisection with inverse quadratic interpolation
x = brentq(f, a, b, xtol=1e-12)

brentq is guaranteed to converge for a continuous function on a bracketing interval where the sign changes, and is the most robust scalar root-finder in SciPy.

Bisection needs a bracket $[a, b]$ where $f$ changes sign. SciPy's bracketing solvers (brentq, bisect, ridder) all require you to supply such an interval; a simple way to find one is to scan a grid of points for a sign change, as sketched below.
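
A minimal bracket-scanning helper (find_bracket is a hypothetical name, not a SciPy function; the grid resolution is an arbitrary choice):

import numpy as np
from scipy.optimize import brentq

def find_bracket(f, lo, hi, n=100):
    """Scan [lo, hi] on a uniform grid and return the first subinterval with a sign change."""
    xs = np.linspace(lo, hi, n + 1)
    fs = np.array([f(x) for x in xs])
    for i in range(n):
        if fs[i] * fs[i + 1] < 0:
            return xs[i], xs[i + 1]
    raise ValueError("no sign change found on the grid")

f = lambda x: np.cos(x) - x        # single root: the Dottie number
a, b = find_bracket(f, -10, 10)
print(brentq(f, a, b))             # ~0.7390851332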

Definition:

Newton's Method for Root Finding

For solving $f(x) = 0$, Newton's method linearizes $f$ at $x_k$:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$

For systems $\mathbf{f}(\mathbf{x}) = \mathbf{0}$:

$$\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{J}(\mathbf{x}_k)^{-1} \mathbf{f}(\mathbf{x}_k)$$

where $\mathbf{J}$ is the Jacobian matrix $J_{ij} = \partial f_i / \partial x_j$.

Convergence is quadratic near the root (digits of accuracy double each iteration), but the method can diverge from poor starting points.

from scipy.optimize import newton, root

# Scalar with derivative
x = newton(f, x0, fprime=df)

# System: use root with method='hybr' (Powell's hybrid)
res = root(func, x0, jac=jacobian, method='hybr')

Definition:

Fixed-Point Iteration

A fixed point of $g : \mathbb{R}^n \to \mathbb{R}^n$ is a point $\mathbf{x}^\star$ where $g(\mathbf{x}^\star) = \mathbf{x}^\star$.

The iteration $\mathbf{x}_{k+1} = g(\mathbf{x}_k)$ converges to $\mathbf{x}^\star$ if $g$ is a contraction: $\|g(\mathbf{x}) - g(\mathbf{y})\| \le L\|\mathbf{x} - \mathbf{y}\|$ with $L < 1$.

import numpy as np
from scipy.optimize import fixed_point

# Solve x = cos(x)
x_star = fixed_point(np.cos, 0.5)
# x_star β‰ˆ 0.7390851332 (Dottie number)

Root finding $f(\mathbf{x}) = \mathbf{0}$ is equivalent to finding a fixed point of $g(\mathbf{x}) = \mathbf{x} - \alpha f(\mathbf{x})$ for suitable $\alpha$ (chosen so that $g$ is a contraction near the root).
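
A quick sketch of this reformulation, using $f(x) = x^3 - 2$ and a damping value $\alpha = 0.2$ picked by hand so that $g$ is a contraction near the root:

from scipy.optimize import fixed_point, brentq

# Root of f(x) = x^3 - 2 recast as the fixed point of g(x) = x - alpha*f(x)
f = lambda x: x**3 - 2
alpha = 0.2                        # small enough that |g'(x)| < 1 near the root
g = lambda x: x - alpha * f(x)

print(fixed_point(g, 1.0))         # ~1.259921 = 2**(1/3)
print(brentq(f, 0, 2))             # same root, for comparison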

Definition:

Anderson Acceleration (Anderson Mixing)

Anderson acceleration speeds up fixed-point iteration by combining the last $m$ iterates:

$$\mathbf{x}_{k+1} = (1-\beta)\sum_{i=0}^{m_k} \theta_i \mathbf{x}_{k-i} + \beta \sum_{i=0}^{m_k} \theta_i \, g(\mathbf{x}_{k-i})$$

where the weights $\theta_i$ are chosen to minimize the residual norm in a least-squares sense.

from scipy.optimize import root

# Anderson acceleration via root
def residual(x):
    return g(x) - x  # fixed point <=> root of residual

res = root(residual, x0, method='anderson')

Anderson acceleration is particularly effective for self-consistent field (SCF) iterations in electronic structure calculations and for ADMM in optimization.

Theorem: Banach Fixed-Point Theorem (Contraction Mapping)

If $g : D \to D$ is a contraction on a complete metric space $D$ with Lipschitz constant $L < 1$:

$$\|g(\mathbf{x}) - g(\mathbf{y})\| \le L \|\mathbf{x} - \mathbf{y}\| \quad \forall \mathbf{x}, \mathbf{y} \in D$$

then $g$ has a unique fixed point $\mathbf{x}^\star \in D$, and the iteration $\mathbf{x}_{k+1} = g(\mathbf{x}_k)$ converges to $\mathbf{x}^\star$ from any starting point $\mathbf{x}_0 \in D$ with:

$$\|\mathbf{x}_k - \mathbf{x}^\star\| \le \frac{L^k}{1-L}\|\mathbf{x}_1 - \mathbf{x}_0\|$$

A contraction shrinks distances. Repeated application of a shrinking map must converge to a unique fixed point β€” like repeatedly photocopying a photocopy, which converges to a single gray blob regardless of the original image.
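
A quick numerical check of the a priori bound (a sketch: here $g = \cos$ maps $[0, 1]$ into itself, and $L = \sin(1) \approx 0.84$ is a valid Lipschitz constant since $|g'(x)| = |\sin x| \le \sin 1$ on $[0, 1]$):

import numpy as np

g = np.cos
L = np.sin(1.0)                    # Lipschitz constant of cos on [0, 1]
x0 = 0.5
x_star = 0.7390851332151607        # Dottie number, the fixed point of cos

x = x0
for k in range(1, 11):
    x = g(x)                                   # x is now x_k
    bound = L**k / (1 - L) * abs(g(x0) - x0)   # Banach a priori bound
    print(f"k={k:2d}  error={abs(x - x_star):.2e}  bound={bound:.2e}")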

Theorem: Local Quadratic Convergence of Newton's Method

If $f$ is twice continuously differentiable, $f(x^\star) = 0$, $f'(x^\star) \neq 0$, and $x_0$ is sufficiently close to $x^\star$, then Newton's method converges quadratically:

$$|x_{k+1} - x^\star| \le C |x_k - x^\star|^2$$

where $C = |f''(x^\star)| / (2|f'(x^\star)|)$. The number of correct digits approximately doubles each iteration.

Newton's method replaces $f$ by its tangent line and finds where the tangent crosses zero. Near the root, the tangent is an excellent approximation, so each step roughly squares the error.
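
The digit-doubling behaviour is easy to observe numerically; the test function $f(x) = x^2 - 2$ (root $\sqrt{2}$) and the starting point are illustrative choices:

import numpy as np

f = lambda x: x**2 - 2
df = lambda x: 2 * x
x_star = np.sqrt(2)

x = 1.5                                        # starting point near the root
for k in range(6):
    print(f"iter {k}: error = {abs(x - x_star):.3e}")
    x = x - f(x) / df(x)                       # Newton step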

Example: Solving a Nonlinear System with fsolve

Find the intersection of $x^2 + y^2 = 4$ and $xy = 1$ using scipy.optimize.fsolve.
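
A possible setup for this example (the starting point is a guess; the system has four symmetric solutions, and which one is returned depends on it):

import numpy as np
from scipy.optimize import fsolve

def func(v):
    x, y = v
    return [x**2 + y**2 - 4,   # circle of radius 2
            x * y - 1]         # hyperbola xy = 1

x_star, info, ier, msg = fsolve(func, x0=[2.0, 0.5], full_output=True)
print(x_star, "converged:", ier == 1)   # ~[1.932, 0.518]
print("residual:", func(x_star))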

Example: Fixed-Point Iteration for Square Root

Compute $\sqrt{a}$ using the fixed-point iteration $x_{k+1} = \frac{1}{2}(x_k + a/x_k)$ (Babylonian method / Heron's method).
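
A direct implementation of the iteration (tolerance, starting guess, and function name are illustrative choices); note that this is exactly Newton's method applied to $f(x) = x^2 - a$:

import numpy as np

def babylonian_sqrt(a, x0=1.0, tol=1e-14, max_iter=100):
    """Compute sqrt(a) via the fixed-point iteration x <- (x + a/x) / 2."""
    x = x0
    for _ in range(max_iter):
        x_new = 0.5 * (x + a / x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(babylonian_sqrt(2.0), np.sqrt(2.0))   # both ~1.4142135623730951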

Root-Finding Methods Compared

Compare bisection, Newton's method, and secant method on scalar root-finding problems.
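
One possible setup for such a comparison using SciPy's scalar solvers (the test function is an arbitrary classic choice; iteration counts come from the RootResults objects returned with full_output=True):

from scipy.optimize import bisect, brentq, newton

f = lambda x: x**3 - 2 * x - 5          # root near 2.0946
df = lambda x: 3 * x**2 - 2

# Bracketing methods: need a sign change on [2, 3]
_, r_bisect = bisect(f, 2, 3, full_output=True)
_, r_brent = brentq(f, 2, 3, full_output=True)

# Newton (with derivative) and secant (no derivative): need a starting guess
_, r_newton = newton(f, x0=2.5, fprime=df, full_output=True)
_, r_secant = newton(f, x0=2.5, full_output=True)

for name, r in [("bisection", r_bisect), ("brentq", r_brent),
                ("newton", r_newton), ("secant", r_secant)]:
    print(f"{name:10s} root={r.root:.12f}  iterations={r.iterations}")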

Supplementary code (ch08/python/root_finding.py): fsolve, root, brentq, fixed_point, and Anderson acceleration examples.

Quick Check

Which root-finding method is guaranteed to converge for a continuous function with a sign change on [a, b]?

Newton's method

Secant method

Bisection (brentq)

Fixed-point iteration

Common Mistake: fsolve Silently Returns Non-Solution

Mistake:

Using fsolve(f, x0) and assuming the returned value is a root. By default, fsolve does not raise an error if it fails to converge.

Correction:

Always check the convergence flag:

from scipy.optimize import fsolve

x, info, ier, msg = fsolve(f, x0, full_output=True)
if ier != 1:
    print(f"WARNING: {msg}")

Or use root which returns a result.success boolean.

Key Takeaway

For scalar roots, use brentq; for systems, use root. brentq is guaranteed to converge for bracketed continuous functions. For systems, root(method='hybr') is the default workhorse. Always provide the Jacobian when available for faster convergence, and always check result.success.

Historical Note: Newton-Raphson Method

1660s

Isaac Newton described his method for root-finding in De analysi (1669), but only for polynomial equations. Joseph Raphson independently published the general form in 1690. Thomas Simpson generalized it to systems of equations in 1740. The method remains the foundation of most modern nonlinear solvers, including SciPy's fsolve (which uses a modified Powell hybrid method combining Newton steps with dogleg trust regions).

Fixed Point

A point $\mathbf{x}^\star$ where $g(\mathbf{x}^\star) = \mathbf{x}^\star$. Solving $f(\mathbf{x}) = \mathbf{0}$ is equivalent to finding a fixed point of $g(\mathbf{x}) = \mathbf{x} - f(\mathbf{x})$.

Contraction Mapping

A function $g$ with Lipschitz constant $L < 1$: $\|g(\mathbf{x}) - g(\mathbf{y})\| \le L\|\mathbf{x} - \mathbf{y}\|$. The Banach theorem guarantees convergence of the iteration $\mathbf{x}_{k+1} = g(\mathbf{x}_k)$.

Related: Fixed Point

Jacobian Matrix

The $n \times n$ matrix of partial derivatives $J_{ij} = \partial f_i / \partial x_j$ for a vector-valued function $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^n$.

Related: Hessian