Chapter Summary
Key Points
1. Match the optimizer to your information. Use Nelder-Mead only when gradients are unavailable. BFGS is the default for smooth, unconstrained problems with gradients. Newton-CG and trust-ncg give quadratic convergence when Hessians are cheap. L-BFGS-B handles large-scale problems and box constraints with O(mn) memory. (See the first sketch after this list.)
2. Always supply analytical gradients. Finite-difference gradients cost O(n) extra function evaluations per step and introduce truncation error. Use scipy.optimize.check_grad to verify your gradient implementation. Providing the Hessian or Hessian-vector products further accelerates convergence. (Second sketch below.)
3. Use CVXPY for convex problems. CVXPY's DCP rules guarantee that your problem is convex at construction time. The modeling language separates problem formulation from solver selection. LASSO, SDP, and SOCP are all natural in CVXPY. For non-convex problems, fall back to SciPy with multiple restarts. (Third sketch below.)
4. Proximal operators are the building blocks of modern optimization. Soft-thresholding (L1 proximal) gives sparsity, projection (indicator proximal) gives feasibility, group soft-thresholding gives group sparsity, and TV proximal gives piecewise-constant denoising. ISTA converges at O(1/k); FISTA at O(1/k^2). (Fourth sketch below.)
5. For root finding, match the method to the problem structure. Use brentq for bracketed scalar roots (guaranteed convergence). Use root(method='hybr') for nonlinear systems. Always provide the Jacobian when available and always check convergence flags. The Banach fixed-point theorem provides the theoretical foundation for iterative methods. (Fifth sketch below.)
6. Constrained optimization requires careful attention to conventions. SciPy's SLSQP uses g(x) >= 0 for inequality constraints (opposite of many textbooks). Use trust-constr with NonlinearConstraint objects to avoid sign confusion. For linear programs, use linprog with the HiGHS solver. (Sixth sketch below.)
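
First sketch: the method-selection guidance from point 1, demonstrated on SciPy's built-in Rosenbrock test function. The starting point and bounds are arbitrary choices for illustration.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

x0 = np.array([-1.2, 1.0])

# Gradient-free: only when derivatives are unavailable.
res_nm = minimize(rosen, x0, method='Nelder-Mead')

# Default for smooth, unconstrained problems: BFGS with the analytic gradient.
res_bfgs = minimize(rosen, x0, jac=rosen_der, method='BFGS')

# Quadratic convergence when the Hessian is cheap to evaluate.
res_ncg = minimize(rosen, x0, jac=rosen_der, hess=rosen_hess, method='Newton-CG')

# Large-scale and box-constrained problems: L-BFGS-B.
res_lb = minimize(rosen, x0, jac=rosen_der, method='L-BFGS-B',
                  bounds=[(-2, 2), (-2, 2)])

for res in (res_nm, res_bfgs, res_ncg, res_lb):
    # Evaluation counts drop as more derivative information is supplied.
    print(res.nfev, res.x)
```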
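
Second sketch: verifying an analytic gradient with scipy.optimize.check_grad, which returns the 2-norm of the difference between your gradient and a finite-difference approximation. The test function and tolerance here are illustrative.

```python
import numpy as np
from scipy.optimize import check_grad

def f(x):
    return x[0]**2 * x[1] + np.exp(x[1])

def grad_f(x):
    # Analytic gradient of f, component by component.
    return np.array([2.0 * x[0] * x[1], x[0]**2 + np.exp(x[1])])

x0 = np.array([1.5, -0.3])
err = check_grad(f, grad_f, x0)
print(err)  # a correct gradient gives a small residual, on the order of 1e-7
assert err < 1e-5, "gradient implementation is likely wrong"
```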
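
Third sketch: LASSO as a minimal CVXPY model. The problem size, noise level, and regularization weight are arbitrary; the point is that the objective passes DCP verification at construction time, and solver selection stays separate from the model.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100)
x_true[:5] = rng.standard_normal(5)        # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(50)

x = cp.Variable(100)
lam = 0.1
# Least squares plus an L1 penalty: DCP-compliant, hence provably convex.
prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b) + lam * cp.norm1(x)))
prob.solve()
print(prob.status, int(np.sum(np.abs(x.value) > 1e-4)))  # few nonzeros survive
```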
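
Fourth sketch: soft-thresholding as the L1 proximal operator, dropped into minimal ISTA and FISTA loops for min_x 0.5*||Ax - b||^2 + lam*||x||_1. The step size 1/L uses the Lipschitz constant L = ||A||_2^2 of the smooth term's gradient; the iteration count is illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: shrink toward zero, inducing sparsity.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=500):
    # Proximal gradient descent: O(1/k) convergence.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
    return x

def fista(A, b, lam, n_iter=500):
    # Nesterov momentum on top of ISTA: O(1/k^2) convergence.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1]); y = x.copy(); t = 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```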
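
Fifth sketch: the two root-finding patterns from point 5, on toy problems (the scalar equation cos x = x and a small 2x2 system).

```python
import numpy as np
from scipy.optimize import brentq, root

# Bracketed scalar root: f changes sign on [0, 1], so convergence is guaranteed.
r = brentq(lambda x: np.cos(x) - x, 0.0, 1.0)

# Nonlinear system: supply the Jacobian and check the convergence flag.
def F(v):
    x, y = v
    return [x**2 + y**2 - 1.0, x - y]

def J(v):
    x, y = v
    return [[2.0 * x, 2.0 * y], [1.0, -1.0]]

sol = root(F, [1.0, 0.0], jac=J, method='hybr')
assert sol.success, sol.message   # never skip this check
print(r, sol.x)
```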
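
Sixth sketch: sidestepping the SLSQP sign convention with trust-constr and a NonlinearConstraint, plus a small linear program via HiGHS. The objective, the unit-disk constraint, and the LP data are illustrative.

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint, linprog

# lb <= c(x) <= ub: no g(x) >= 0 versus g(x) <= 0 ambiguity.
disk = NonlinearConstraint(lambda x: x[0]**2 + x[1]**2, -np.inf, 1.0)
res = minimize(lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2, [0.0, 0.0],
               method='trust-constr', constraints=[disk])

# Linear program: minimize c @ x subject to A_ub @ x <= b_ub and x >= 0.
lp = linprog(c=[-1.0, -2.0], A_ub=[[1.0, 1.0]], b_ub=[4.0],
             bounds=[(0, None), (0, None)], method='highs')
print(res.x, lp.x)
```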
Looking Ahead
Chapter 9 applies these optimization tools to interpolation and approximation: fitting curves, building surrogate models, and solving inverse problems. The regularization techniques from this chapter (LASSO, Tikhonov) reappear as tools for stabilizing ill-posed approximation problems.