References & Further Reading
References
- J. Nocedal and S. J. Wright, *Numerical Optimization*, 2nd ed., Springer, 2006
The standard graduate textbook on numerical optimization. Covers line search, trust-region methods, conjugate gradient, quasi-Newton methods (BFGS/L-BFGS), constrained optimization (SQP, interior-point), and large-scale methods. Essential reference for Sections 8.1-8.2.
- S. Boyd and L. Vandenberghe, *Convex Optimization*, Cambridge University Press, 2004
The foundational textbook on convex optimization. Covers duality, KKT conditions, LP, QP, SOCP, SDP, and interior-point methods. Free PDF available from the authors. Directly relevant to Sections 8.2-8.3.
- S. Diamond and S. Boyd, *CVXPY: A Python-Embedded Modeling Language for Convex Optimization*, Journal of Machine Learning Research, 17(83):1-5, 2016
The original CVXPY paper, describing the DCP verification system and the automatic problem transformation pipeline; a minimal usage sketch appears after this list.
- N. Parikh and S. Boyd, *Proximal Algorithms*, Foundations and Trends in Optimization, 1(3):127-239, 2014
Comprehensive survey of proximal operators, ADMM, and related algorithms. Includes a catalog of proximal operators for common functions; the soft-thresholding sketch after this list is its simplest entry. Essential reference for Section 8.4.
- A. Beck, *First-Order Methods in Optimization*, SIAM, 2017
Detailed treatment of gradient descent, proximal gradient (ISTA), accelerated methods (FISTA), and their convergence theory. Covers the proximal operators from Section 8.4.
- SciPy Community, *scipy.optimize — Optimization and Root Finding*, 2024
Official documentation for SciPy's optimization module, with detailed descriptions of `minimize`, `linprog`, `root`, `fsolve`, and all solver options; a minimal `minimize` call is sketched after this list.
- M. X. Goemans and D. P. Williamson, *Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming*, Journal of the ACM, 42(6):1115-1145, 1995
The landmark paper introducing SDP relaxation for MAX-CUT with the 0.878 approximation guarantee. One of the most influential results in combinatorial optimization.
- R. Tibshirani, *Regression Shrinkage and Selection via the Lasso*, Journal of the Royal Statistical Society B, 58(1):267-288, 1996
The original LASSO paper. Introduced L1-regularized regression for simultaneous estimation and variable selection.
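To ground the CVXPY entry above, here is a minimal sketch of a DCP-verified problem; the random data and the penalty weight are invented for illustration:

```python
import cvxpy as cp
import numpy as np

# Made-up data for a small L1-regularized least-squares problem.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)

x = cp.Variable(5)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b) + 0.1 * cp.norm1(x)))

assert problem.is_dcp()  # the DCP verification described in the paper
problem.solve()
print(x.value)
```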
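The soft-thresholding operator, the proximal operator of the L1 norm, is the simplest entry in the Parikh-Boyd catalog and the workhorse of the ISTA/FISTA methods treated by Beck; a NumPy sketch:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

print(soft_threshold(np.array([3.0, -0.5, 1.2]), 1.0))  # [ 2.  -0.   0.2]
```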
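Finally, the `minimize` entry point from the SciPy documentation, applied to the standard Rosenbrock test function; the choice of BFGS and the starting point are arbitrary:

```python
from scipy.optimize import minimize

def rosen(x):
    # Rosenbrock function: smooth, with a curved valley; minimum at (1, 1).
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

result = minimize(rosen, x0=[-1.2, 1.0], method="BFGS")
print(result.x, result.fun)  # approximately [1. 1.] and 0.0
```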
Further Reading
Interior-point methods for LP and SDP
S. Wright, *Primal-Dual Interior-Point Methods* (SIAM, 1997)
Deep dive into the algorithms behind modern LP and SDP solvers like HiGHS and MOSEK.
Global optimization
SciPy docs: `differential_evolution`, `basinhopping`, `dual_annealing`
When local methods are insufficient, these global optimizers explore the full search space.
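A sketch of one of these global optimizers on the Rastrigin function, a standard multimodal test problem; the bounds and seed are arbitrary choices for illustration:

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    # Many regularly spaced local minima; the global minimum is at the origin.
    x = np.asarray(x)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

result = differential_evolution(rastrigin, bounds=[(-5.12, 5.12)] * 2, seed=0)
print(result.x, result.fun)  # near [0, 0] with objective near 0
```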
Automatic differentiation for gradients
JAX documentation (jax.readthedocs.io)
JAX's `grad` and `hessian` eliminate the need to derive gradients by hand, enabling rapid prototyping of optimization problems.
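For instance, a toy quadratic-plus-sine function (chosen purely for illustration) differentiated without any hand-derived formulas:

```python
import jax.numpy as jnp
from jax import grad, hessian

def f(x):
    return jnp.sum(x**2) + jnp.sin(x[0])

x = jnp.array([1.0, 2.0])
print(grad(f)(x))     # [2 + cos(1), 4] = [2.540..., 4.0]
print(hessian(f)(x))  # [[2 - sin(1), 0], [0, 2]]
```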
Convex optimization in signal processing
D. P. Palomar and Y. C. Eldar (eds.), *Convex Optimization in Signal Processing and Communications* (Cambridge University Press, 2010)
Applies SOCP, SDP, and LASSO to beamforming, detection, and channel estimation, bridging this chapter to wireless communications.
Anderson acceleration theory
H. F. Walker and P. Ni, *Anderson Acceleration for Fixed-Point Iterations* (SIAM J. Numer. Anal., 2011)
Rigorous convergence analysis of Anderson mixing, explaining when and why it accelerates fixed-point iterations.
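SciPy ships an Anderson-mixing nonlinear solver, `scipy.optimize.anderson`; a sketch that finds the fixed point of cosine (the starting guess is arbitrary):

```python
import numpy as np
from scipy.optimize import anderson

# Solve F(x) = cos(x) - x = 0, i.e. the fixed point x = cos(x).
sol = anderson(lambda x: np.cos(x) - x, xin=np.array([0.5]))
print(sol)  # approximately 0.739, the Dottie number
```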