References & Further Reading
References
- J. Nocedal and S. J. Wright, *Numerical Optimization*, 2nd ed., Springer, 2006
The standard graduate textbook on numerical optimization. Covers line search, trust-region methods, conjugate gradient, quasi-Newton methods (BFGS/L-BFGS), constrained optimization (SQP, interior-point), and large-scale methods. Essential reference for Sections 8.1-8.2.
- S. Boyd and L. Vandenberghe, *Convex Optimization*, Cambridge University Press, 2004
The foundational textbook on convex optimization. Covers duality, KKT conditions, LP, QP, SOCP, SDP, and interior-point methods. Free PDF available from the authors. Directly relevant to Sections 8.2-8.3.
- S. Diamond and S. Boyd, *CVXPY: A Python-Embedded Modeling Language for Convex Optimization*, Journal of Machine Learning Research, 17(83):1-5, 2016
The original CVXPY paper, describing the DCP verification system and the automatic problem transformation pipeline; a minimal usage sketch appears after this list.
- N. Parikh and S. Boyd, *Proximal Algorithms*, Foundations and Trends in Optimization, 1(3):127-239, 2014
Comprehensive survey of proximal operators, ADMM, and related algorithms. Includes a catalog of proximal operators for common functions; the soft-thresholding sketch after this list is its simplest entry. Essential reference for Section 8.4.
- A. Beck, *First-Order Methods in Optimization*, SIAM, 2017
Detailed treatment of gradient descent, proximal gradient (ISTA), accelerated methods (FISTA), and their convergence theory. Covers the proximal operators from Section 8.4.
- SciPy Community, *scipy.optimize — Optimization and Root Finding*, 2024
Official documentation for SciPy's optimization module, with detailed descriptions of `minimize`, `linprog`, `root`, `fsolve`, and all solver options; a minimal `minimize` call is sketched after this list.
- M. X. Goemans and D. P. Williamson, *Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming*, Journal of the ACM, 42(6):1115-1145, 1995
The landmark paper introducing SDP relaxation for MAX-CUT with the 0.878 approximation guarantee. One of the most influential results in combinatorial optimization.
- R. Tibshirani, *Regression Shrinkage and Selection via the Lasso*, Journal of the Royal Statistical Society B, 58(1):267-288, 1996
The original LASSO paper. Introduced L1-regularized regression for simultaneous estimation and variable selection.
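To ground the CVXPY entry above, here is a minimal sketch of a DCP-verified problem; the random data and the penalty weight are invented for illustration:

```python
import cvxpy as cp
import numpy as np

# Made-up data for a small L1-regularized least-squares problem.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)

x = cp.Variable(5)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b) + 0.1 * cp.norm1(x)))

assert problem.is_dcp()  # the DCP verification described in the paper
problem.solve()
print(x.value)
```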
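The soft-thresholding operator, the proximal operator of the L1 norm, is the simplest entry in the Parikh-Boyd catalog and the workhorse of the ISTA/FISTA methods treated by Beck; a NumPy sketch:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

print(soft_threshold(np.array([3.0, -0.5, 1.2]), 1.0))  # [ 2.  -0.   0.2]
```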
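Finally, the `minimize` entry point from the SciPy documentation, applied to the standard Rosenbrock test function; the choice of BFGS and the starting point are arbitrary:

```python
from scipy.optimize import minimize

def rosen(x):
    # Rosenbrock function: smooth, with a curved valley; minimum at (1, 1).
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

result = minimize(rosen, x0=[-1.2, 1.0], method="BFGS")
print(result.x, result.fun)  # approximately [1. 1.] and 0.0
```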
Further Reading
Interior-point methods for LP and SDP
S. Wright, *Primal-Dual Interior-Point Methods* (SIAM, 1997)
Deep dive into the algorithms behind modern LP and SDP solvers like HiGHS and MOSEK.
Global optimization
SciPy docs: `differential_evolution`, `basinhopping`, `dual_annealing`
When local methods are insufficient, these global optimizers explore the full search space.
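A sketch of one of these global optimizers on the Rastrigin function, a standard multimodal test problem; the bounds and seed are arbitrary choices for illustration:

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    # Many regularly spaced local minima; the global minimum is at the origin.
    x = np.asarray(x)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

result = differential_evolution(rastrigin, bounds=[(-5.12, 5.12)] * 2, seed=0)
print(result.x, result.fun)  # near [0, 0] with objective near 0
```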
Automatic differentiation for gradients
JAX documentation (jax.readthedocs.io)
JAX's `grad` and `hessian` eliminate the need to derive gradients by hand, enabling rapid prototyping of optimization problems.
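For instance, a toy quadratic-plus-sine function (chosen purely for illustration) differentiated without any hand-derived formulas:

```python
import jax.numpy as jnp
from jax import grad, hessian

def f(x):
    return jnp.sum(x**2) + jnp.sin(x[0])

x = jnp.array([1.0, 2.0])
print(grad(f)(x))     # [2 + cos(1), 4] = [2.540..., 4.0]
print(hessian(f)(x))  # [[2 - sin(1), 0], [0, 2]]
```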
Convex optimization in signal processing
D. P. Palomar and Y. C. Eldar (eds.), *Convex Optimization in Signal Processing and Communications* (Cambridge University Press, 2010)
Applies SOCP, SDP, and LASSO to beamforming, detection, and channel estimation, bridging this chapter to wireless communications.
Anderson acceleration theory
H. F. Walker and P. Ni, *Anderson Acceleration for Fixed-Point Iterations* (SIAM J. Numer. Anal., 2011)
Rigorous convergence analysis of Anderson mixing, explaining when and why it accelerates fixed-point iterations.
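SciPy ships an Anderson-mixing nonlinear solver, `scipy.optimize.anderson`; a sketch that finds the fixed point of cosine (the starting guess is arbitrary):

```python
import numpy as np
from scipy.optimize import anderson

# Solve F(x) = cos(x) - x = 0, i.e. the fixed point x = cos(x).
sol = anderson(lambda x: np.cos(x) - x, xin=np.array([0.5]))
print(sol)  # approximately 0.739, the Dottie number
```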