References & Further Reading

References

  1. J. Nocedal and S. J. Wright, *Numerical Optimization*, Springer, 2006

    The standard graduate textbook on numerical optimization. Covers line search, trust region, conjugate gradient, quasi-Newton (BFGS/L-BFGS), constrained optimization (SQP, interior point), and large-scale methods. Essential reference for Sections 8.1-8.2.

  2. S. Boyd and L. Vandenberghe, *Convex Optimization*, Cambridge University Press, 2004

    The foundational textbook on convex optimization. Covers duality, KKT conditions, LP, QP, SOCP, SDP, and interior-point methods. Free PDF available from the authors. Directly relevant to Sections 8.2-8.3.

  3. S. Diamond and S. Boyd, *CVXPY: A Python-Embedded Modeling Language for Convex Optimization*, Journal of Machine Learning Research, 17(83):1-5, 2016

    The original CVXPY paper, describing the disciplined convex programming (DCP) verification system and the automatic problem-transformation pipeline. The first sketch following this list shows the DCP check in action.

  4. N. Parikh and S. Boyd, *Proximal Algorithms*, Foundations and Trends in Optimization, 1(3):127-239, 2014

    Comprehensive survey of proximal operators, ADMM, and related algorithms. Includes a catalog of proximal operators for common functions. Essential reference for Section 8.4.

  5. A. Beck, *First-Order Methods in Optimization*, SIAM, 2017

    Detailed treatment of gradient descent, proximal gradient (ISTA), accelerated methods (FISTA), and their convergence theory. Covers the proximal operators from Section 8.4; the ISTA sketch following this list spells out the basic iteration.

  6. SciPy Community, *scipy.optimize — Optimization and Root Finding*, 2024

    Official documentation for SciPy's optimization module, with detailed descriptions of `minimize`, `linprog`, `root`, `fsolve`, and their solver options. A minimal `minimize` call appears after this list.

  7. M. X. Goemans and D. P. Williamson, *Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming*, Journal of the ACM, 42(6):1115-1145, 1995

    The landmark paper introducing SDP relaxation for MAX-CUT with the 0.878 approximation guarantee. One of the most influential results in combinatorial optimization.

  8. R. Tibshirani, *Regression Shrinkage and Selection via the Lasso*, Journal of the Royal Statistical Society B, 58(1):267-288, 1996

    The original LASSO paper. Introduced L1-regularized regression for simultaneous estimation and variable selection.
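
To make the connection between entries 3 and 8 concrete, the sketch below writes the LASSO in CVXPY and runs the DCP check that the modeling layer performs before handing the problem to a solver. The data, problem size, and regularization weight are illustrative, not taken from either paper.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))   # synthetic design matrix
b = rng.standard_normal(50)         # synthetic response vector
lam = 0.1                           # illustrative regularization weight

x = cp.Variable(20)
objective = cp.Minimize(cp.sum_squares(A @ x - b) + lam * cp.norm1(x))
prob = cp.Problem(objective)

print(prob.is_dcp())     # True: the expression follows the DCP composition rules
prob.solve()             # CVXPY transforms the problem and calls a conic solver
print(np.sum(np.abs(x.value) > 1e-6), "nonzero coefficients")
```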
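
Entries 4 and 5 both build on the proximal gradient method. The sketch below is a bare-bones ISTA loop for a LASSO objective, using the best-known item in the proximal-operator catalog, soft-thresholding for the L1 norm; the step-size rule is standard, but the data and iteration count are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, n_iter=500):
    """Minimize (1/2)||Ax - b||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    t = 1.0 / np.linalg.norm(A, 2) ** 2      # step size 1/L, with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)             # gradient of the smooth part
        x = soft_threshold(x - t * grad, t * lam)
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
print(ista(A, b, lam=0.1))
```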
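
For entry 6, a minimal `minimize` call on SciPy's built-in Rosenbrock test function shows the basic interface; the starting point and the choice of BFGS are illustrative.

```python
import numpy as np
from scipy.optimize import minimize, rosen

result = minimize(rosen, x0=np.array([1.3, 0.7, 0.8, 1.9, 1.2]), method="BFGS")
print(result.x)                      # close to the all-ones global minimizer
print(result.success, result.nit)    # convergence flag and iteration count
```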

Further Reading

  • Interior-point methods for LP and SDP

    S. Wright, *Primal-Dual Interior-Point Methods* (SIAM, 1997)

    Deep dive into the algorithms behind modern LP solvers such as HiGHS and conic/SDP solvers such as MOSEK.

  • Global optimization

    SciPy docs: `differential_evolution`, `basinhopping`, `dual_annealing`

    When local methods stall in local minima, these stochastic global optimizers search the entire bounded region, at a higher computational cost (see the first sketch at the end of this section).

  • Automatic differentiation for gradients

    JAX documentation (jax.readthedocs.io)

    JAX's `grad` and `hessian` compute exact derivatives automatically, eliminating the need to derive gradients by hand and enabling rapid prototyping of optimization problems (see the final sketch at the end of this section).

  • Convex optimization in signal processing

    D. P. Palomar and Y. C. Eldar (eds.), *Convex Optimization in Signal Processing and Communications* (Cambridge, 2010)

    Applies SOCP, SDP, and LASSO to beamforming, detection, and channel estimation — bridges this chapter to wireless communications.

  • Anderson acceleration theory

    H. F. Walker and P. Ni, *Anderson Acceleration for Fixed-Point Iterations* (SIAM J. Numer. Anal., 2011)

    Rigorous convergence analysis of Anderson mixing, explaining when and why it accelerates fixed-point iterations.
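
To illustrate the global-optimization entry above, the sketch below runs `differential_evolution` on the Rastrigin function, a standard multimodal test problem; the objective, bounds, and dimension are illustrative choices, not taken from the SciPy documentation.

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    """Multimodal test function with many local minima; global minimum at the origin."""
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

bounds = [(-5.12, 5.12)] * 4              # only box bounds are needed, no starting point
result = differential_evolution(rastrigin, bounds)
print(result.x, result.fun)               # close to the origin, value near zero
```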
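
Finally, a minimal sketch of the automatic-differentiation pattern from the JAX entry: `jax.grad` supplies an exact gradient callback to `scipy.optimize.minimize`, so the solver does not fall back to finite differences. The Rosenbrock objective and the choice of BFGS are illustrative.

```python
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

jax.config.update("jax_enable_x64", True)   # use double precision for the optimizer

def objective(x):
    """Rosenbrock function written with jax.numpy so JAX can trace it."""
    return jnp.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

grad_fn = jax.grad(objective)               # exact gradient, no hand derivation

result = minimize(
    lambda x: float(objective(jnp.asarray(x))),
    x0=np.array([1.3, 0.7, 0.8]),
    jac=lambda x: np.asarray(grad_fn(jnp.asarray(x))),
    method="BFGS",
)
print(result.x)                             # converges to the minimizer at (1, 1, 1)
```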