References & Further Reading
References
- S. K. Lam, A. Pitrou, and S. Seibert, Numba: A LLVM-Based Python JIT Compiler, LLVM-HPC Workshop, SC15, 2015
The original Numba paper describing the architecture of the LLVM-based JIT compiler for NumPy-centric Python code. Covers type inference, the compilation pipeline, and GPU code generation.
- J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, JAX: Composable Transformations of Python+NumPy Programs, 2018
The JAX documentation and design paper. Describes the functional transformation approach (jit, grad, vmap, pmap) and the XLA compilation backend.
- G. M. Amdahl, Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, AFIPS Conference Proceedings, 1967
The seminal paper on parallel speedup limitations. Shows that the serial fraction of a program fundamentally limits the achievable speedup regardless of the number of processors.
- W. Jakob, J. Rhinelander, and D. Moldovan, pybind11 -- Seamless Operability Between C++11 and Python, 2017
The pybind11 documentation. Describes the header-only C++ library for creating Python bindings with automatic type conversion, NumPy support, and STL container handling.
- J. L. Gustafson, Reevaluating Amdahl's Law, Communications of the ACM, 1988
Introduces scaled speedup (Gustafson's Law), showing that parallel speedup can grow linearly if the problem size scales with the number of processors.
- C. Lattner and V. Adve, LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation, CGO, 2004
The foundational LLVM paper describing the modular compiler infrastructure that Numba uses for code generation.
- A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, Automatic Differentiation in Machine Learning: A Survey, JMLR, 2018
Comprehensive survey of automatic differentiation techniques including forward mode, reverse mode, and their implementations in modern ML frameworks.
Further Reading
Numba documentation and tutorials
Numba documentation (https://numba.readthedocs.io/)
The official Numba documentation covers the available decorators, the supported Python and NumPy features, CUDA programming, and performance tips.
JAX documentation
JAX documentation (https://jax.readthedocs.io/)
Comprehensive guide to JAX's functional transformations, XLA compilation, and the growing ecosystem (Flax, Optax, Haiku).
Cython for static compilation
K. W. Smith, *Cython*, O'Reilly, 2015
Cython offers an alternative to Numba: you annotate Python code with C types and compile it to a C extension. This gives more control than Numba but requires a separate compilation step.
High-performance Python
M. Gorelick and I. Ozsvald, *High Performance Python*, 2nd ed., O'Reilly, 2020
Covers profiling, Cython, Numba, multiprocessing, and distributed computing with a focus on practical optimization strategies.