References & Further Reading
References
- A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, et al., *PyTorch: An Imperative Style, High-Performance Deep Learning Library*, NeurIPS, 2019.
The foundational paper describing PyTorch's design philosophy: define-by-run autograd, eager execution, and the tensor abstraction. Essential reading for understanding why PyTorch works the way it does.
- A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, *Automatic Differentiation in Machine Learning: A Survey*, Journal of Machine Learning Research, 2018.
Comprehensive survey of forward-mode and reverse-mode automatic differentiation. Explains why reverse-mode (backpropagation) is optimal for scalar-output functions with many parameters.
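The key asymmetry the survey explains can be seen directly in PyTorch: for a scalar loss, one reverse-mode sweep (`backward()`) yields the gradient with respect to every parameter at once, whereas forward mode would need one pass per parameter. A minimal sketch:

```python
import torch

# Scalar loss of many parameters: a single reverse-mode pass fills in
# the gradient for all 10,000 parameters simultaneously.
params = torch.randn(10_000, requires_grad=True)
x = torch.randn(10_000)

loss = torch.sum((params - x) ** 2)   # scalar output
loss.backward()                       # one reverse sweep

# Analytic gradient of sum((p - x)^2) is 2 * (p - x).
assert torch.allclose(params.grad, 2 * (params - x))
```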
- D. H. Brandwood, *A Complex Gradient Operator and Its Application in Adaptive Array Theory*, IEE Proceedings F, 1983.
The paper that brought Wirtinger calculus to engineering. Shows that the conjugate Wirtinger derivative is the correct gradient for optimizing real-valued functions of complex variables.
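PyTorch's complex autograd can be checked against Brandwood's result numerically. For the real-valued loss |z|², the conjugate Wirtinger derivative ∂L/∂z̄ is z, and the gradient PyTorch returns for descent on the real and imaginary parts works out to 2z; a small sketch:

```python
import torch

# Real-valued loss of a complex parameter.
z = torch.tensor(3.0 + 4.0j, requires_grad=True)
loss = (z * z.conj()).real   # |z|^2 = x^2 + y^2
loss.backward()

# dL/dx + i*dL/dy = 2x + 2iy = 2z, i.e. twice the conjugate
# Wirtinger derivative dL/dz̄ = z.
assert torch.allclose(z.grad, 2 * z)
```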
- PyTorch Contributors, *torch.linalg Documentation*, 2025. [Link]
Official reference for all torch.linalg functions including SVD, eigendecomposition, solve, Cholesky, and their batched variants.
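A minimal sketch of the batched variants mentioned above: `torch.linalg` routines broadcast over leading batch dimensions, so a stack of small systems is solved in one call.

```python
import torch

# Batch of 8 well-conditioned 3x3 systems, solved in a single call.
A = torch.randn(8, 3, 3) + 3 * torch.eye(3)
b = torch.randn(8, 3)
x = torch.linalg.solve(A, b)   # solves A[i] @ x[i] = b[i] for each i

assert x.shape == (8, 3)
# Residual check: A[i] @ x[i] should reproduce b[i].
assert torch.allclose(torch.einsum('bij,bj->bi', A, x), b, atol=1e-4)
```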
- DMLC Community, *DLPack: Open In Memory Tensor Structure*, 2023. [Link]
The specification for the DLPack tensor exchange protocol. Describes the memory layout contract that enables zero-copy sharing between NumPy, PyTorch, CuPy, JAX, and TensorFlow.
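The zero-copy contract is easy to verify on CPU: the consumer wraps the producer's memory rather than copying it, so writes are visible on both sides. A sketch using NumPy and PyTorch:

```python
import numpy as np
import torch

a = np.arange(6, dtype=np.float32)
t = torch.from_dlpack(a)      # NumPy -> PyTorch, no copy on CPU

t[0] = 99.0                   # mutate through the torch view...
assert a[0] == 99.0           # ...and the NumPy array sees the change

back = np.from_dlpack(t)      # PyTorch -> NumPy, also zero-copy
assert back[0] == 99.0
```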
- Consortium for Python Data API Standards, *Python Array API Standard*, 2023. [Link]
The specification for a common array API across Python libraries. Enables backend-agnostic code that works with NumPy, PyTorch, CuPy, and JAX without modification.
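The idea can be sketched by passing the array namespace in explicitly, so the same function body runs against any library exposing the standard functions. (Production code would typically obtain the namespace via the standard's `__array_namespace__` protocol or the array-api-compat package rather than as a parameter; this is just an illustration.)

```python
import numpy as np

def standardize(x, xp):
    # xp is whichever array namespace the caller supplies
    # (numpy, torch, cupy, ...); only standard functions are used.
    return (x - xp.mean(x)) / xp.std(x)

data = np.array([1.0, 2.0, 3.0, 4.0])
out = standardize(data, np)
assert abs(float(np.mean(out))) < 1e-12   # standardized to zero mean
```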
Further Reading
PyTorch internals and the dispatcher
- E. Z. Yang, *PyTorch Internals* (blog series), 2019.
Deep dive into how PyTorch dispatches operations to different backends (CPU, CUDA, MPS) and how autograd integrates with the dispatcher. Invaluable for reasoning about performance.
Complex-valued neural networks
- C. Trabelsi et al., *Deep Complex Networks*, ICLR, 2018.
Extends deep learning to complex-valued parameters and activations. Uses the Wirtinger calculus framework that PyTorch implements in its complex autograd.
Differentiable programming for scientific computing
- M. Innes et al., *A Differentiable Programming System to Bridge Machine Learning and Scientific Computing*, arXiv:1907.07587, 2019.
Argues that automatic differentiation should be a first-class tool in scientific computing, not just deep learning. PyTorch's autograd is one implementation of this vision.
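A toy sketch of that vision: backpropagating through an explicit Euler integration of dx/dt = -k·x to obtain the sensitivity of the final state to the decay rate k, with no hand-derived adjoint.

```python
import math
import torch

k = torch.tensor(0.5, requires_grad=True)   # decay rate
x = torch.tensor(1.0)                       # initial state
dt, steps = 0.01, 100                       # integrate to T = 1

for _ in range(steps):
    x = x + dt * (-k * x)   # Euler step, recorded in the autograd graph

x.backward()                # d(x_final)/dk via reverse mode

# x_final = (1 - k*dt)^steps ~ exp(-k*T), so the sensitivity should be
# close to -T * exp(-k*T) = -exp(-0.5) at T = 1.
assert abs(float(k.grad) + math.exp(-0.5)) < 0.01
```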
GPU-accelerated linear algebra
- NVIDIA, *cuSOLVER Documentation*.
PyTorch's GPU linear algebra routines (SVD, eigh, solve) are built largely on cuSOLVER. Understanding its algorithms helps predict their performance and numerical behavior.
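A device-agnostic sketch: the same `torch.linalg.eigh` call is dispatched to a cuSOLVER-backed kernel on CUDA devices and to a LAPACK backend on CPU, so the snippet below runs either way (it falls back to CPU when no GPU is available).

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

A = torch.randn(4, 4, device=device)
A = A + A.T                     # eigh requires a symmetric matrix
w, V = torch.linalg.eigh(A)

# Eigenvalues come back in ascending order, and V diagonalizes A.
assert torch.all(w[:-1] <= w[1:])
assert torch.allclose(V @ torch.diag(w) @ V.T, A, atol=1e-4)
```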