Chapter Summary
Key Points
1. PyTorch tensors are NumPy arrays with GPU support and autograd. The API is deliberately similar, but the default dtype is float32 (not float64). Use torch.float64 explicitly for high-precision scientific computing. All operands must reside on the same device.
2. Autograd computes exact gradients via reverse-mode AD. Set requires_grad=True on parameters, call .backward() on a scalar loss, and read gradients from .grad. Gradients accumulate by default, so zero them between iterations. Use torch.no_grad() for inference and .detach() to sever graph connections (see the first sketch after this list).
3. Complex tensors use Wirtinger calculus. PyTorch returns the conjugate Wirtinger derivative, which is the steepest-descent direction for real-valued losses of complex parameters. This makes gradient descent on complex parameters work identically to the real case (second sketch below).
4. Batch your linear algebra. torch.linalg mirrors NumPy's API but adds batched operations over leading dimensions and GPU acceleration. Processing 10,000 matrices at once is vastly faster than looping; this is essential for MIMO-OFDM and Monte Carlo simulations (third sketch below).
5. Use zero-copy conversion between frameworks. torch.from_numpy and .numpy() share memory on CPU. DLPack enables zero-copy GPU sharing between PyTorch, CuPy, and JAX. The Array API standard lets you write framework-agnostic code (fourth sketch below).
6. Avoid in-place ops in autograd, NaN gradients from degenerate decompositions, and .numpy() on GPU tensors. These are the three most common sources of bugs when using PyTorch for scientific computing (fifth sketch below).
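The first sketch illustrates points 1 and 2 on a toy linear fit; the data, learning rate, and parameter names are invented for illustration. It shows explicit float64 tensors, requires_grad=True on the parameters, a scalar loss, .backward(), manual updates under torch.no_grad(), and zeroing of the accumulated gradients each step.

```python
import torch

# Explicit float64 (the default would be float32). Move both tensors to the
# same device, e.g. .to("cuda"), if you want the GPU version of this loop.
x = torch.linspace(0.0, 1.0, 100, dtype=torch.float64)
y_true = 3.0 * x + 2.0

w = torch.zeros((), dtype=torch.float64, requires_grad=True)
b = torch.zeros((), dtype=torch.float64, requires_grad=True)

for _ in range(200):
    loss = ((w * x + b - y_true) ** 2).mean()   # scalar loss
    loss.backward()                             # exact reverse-mode gradients in w.grad, b.grad
    with torch.no_grad():                       # keep the parameter update out of the graph
        w -= 0.5 * w.grad
        b -= 0.5 * b.grad
    w.grad.zero_()                              # gradients accumulate; zero them each iteration
    b.grad.zero_()

print(w.item(), b.item())                       # approaches 3.0 and 2.0
```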
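The second sketch applies the same update rule to a complex parameter: because .grad holds the conjugate Wirtinger derivative of the real-valued loss, plain gradient descent works unchanged. The toy channel vector h and target z_true are assumptions made up for illustration.

```python
import torch

# Toy setup: observations y = h * z_true with a complex unknown z.
h = torch.tensor([1.0 + 0.0j, 0.5 - 0.5j, -0.3 + 0.8j], dtype=torch.complex128)
z_true = torch.tensor(1.5 - 0.5j, dtype=torch.complex128)
y = h * z_true

z = torch.zeros((), dtype=torch.complex128, requires_grad=True)
for _ in range(200):
    loss = (h * z - y).abs().pow(2).sum()    # real-valued scalar loss of a complex parameter
    loss.backward()
    with torch.no_grad():
        z -= 0.1 * z.grad                    # same update form as the real-valued case
    z.grad.zero_()

print(z.detach())                            # approaches (1.5 - 0.5j)
```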
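The third sketch shows batched linear algebra: torch.linalg.solve broadcasts over the leading batch dimension, so 10,000 small systems are solved in one call (and on the GPU if the tensors are moved there first). The 4x4 complex matrices are an illustrative stand-in for a MIMO-style workload, not data from the chapter.

```python
import torch

batch, n = 10_000, 4
H = torch.randn(batch, n, n, dtype=torch.complex64)    # 10,000 small "channel" matrices
y = torch.randn(batch, n, 1, dtype=torch.complex64)    # matching right-hand sides

x = torch.linalg.solve(H, y)        # one batched solve instead of a Python loop
print(x.shape)                      # torch.Size([10000, 4, 1])
print((H @ x - y).abs().max())      # small residual confirms the batched solve
```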
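The fourth sketch demonstrates the CPU zero-copy bridge of point 5: torch.from_numpy and .numpy() expose the same buffer, so a write through one view is visible through the other. The DLPack GPU path needs CuPy or JAX installed, so it is only noted in a comment.

```python
import numpy as np
import torch

a = np.zeros(4, dtype=np.float64)
t = torch.from_numpy(a)          # tensor view of the NumPy buffer, no copy
t[0] = 7.0                       # write through the tensor...
print(a)                         # ...and NumPy sees it: [7. 0. 0. 0.]

b = t.numpy()                    # back to NumPy, still the same memory
print(np.shares_memory(a, b))    # True

# On GPU, torch.from_dlpack plays the analogous zero-copy role for CuPy/JAX arrays.
```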
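The fifth sketch reproduces two of point 6's pitfalls: calling .numpy() on a tensor that tracks gradients (the fix is .detach().cpu().numpy(), with .cpu() needed when the tensor lives on the GPU), and modifying a tensor in place after autograd has saved it for the backward pass. The NaN-gradient case requires a degenerate decomposition and is omitted; the values here are arbitrary.

```python
import torch

w = torch.ones(3, requires_grad=True)
# w.numpy() raises here; detach first (and move to CPU first if w were on the GPU):
w_np = w.detach().cpu().numpy()

x = torch.ones(3, requires_grad=True)
y = torch.exp(x)        # exp saves its output for the backward pass
y += 1                  # in-place edit of that saved tensor
try:
    y.sum().backward()
except RuntimeError as err:
    print("autograd error:", err)   # "...modified by an inplace operation..."
```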
Looking Ahead
Chapter 13 moves to SciPy's optimization module, where PyTorch's autograd can provide exact gradients to optimizers that traditionally require finite-difference approximations. The interoperability patterns from Section 12.5 let you combine SciPy's solvers with PyTorch's GPU-accelerated gradient computation.
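As a preview of that combination, the sketch below (an illustration under assumed names, not code from Chapter 13) wraps a PyTorch-computed loss so that scipy.optimize.minimize receives exact autograd gradients via jac=True instead of finite-difference approximations; the Rosenbrock objective is just a stand-in.

```python
import numpy as np
import torch
from scipy.optimize import minimize

def value_and_grad(x_np):
    # Copy SciPy's iterate into a grad-tracking float64 tensor.
    x = torch.tensor(x_np, requires_grad=True)
    loss = 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
    loss.backward()
    return loss.item(), x.grad.numpy()

# jac=True tells SciPy the callable returns (value, gradient): no finite differences.
result = minimize(value_and_grad, x0=np.array([-1.2, 1.0]), jac=True, method="BFGS")
print(result.x)   # converges to [1., 1.]
```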