## Prerequisites & Notation

### Before You Begin
This chapter assumes familiarity with NumPy array manipulation (Chapter 5),
basic linear algebra (Chapter 6), and gradient-based optimisation concepts
(Chapter 8). You should also have a working PyTorch installation
(`pip install torch`).
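A quick smoke test for the installation (the version string and CUDA result will vary with your setup):

```python
import torch

print(torch.__version__)          # prints the installed PyTorch version
print(torch.cuda.is_available())  # True only if a usable CUDA GPU is present
```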
- NumPy arrays, broadcasting, and dtypes (Chapter 5).
  Self-check: Can you reshape, slice, and broadcast NumPy arrays confidently?
- Linear algebra: matrix-vector products and eigendecomposition (Chapter 6).
  Self-check: Can you compute a matrix-vector product in NumPy?
- Python classes, inheritance, and the `__init__`/`__call__` protocols (Chapter 3).
  Self-check: Can you write a class that calls `super().__init__()` and overrides a method?
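If any of these self-checks feels shaky, the short sketch below exercises all three skills; the names and values are illustrative only.

```python
import numpy as np

# Chapter 5: reshape, slice, and broadcast
a = np.arange(12).reshape(3, 4)            # flat array reshaped to 3x4
col = a[:, 1]                              # slice: second column, shape (3,)
scaled = a * np.array([1, 10, 100, 1000])  # broadcast a (4,) row over (3, 4)

# Chapter 6: matrix-vector product
W = np.random.randn(4, 3)
x = np.random.randn(3)
y = W @ x                                  # matrix-vector product, shape (4,)

# Chapter 3: inheritance, super().__init__(), and method overriding
class Model:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"Model({self.name})"

class LinearModel(Model):
    def __init__(self, name, n_features):
        super().__init__(name)             # initialise the parent class
        self.n_features = n_features

    def describe(self):                    # override the parent method
        return f"LinearModel({self.name}, {self.n_features} features)"

print(scaled.shape, y.shape, LinearModel("demo", 3).describe())
```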
### Notation for This Chapter
Symbols and conventions used throughout this chapter.
| Symbol | Meaning | Introduced |
|---|---|---|
| $W$, $b$ | Weight matrix and bias vector of a linear layer | s01 |
| $\sigma$ | Activation function (ReLU, sigmoid, etc.) | s01 |
| $L$ | Loss function measuring prediction error | s02 |
| $\eta$ | Learning rate | s02 |
| $\theta$ | Collective model parameters | s01 |
| $\nabla_\theta L$ | Gradient of loss with respect to parameters | s02 |
| $\hat{y}$ | Model prediction (network output) | s02 |
| $B$ | Mini-batch size | s04 |
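To connect the notation to code, here is a minimal sketch of one gradient-descent step in PyTorch; the layer sizes, the loss choice (mean squared error), and the values of $B$ and $\eta$ are illustrative, not commitments made elsewhere in the chapter.

```python
import torch
import torch.nn as nn

B = 32                               # B: mini-batch size
eta = 0.01                           # eta: learning rate
layer = nn.Linear(10, 1)             # W = layer.weight, b = layer.bias
sigma = torch.relu                   # sigma: activation function

x = torch.randn(B, 10)               # one mini-batch of inputs
target = torch.randn(B, 1)

y_hat = sigma(layer(x))              # y_hat: model prediction
L = ((y_hat - target) ** 2).mean()   # L: loss (mean squared error here)

L.backward()                         # fills theta.grad with nabla_theta L
with torch.no_grad():
    for theta in layer.parameters(): # theta: all model parameters
        theta -= eta * theta.grad    # one gradient-descent update
        theta.grad = None            # clear grads for the next step
```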