Prerequisites & Notation
Before You Begin
This chapter assumes familiarity with nn.Module (Chapter 26) and an understanding of sequence modelling (Chapter 29).
- nn.Module and training (Chapter 26; review ch26)
  Self-check: Can you implement and train custom modules?
- Sequence models and hidden states (Chapter 29; review ch29)
  Self-check: Do you understand encoder-decoder architectures and sequence processing?
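As a rough gauge for the first self-check, the sketch below defines and trains a very small custom module. The names `TinyRegressor` and `train`, the synthetic data, and the hyperparameters are all hypothetical and chosen only for illustration. If every line reads as familiar, the Chapter 26 material is probably in place.

```python
import torch
from torch import nn


class TinyRegressor(nn.Module):
    """Hypothetical one-layer model, used only for this self-check."""

    def __init__(self, in_features: int = 1, out_features: int = 1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
          steps: int = 200, lr: float = 0.1) -> list:
    """Run full-batch SGD and return the loss recorded at each step."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(x), y)    # forward pass and loss
        loss.backward()                # backpropagate
        optimizer.step()               # update parameters
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)  # inputs, shape (64, 1)
y = 2.0 * x + 1.0                               # synthetic linear target
model = TinyRegressor()
losses = train(model, x, y)
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

The loop follows the usual zero-grad, forward, backward, step pattern; the attention layers in this chapter are trained the same way, only with larger modules.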
Notation for This Chapter
| Symbol | Meaning | Introduced |
|---|---|---|
| $Q$, $K$, $V$ | Query, key, value matrices | s01 |
| $d_k$ | Dimension of key/query vectors | s01 |
| $d_{\text{model}}$ | Model embedding dimension | s02 |
| $h$ | Number of attention heads | s01 |
| | Number of transformer layers (blocks) | s02 |
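To make the quantities in the table concrete, here is a minimal sketch of how they typically relate. It assumes the common convention that the model embedding dimension is split evenly across attention heads, so the key/query dimension equals the model dimension divided by the number of heads; the chapter may define this differently. The class name `TransformerDims`, the method `qkv_shape`, and the numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerDims:
    """Hypothetical container for the sizes listed in the notation table."""

    d_model: int   # model embedding dimension
    n_heads: int   # number of attention heads
    n_layers: int  # number of transformer layers (blocks)

    @property
    def d_k(self) -> int:
        # Assumption: d_model is divided evenly across the heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads

    def qkv_shape(self, seq_len: int) -> tuple:
        # Per-head shape of each query, key, or value matrix
        # under the even-split assumption above.
        return (seq_len, self.d_k)


# Illustrative values only.
dims = TransformerDims(d_model=512, n_heads=8, n_layers=6)
print(dims.d_k)            # 64
print(dims.qkv_shape(10))  # (10, 64)
```

Checking divisibility up front is a small design choice: it surfaces a misconfigured head count when the sizes are declared rather than as a reshape error deep inside a forward pass.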