Prerequisites & Notation
Prerequisites
This chapter builds on:
- NLP Foundations (Chapter 34): tokenization, embeddings, attention
- Deep learning (Chapters 26, 30): PyTorch, training loops, RNNs
- Transformer basics (Chapter 32): encoder-decoder architecture
We go deep into how GPT-family models work, how they are trained at scale, and how RLHF aligns them with human preferences.
Notation for This Chapter
| Symbol | Meaning |
|---|---|
| $L$ | Number of transformer layers |
| $d_{\text{model}}$ | Hidden dimension |
| $n_{\text{head}}$ | Number of attention heads |
| $d_{\text{head}} = d_{\text{model}} / n_{\text{head}}$ | Per-head dimension |
| $N$ | Number of model parameters |
| $D$ | Dataset size (tokens) |
| $C$ | Compute budget (FLOPs) |
| $\theta$ | Model parameters |
| $\pi_\theta$ | Policy (the LLM as a policy) |
| $r_\phi$ | Reward model |
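To make the notation concrete, here is a minimal Python sketch that ties several of these symbols together. The `ModelConfig` helper and the numeric values (roughly GPT-2-small scale) are illustrative assumptions, not definitions from this chapter; the $C \approx 6ND$ estimate is the widely used rule of thumb for training FLOPs.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_layers: int   # L: number of transformer layers
    d_model: int    # hidden dimension
    n_heads: int    # number of attention heads

    @property
    def d_head(self) -> int:
        # Per-head dimension: the hidden dimension is split evenly across heads.
        return self.d_model // self.n_heads


def approx_training_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb training compute: C ~ 6 * N * D FLOPs."""
    return 6.0 * n_params * n_tokens


# Illustrative values only (roughly GPT-2 small); not taken from this chapter.
cfg = ModelConfig(n_layers=12, d_model=768, n_heads=12)
print(cfg.d_head)                                    # 64
print(f"{approx_training_flops(124e6, 10e9):.2e}")   # ~7.44e+18 FLOPs
```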