Prerequisites & Notation
Prerequisites
This chapter assumes familiarity with:
- NumPy and linear algebra (Chapters 5-6): matrix operations, SVD
- PyTorch fundamentals (Chapter 26): tensors, autograd, nn.Module
- Recurrent networks (Chapter 30): RNNs, LSTMs, sequence modeling
We introduce NLP from the ground up, starting with how to represent text as numbers, progressing through embeddings, and culminating in language modeling with both RNNs and transformers.
Definition: Notation for This Chapter
| Symbol | Meaning |
|---|---|
| $\mathcal{V}$ | Vocabulary (set of all tokens) |
| $\lvert\mathcal{V}\rvert$ | Vocabulary size |
| $d$ | Embedding dimension |
| $E \in \mathbb{R}^{\lvert\mathcal{V}\rvert \times d}$ | Embedding matrix |
| $\mathbf{e}_w$ | Embedding vector for word $w$ |
| $T$ | Sequence length |
| $x_t$ | Token at position $t$ |
| $P(x_{t+1} \mid x_1, \dots, x_t)$ | Next-token probability |
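To make the notation concrete, here is a minimal NumPy sketch (using the linear-algebra background from Chapters 5-6) of an embedding matrix and a lookup. The specific sizes and token ids are toy values chosen for illustration, not from the chapter.

```python
import numpy as np

# Toy values for the notation above (illustrative assumptions).
V = 10  # vocabulary size
d = 4   # embedding dimension
T = 5   # sequence length

rng = np.random.default_rng(0)
E = rng.normal(size=(V, d))  # embedding matrix, one d-dimensional row per token

tokens = np.array([3, 1, 4, 1, 5])  # token ids x_1, ..., x_T
embedded = E[tokens]                # lookup: row E[x_t] for each position t
print(embedded.shape)               # (T, d) -> (5, 4)
```

Indexing `E` with an array of token ids is the NumPy analogue of an embedding-layer lookup: each position in the sequence is replaced by its d-dimensional embedding vector.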