Prerequisites & Notation

Prerequisites

This chapter assumes familiarity with:

  • NumPy and linear algebra (Chapters 5-6): matrix operations, SVD
  • PyTorch fundamentals (Chapter 26): tensors, autograd, nn.Module
  • Recurrent networks (Chapter 30): RNNs, LSTMs, sequence modeling

We introduce NLP from the ground up, starting with how to represent text as numbers, progressing through embeddings, and culminating in language modeling with both RNNs and transformers.

Definition: Notation for This Chapter

Symbol                                    Meaning
$\mathcal{V}$                             Vocabulary (set of all tokens)
$V = |\mathcal{V}|$                       Vocabulary size
$d$                                       Embedding dimension
$\mathbf{E} \in \mathbb{R}^{V \times d}$  Embedding matrix
$\mathbf{e}_w \in \mathbb{R}^d$           Embedding vector for word $w$
$T$                                       Sequence length
$w_t$                                     Token at position $t$
$P(w_t \mid w_{<t})$                      Next-token probability
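To make the notation concrete, here is a minimal NumPy sketch (NumPy per the Chapter 5-6 prerequisite) of how these symbols map onto array shapes; the specific values of $V$, $d$, and $T$ are illustrative, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

V = 10   # vocabulary size V = |V|  (toy value)
d = 4    # embedding dimension d   (toy value)
T = 5    # sequence length T       (toy value)

# Embedding matrix E in R^{V x d}: one row per token in the vocabulary.
E = rng.standard_normal((V, d))

# A sequence of T token indices w_1, ..., w_T.
tokens = rng.integers(0, V, size=T)

# Row lookup E[w] gives e_w; indexing with the whole sequence yields (T, d).
embedded = E[tokens]
print(embedded.shape)  # (5, 4)

# A toy next-token distribution P(w_t | w_{<t}): softmax over V logits,
# so the probabilities sum to 1.
logits = rng.standard_normal(V)
probs = np.exp(logits) / np.exp(logits).sum()
```

Every model in this chapter, RNN or transformer, ultimately produces such a length-$V$ probability vector at each position $t$.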