Prerequisites & Notation

This chapter assumes you are comfortable with PyTorch's nn.Module and with writing training loops (Chapter 26). A background in probability and statistics (Chapter 9) is helpful.

PyTorch nn.Module and training (Chapter 26)
Self-check: Can you train a neural network with PyTorch?
Probability and expectation (Chapter 9)
Self-check: Do you understand expected value and conditional probability?

Symbol	Meaning	Introduced
$\mathcal{S}, \mathcal{A}$	State space, action space	s01
$\pi(a \mid s)$	Policy: probability of action $a$ in state $s$	s01
$Q(s, a)$	Action-value function	s01
$V(s)$	State-value function	s01
$\gamma$	Discount factor	s01
$R_t$	Reward at time $t$	s01
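
As a quick check on the notation, the sketch below ties the table to the expected-value prerequisite. It assumes the standard identity $V(s) = \sum_a \pi(a \mid s)\, Q(s, a)$ and a finite discounted sum $\sum_t \gamma^t R_t$ with rewards indexed from $t = 0$. The chapter may index rewards differently. The toy state, action names, and numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical toy example: a single state with three actions.
policy = {"left": 0.2, "stay": 0.5, "right": 0.3}    # pi(a|s); probabilities sum to 1
q_values = {"left": 1.0, "stay": 0.0, "right": 2.0}  # Q(s, a) for the same state


def state_value(policy, q_values):
    """Expected action value under the policy: sum over a of pi(a|s) * Q(s, a)."""
    return sum(prob * q_values[action] for action, prob in policy.items())


def discounted_return(rewards, gamma):
    """Finite discounted sum: gamma**t * R_t summed over t = 0, 1, 2, ...

    Assumes rewards[0] is the reward at t = 0 (an indexing assumption).
    """
    return sum(gamma ** t * r for t, r in enumerate(rewards))


v = state_value(policy, q_values)                    # weighted average of Q values
g = discounted_return([1.0, 1.0, 1.0], gamma=0.9)    # 1 + 0.9 + 0.81
print(v, g)
```

If computing `v` by hand as a weighted average feels natural, the probability background from Chapter 9 is likely sufficient for this chapter.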