Prerequisites & Notation

Before You Begin

This chapter assumes you are comfortable with PyTorch's nn.Module and with writing training loops (Chapter 26). A background in probability and statistics (Chapter 9) is helpful.

  • PyTorch nn.Module and training (Chapter 26)

    Self-check: Can you train a neural network with PyTorch?

  • Probability and expectation (Chapter 9)

    Self-check: Do you understand expected value and conditional probability?

Notation for This Chapter

Symbol                        Meaning                                            Introduced
$\mathcal{S}, \mathcal{A}$    State space, action space                          s01
$\pi(a|s)$                    Policy: probability of action $a$ in state $s$     s01
$Q(s, a)$                     Action-value function                              s01
$V(s)$                        State-value function                               s01
$\gamma$                      Discount factor                                    s01
$R_t$                         Reward at time $t$                                 s01
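
To make the notation concrete, here is a minimal sketch of how each symbol might map onto a plain Python object. Everything in it is an assumption made for illustration: the two states, the two actions, the probabilities, the Q-values, and the reward sequence are invented, and the helper names state_value and discounted_return are hypothetical. The return is indexed as the sum of $\gamma^k R_{t+k}$ starting at $k = 0$, which is one common convention; the chapter itself may index rewards differently.

```python
# Hypothetical two-state, two-action example; all numbers are made up.
states = ["s0", "s1"]          # state space S
actions = ["left", "right"]    # action space A

# Policy pi(a|s): for each state, a probability distribution over actions.
policy = {
    "s0": {"left": 0.5, "right": 0.5},
    "s1": {"left": 0.25, "right": 0.75},
}

# Action-value table Q(s, a), filled with arbitrary illustrative values.
q_values = {
    "s0": {"left": 1.0, "right": 3.0},
    "s1": {"left": 0.0, "right": 4.0},
}


def state_value(state):
    """V(s) computed as the expectation of Q(s, a) under pi(a|s)."""
    return sum(policy[state][a] * q_values[state][a] for a in actions)


def discounted_return(rewards, gamma):
    """Sum of gamma**k * R_{t+k} over a finite reward sequence (k starts at 0)."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))


v = {s: state_value(s) for s in states}
g = discounted_return([1.0, 0.0, 2.0], gamma=0.5)

print(v)  # state values under the illustrative policy
print(g)  # discounted return of the illustrative reward sequence
```

The state_value helper relies only on the expected-value idea from the prerequisites: it weights each Q(s, a) by the probability that the policy assigns to that action.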