References & Further Reading

References

  1. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018

    The definitive RL textbook, covering topics from multi-armed bandits to policy gradient methods.

  2. V. Mnih et al., Human-Level Control Through Deep Reinforcement Learning, Nature, 2015

    The DQN paper: a deep Q-network that learns to play Atari games directly from pixels.

  3. J. Schulman et al., Proximal Policy Optimization Algorithms, arXiv:1707.06347, 2017

    PPO: a simple, stable policy gradient method built on a clipped surrogate objective.

  4. N. C. Luong et al., Applications of Deep Reinforcement Learning in Communications and Networking: A Survey, IEEE Communications Surveys & Tutorials, 2019

    Survey of deep RL (DRL) for wireless networks: power control, scheduling, and resource allocation.

Further Reading

  • Spinning Up in Deep RL

    https://spinningup.openai.com

    Excellent educational resource with clean implementations of RL algorithms.

  • Stable-Baselines3

    https://stable-baselines3.readthedocs.io

    Production-quality PyTorch implementations of PPO, DQN, and SAC.