Tutorials
Bandit Tutorials
  Multi Armed Bandit Overview
  Contextual Bandits Overview
  UCB
  Thompson Sampling
  Bayesian
  Gradients
  Linear Posterior Inference
  Variational Inference
  Bootstrap
  Parameter Noise Sampling
  Adding a new Data Bandit
  Adding a new Deep Contextual Bandit Agent
Classical
  Q-Learning using GenRL
  SARSA using GenRL
Deep RL Tutorials
  Deep Reinforcement Learning Background
  Vanilla Policy Gradient
  Advantage Actor Critic
  Proximal Policy Optimization
Custom Policy Networks
Using A2C
  Using A2C on “CartPole-v0”
  Using A2C on atari env - “Pong-v0”
Vanilla Policy Gradient (VPG)
  VPG agent on a Cartpole Environment
  VPG agent on an Atari Environment