Tutorials

  • Bandit Tutorials
    • Multi Armed Bandit Overview
    • Contextual Bandits Overview
    • UCB
    • Thompson Sampling
    • Bayesian
    • Gradients
    • Linear Posterior Inference
    • Variational Inference
    • Bootstrap
    • Parameter Noise Sampling
    • Adding a new Data Bandit
    • Adding a new Deep Contextual Bandit Agent
  • Classical
    • Q-Learning using GenRL
    • SARSA using GenRL
  • Deep RL Tutorials
    • Deep Reinforcement Learning Background
    • Vanilla Policy Gradient
    • Advantage Actor Critic
    • Proximal Policy Optimization
    • Deep Q-Networks (DQN)
    • Double Deep Q-Network
    • Dueling Deep Q-Network
    • Deep Q Networks with Noisy Nets
    • Prioritized Deep Q-Networks
    • Deep Deterministic Policy Gradients
    • Twin Delayed DDPG
    • Soft Actor-Critic
    • Categorical Deep Q-Networks
  • Custom Policy Networks
  • Using A2C
    • Using A2C on “CartPole-v0”
    • Using A2C on Atari env - “Pong-v0”
  • Using Shared Parameters in Actor Critic Agents in GenRL
  • Vanilla Policy Gradient (VPG)
    • VPG agent on a Cartpole Environment
    • VPG agent on an Atari Environment
  • Saving and Loading Weights and Hyperparameters with GenRL

© Copyright 2020, Society for Artificial Intelligence and Deep Learning (SAiDL) Revision ce767e43.
