

genrl.environments.action_wrappers module

class genrl.environments.action_wrappers.ClipAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv])[source]

Bases: gym.core.ActionWrapper

Action Wrapper to clip actions

Parameters:env (object) – The environment whose actions need to be clipped
action(action: numpy.ndarray) → numpy.ndarray[source]
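
As a rough illustration of what clipping an action means, the sketch below clamps each component of an action into per-dimension bounds. The helper `clip_action` and the bounds are hypothetical and written in plain Python; this is not GenRL's implementation, which is assumed to take the bounds from the wrapped environment's action space.

```python
def clip_action(action, low, high):
    """Clamp each action component into [low[i], high[i]]."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


# Hypothetical 2-D action space bounded by [-1, 1] in each dimension
clipped = clip_action([1.7, -0.3], low=[-1.0, -1.0], high=[1.0, 1.0])
print(clipped)  # [1.0, -0.3]
```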
class genrl.environments.action_wrappers.RescaleAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv], low: int, high: int)[source]

Bases: gym.core.ActionWrapper

Action Wrapper to rescale actions

Parameters:
  • env (object) – The environment whose actions need to be rescaled
  • low (int) – Lower limit of action
  • high (int) – Upper limit of action
action(action: numpy.ndarray) → numpy.ndarray[source]
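
The sketch below shows one plausible reading of rescaling, assuming that actions in [low, high] are mapped linearly onto the bounds of the wrapped environment's action space. The helper `rescale_action` and the arguments `env_low` and `env_high` are hypothetical names used only for illustration.

```python
def rescale_action(action, low, high, env_low, env_high):
    """Map each component linearly from [low, high] onto [env_low, env_high]."""
    scale = (env_high - env_low) / (high - low)
    return [env_low + (a - low) * scale for a in action]


# Agent outputs lie in [-1, 1]; the hypothetical environment expects [0, 10]
rescaled = rescale_action([-1.0, 0.0, 1.0], low=-1, high=1, env_low=0.0, env_high=10.0)
print(rescaled)  # [0.0, 5.0, 10.0]
```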

genrl.environments.atari_preprocessing module

class genrl.environments.atari_preprocessing.AtariPreprocessing(env: gym.core.Env, frameskip: Union[Tuple, int] = (2, 5), grayscale: bool = True, screen_size: int = 84)[source]

Bases: gym.core.Wrapper

Image preprocessing for Gym Atari environments. Implements: 1) frameskip, 2) grayscale conversion, 3) downsampling to a square image

Parameters:
  • env (Gym Environment) – Atari environment
  • frameskip (tuple or int) – Number of frames between actions. E.g. frameskip=4 means 1 action is taken for every 4 frames. If a tuple is given, the frameskip is non-deterministic and a random number is chosen from that range, e.g. (2, 5)
  • grayscale (boolean) – Whether or not the output should be converted to grayscale
  • screen_size (int) – Size of the output screen (square output)
reset() → numpy.ndarray[source]

Resets state of environment

Returns:Initial state
Return type:NumPy array
step(action: numpy.ndarray) → numpy.ndarray[source]

Step through Atari environment for given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Current state, reward (accumulated over frameskip number of steps), done, info
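
The frameskip behaviour described above can be sketched as follows. `CountingEnv` and `skip_step` are hypothetical stand-ins, and the choice of an exclusive upper bound for the random frameskip is an assumption; the real wrapper may sample differently.

```python
import random


class CountingEnv:
    """Hypothetical stand-in environment: reward 1.0 per frame, done after 10 frames."""

    def __init__(self):
        self.frames = 0

    def step(self, action):
        self.frames += 1
        done = self.frames >= 10
        return self.frames, 1.0, done, {}


def skip_step(env, action, frameskip=(2, 5)):
    """Repeat `action` for a number of frames and sum the rewards."""
    if isinstance(frameskip, int):
        n_frames = frameskip
    else:
        # Assumption: the upper bound is exclusive, as with numpy.random.randint
        n_frames = random.randrange(frameskip[0], frameskip[1])
    total_reward = 0.0
    for _ in range(n_frames):
        state, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # Stop early if the episode ends in the middle of the skip
            break
    return state, total_reward, done, info


state, reward, done, info = skip_step(CountingEnv(), action=0, frameskip=4)
print(state, reward, done)  # 4 4.0 False
```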

genrl.environments.atari_wrappers module

class genrl.environments.atari_wrappers.FireReset(env: gym.core.Env)[source]

Bases: gym.core.Wrapper

Some Atari environments do nothing until a specific action (the fire action) is taken, so this wrapper takes that action before the training process starts

Parameters:env (Gym Environment) – Atari environment
reset() → numpy.ndarray[source]

Resets state of environment, then takes the fire action so that the game actually starts

Returns:Initial state
Return type:NumPy array
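
A minimal sketch of taking the fire action on reset, as the class description suggests. `FireStubEnv` and `fire_reset` are hypothetical, and the index of the fire action (1 here) is an assumption; real environments expose their own action meanings.

```python
class FireStubEnv:
    """Hypothetical stand-in: the game only starts once FIRE (assumed action 1) is taken."""

    FIRE = 1

    def __init__(self):
        self.started = False

    def reset(self):
        self.started = False
        return "idle"

    def step(self, action):
        if action == self.FIRE:
            self.started = True
        state = "playing" if self.started else "idle"
        return state, 0.0, False, {}


def fire_reset(env, fire_action=1):
    """Reset `env`, then press fire once and return the resulting state."""
    env.reset()
    state, _, _, _ = env.step(fire_action)
    return state


print(fire_reset(FireStubEnv()))  # playing
```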
class genrl.environments.atari_wrappers.NoopReset(env: gym.core.Env, max_noops: int = 30)[source]

Bases: gym.core.Wrapper

Some Atari environments always reset to the same state, so we take a random number of empty (noop) actions to introduce some stochasticity.

Parameters:
  • env (Gym Environment) – Atari environment
  • max_noops (int) – Maximum number of noops to be taken
reset() → numpy.ndarray[source]

Resets state of environment. Performs the noop action a random number of times to introduce stochasticity

Returns:Initial state
Return type:NumPy array
step(action: numpy.ndarray) → numpy.ndarray[source]

Step through underlying Atari environment for given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Current state, reward (for frameskip number of actions), done, info
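
The reset behaviour of NoopReset can be sketched as below. `StubEnv` and `noop_reset` are hypothetical; taking at least one noop and using action 0 as the noop are assumptions made for the example.

```python
import random


class StubEnv:
    """Hypothetical stand-in for an Atari environment; the state counts steps taken."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += 1
        return self.state, 0.0, False, {}


def noop_reset(env, max_noops=30, noop_action=0):
    """Reset `env`, then take between 1 and `max_noops` noop steps."""
    state = env.reset()
    for _ in range(random.randint(1, max_noops)):
        state, reward, done, info = env.step(noop_action)
        if done:
            # Start over if the episode ended during the noops
            state = env.reset()
    return state


random.seed(0)
print(1 <= noop_reset(StubEnv(), max_noops=30) <= 30)  # True
```

Because the number of noops is random, two resets generally start the agent from different states even when the underlying environment is deterministic.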

genrl.environments.base_wrapper module

class genrl.environments.base_wrapper.BaseWrapper(env: Any, batch_size: int = None)[source]

Bases: abc.ABC

Base class for all wrappers


The number of batches trained per update

close() → None[source]

Closes environment and performs any other cleanup

Must be overridden by subclasses

render() → None[source]

Render the environment

reset() → None[source]

Resets state of environment

Must be overridden by subclasses

Returns:Initial state
seed(seed: int = None) → None[source]

Set seed for environment

step(action: numpy.ndarray) → None[source]

Step through the environment

Must be overridden by subclasses
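
A sketch of how a subclass might supply the methods that must be overridden, assuming they are enforced with `abc.abstractmethod`. `WrapperSketch` and `ListEnvWrapper` are hypothetical classes and do not reproduce BaseWrapper itself.

```python
from abc import ABC, abstractmethod


class WrapperSketch(ABC):
    """Hypothetical stand-in for a base wrapper with methods subclasses must override."""

    def __init__(self, env, batch_size=None):
        self.env = env
        self.batch_size = batch_size

    @abstractmethod
    def reset(self):
        """Return the initial state."""

    @abstractmethod
    def step(self, action):
        """Advance the environment by one action."""

    @abstractmethod
    def close(self):
        """Release any resources held by the environment."""


class ListEnvWrapper(WrapperSketch):
    """Wraps a plain list of states, used here as a toy environment."""

    def reset(self):
        self.index = 0
        return self.env[self.index]

    def step(self, action):
        # Move to the next state, staying on the last one once it is reached
        self.index = min(self.index + 1, len(self.env) - 1)
        return self.env[self.index]

    def close(self):
        self.env = None


wrapper = ListEnvWrapper(["s0", "s1", "s2"])
print(wrapper.reset(), wrapper.step(action=0))  # s0 s1
```

With this design, instantiating the base class directly raises a `TypeError`, so a missing override is caught as soon as the wrapper is constructed.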

genrl.environments.frame_stack module

class genrl.environments.frame_stack.FrameStack(env: gym.core.Env, framestack: int = 4, compress: bool = True)[source]

Bases: gym.core.Wrapper

Wrapper to efficiently stack the last few (4 by default) observations of the agent

Parameters:
  • env (Gym Environment) – Environment to be wrapped
  • framestack (int) – Number of frames to be stacked
  • compress (bool) – True if we want to use LZ4 compression to conserve memory usage
reset() → numpy.ndarray[source]

Resets environment

Returns:Initial state of environment
Return type:NumPy Array
step(action: numpy.ndarray) → numpy.ndarray[source]

Steps through environment

Parameters:action (NumPy Array) – Action taken by agent
Returns:Next state, reward, done, info
Return type:NumPy Array, float, boolean, dict
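
Frame stacking can be sketched with a bounded deque. `FrameBuffer` is a hypothetical helper, and filling the buffer with the first frame on reset is an assumption about how the initial stack is built.

```python
from collections import deque


class FrameBuffer:
    """Hypothetical helper that keeps the last `framestack` observations."""

    def __init__(self, framestack=4):
        self.frames = deque(maxlen=framestack)

    def reset(self, first_frame):
        # Fill the buffer with the first frame so the stack is full from the start
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def push(self, frame):
        # Appending to a full deque silently drops the oldest frame
        self.frames.append(frame)
        return list(self.frames)


buffer = FrameBuffer(framestack=4)
print(buffer.reset("f0"))  # ['f0', 'f0', 'f0', 'f0']
print(buffer.push("f1"))   # ['f0', 'f0', 'f0', 'f1']
```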
class genrl.environments.frame_stack.LazyFrames(frames: List[T], compress: bool = False)[source]

Bases: object

Efficient data structure to save each frame only once. Can use LZ4 compression to optimize memory usage.

Parameters:
  • frames (collections.deque) – List of frames that need to be converted to a LazyFrames data structure
  • compress (boolean) – True if we want to use LZ4 compression to conserve memory usage

Returns dimensions of other object
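
The idea of storing frames compressed and decompressing them only on access can be sketched as below. `CompressedFrames` is hypothetical, and zlib from the standard library stands in for LZ4 so that the example has no third-party dependencies.

```python
import zlib


class CompressedFrames:
    """Hypothetical sketch: store each frame once, compressed, and decompress on access."""

    def __init__(self, frames, compress=False):
        self.compress = compress
        # zlib stands in for LZ4 here so the sketch needs only the standard library
        self._frames = [zlib.compress(f) if compress else f for f in frames]

    def __len__(self):
        return len(self._frames)

    def __getitem__(self, index):
        frame = self._frames[index]
        return zlib.decompress(frame) if self.compress else frame


raw = [bytes(84 * 84) for _ in range(4)]  # four blank 84x84 grayscale frames
lazy = CompressedFrames(raw, compress=True)
print(len(lazy), lazy[0] == raw[0])  # 4 True
```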

genrl.environments.gym_wrapper module

class genrl.environments.gym_wrapper.GymWrapper(env: gym.core.Env)[source]

Bases: gym.core.Wrapper

Wrapper class for all Gym Environments

Parameters:
  • env (string) – Gym environment name
  • n_envs (None, int) – Number of environments. None if not vectorised
  • parallel (boolean) – If vectorised, whether the environments should be run serially or in parallel
close() → None[source]

Closes environment

render(mode: str = 'human') → None[source]

Renders all envs in a tiled format similar to baselines.

Parameters:mode (string) – Can either be ‘human’ or ‘rgb_array’. Displays tiled images in ‘human’ and returns tiled images in ‘rgb_array’
reset() → numpy.ndarray[source]

Resets environment

Returns:Initial state
sample() → numpy.ndarray[source]

Shortcut method to directly sample from environment’s action space

Returns:Random action from action space
Return type:NumPy Array
seed(seed: int = None) → None[source]

Set environment seed

Parameters:seed (int) – Value of seed
step(action: numpy.ndarray) → numpy.ndarray[source]

Steps the env through given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Next observation, reward, game status and debugging info

genrl.environments.suite module

genrl.environments.suite.AtariEnv(env_id: str, wrapper_list: List[T] = [<class 'genrl.environments.atari_preprocessing.AtariPreprocessing'>, <class 'genrl.environments.atari_wrappers.NoopReset'>, <class 'genrl.environments.atari_wrappers.FireReset'>, <class 'genrl.environments.time_limit.AtariTimeLimit'>, <class 'genrl.environments.frame_stack.FrameStack'>]) → gym.core.Env[source]

Function used by the Trainer class to apply wrappers to all Atari envs

Parameters:
  • env_id (string) – Environment name
  • wrapper_list (list or tuple) – List of wrappers to use
Returns:Gym Atari Environment
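
Applying a list of wrappers can be sketched as a simple loop. All classes below are hypothetical stand-ins, and applying the wrappers in list order (so that the last one ends up outermost) is an assumption about how wrapper_list is consumed.

```python
class BaseEnv:
    """Hypothetical stand-in for the environment created from env_id."""

    def describe(self):
        return "env"


class Tagging:
    """Hypothetical wrapper base that records its own name around the wrapped env."""

    def __init__(self, env):
        self.env = env

    def describe(self):
        return f"{type(self).__name__}({self.env.describe()})"


class Preprocess(Tagging):
    pass


class Stack(Tagging):
    pass


def apply_wrappers(env, wrapper_list):
    """Wrap `env` with each wrapper class in turn; the last one ends up outermost."""
    for wrapper in wrapper_list:
        env = wrapper(env)
    return env


wrapped = apply_wrappers(BaseEnv(), [Preprocess, Stack])
print(wrapped.describe())  # Stack(Preprocess(env))
```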


genrl.environments.suite.GymEnv(env_id: str) → gym.core.Env[source]

Function used by the Trainer class to apply wrappers to all regular Gym envs

Parameters:env_id (string) – Environment name
Returns:Gym Environment
Return type:object
genrl.environments.suite.VectorEnv(env_id: str, n_envs: int = 2, parallel: int = False, env_type: str = 'gym') → genrl.environments.vec_env.vector_envs.VecEnv[source]

Chooses the kind of Vector Environment that is required

Parameters:
  • env_id (string) – Gym environment to be vectorised
  • n_envs (int) – Number of environments
  • parallel (boolean) – True if the environments should run in parallel (in subprocesses), False if they should run serially, one after the other
  • env_type (string) – Type of environment. Currently, “gym” and “atari” are supported
Returns:Vector Environment
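
The serial case (parallel=False) can be pictured as stepping several copies of an environment one after the other. `CounterEnv` and `SerialEnvs` are hypothetical and only illustrate the idea; they are not the VecEnv classes returned by this function.

```python
class CounterEnv:
    """Hypothetical stand-in environment whose state is the number of steps taken."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps

    def step(self, action):
        self.steps += 1
        return self.steps, float(action), False, {}


class SerialEnvs:
    """Hypothetical serial vector environment: steps each copy one after the other."""

    def __init__(self, make_env, n_envs=2):
        self.envs = [make_env() for _ in range(n_envs)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        # One action per environment; results are regrouped by field
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        states, rewards, dones, infos = (list(field) for field in zip(*results))
        return states, rewards, dones, infos


envs = SerialEnvs(CounterEnv, n_envs=2)
print(envs.reset())           # [0, 0]
print(envs.step([1, 0])[:3])  # ([1, 1], [1.0, 0.0], [False, False])
```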

genrl.environments.time_limit module

class genrl.environments.time_limit.AtariTimeLimit(env, max_episode_len=None)[source]

Bases: gym.core.Wrapper


reset()

Resets the state of the environment and returns an initial observation.

Returns:observation (object) – the initial observation.

step(action)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after the previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
class genrl.environments.time_limit.TimeLimit(env, max_episode_len=None)[source]

Bases: gym.core.Wrapper


reset()

Resets the state of the environment and returns an initial observation.

Returns:observation (object) – the initial observation.

step(action)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after the previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
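
A time-limit wrapper of this kind can be sketched as a step counter that forces done once max_episode_len is reached. `EndlessEnv` and `StepLimit` are hypothetical; how the real TimeLimit and AtariTimeLimit classes report the cut-off is not documented here, so that detail is an assumption.

```python
class EndlessEnv:
    """Hypothetical stand-in environment that never ends on its own."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


class StepLimit:
    """Hypothetical sketch: end the episode after `max_episode_len` steps."""

    def __init__(self, env, max_episode_len=None):
        self.env = env
        self.max_episode_len = max_episode_len
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self.elapsed += 1
        # Force the episode to end once the step budget is used up
        if self.max_episode_len is not None and self.elapsed >= self.max_episode_len:
            done = True
        return state, reward, done, info


env = StepLimit(EndlessEnv(), max_episode_len=3)
env.reset()
print([env.step(0)[2] for _ in range(3)])  # [False, False, True]
```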

Module contents