Environments

Submodules

genrl.environments.action_wrappers module

class genrl.environments.action_wrappers.ClipAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv])[source]

Bases: gym.core.ActionWrapper

Action Wrapper to clip actions

Parameters:env (object) – The environment whose actions need to be clipped
action(action: numpy.ndarray) → numpy.ndarray[source]
class genrl.environments.action_wrappers.RescaleAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv], low: int, high: int)[source]

Bases: gym.core.ActionWrapper

Action Wrapper to rescale actions

Parameters:
  • env (object) – The environment whose actions need to be rescaled
  • low (int) – Lower limit of action
  • high (int) – Upper limit of action
action(action: numpy.ndarray) → numpy.ndarray[source]
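
A minimal usage sketch for these wrappers, assuming a standard continuous-action Gym environment (the environment id below is only an example):

    import gym
    from genrl.environments.action_wrappers import ClipAction, RescaleAction

    # Example env id; any continuous-action Gym env should work
    env = gym.make("Pendulum-v0")

    # Clip out-of-range actions to the bounds of the env's action space
    env = ClipAction(env)

    # Rescale actions supplied in [-1, 1] to the env's action range
    env = RescaleAction(env, low=-1, high=1)

    state = env.reset()
    state, reward, done, info = env.step(env.action_space.sample())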

genrl.environments.atari_preprocessing module

class genrl.environments.atari_preprocessing.AtariPreprocessing(env: gym.core.Env, frameskip: Union[Tuple, int] = (2, 5), grayscale: bool = True, screen_size: int = 84)[source]

Bases: gym.core.Wrapper

Implementation for Image preprocessing for Gym Atari environments. Implements: 1) Frameskip 2) Grayscale 3) Downsampling to square image

Parameters:
  • env (Gym Environment) – Atari environment
  • frameskip (tuple or int) – Number of steps between actions. E.g. frameskip=4 means one action is taken for every 4 frames. If a tuple such as (2, 5) is given, the frameskip is non-deterministic and a random value is sampled from that range
  • grayscale (boolean) – Whether or not the output should be converted to grayscale
  • screen_size (int) – Size of the output screen (square output)
reset() → numpy.ndarray[source]

Resets state of environment

Returns:Initial state
Return type:NumPy array
step(action: numpy.ndarray) → numpy.ndarray[source]

Step through Atari environment for given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Current state, reward (accumulated over the skipped frames), done, info
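
A short sketch of applying this wrapper on top of a raw Atari env (the env id, and the assumption that the Atari ROMs are installed, are illustrative):

    import gym
    from genrl.environments.atari_preprocessing import AtariPreprocessing

    # Deterministic frameskip of 4, grayscale output, 84x84 frames
    env = AtariPreprocessing(
        gym.make("BreakoutNoFrameskip-v4"), frameskip=4, grayscale=True, screen_size=84
    )

    state = env.reset()                                  # preprocessed initial frame
    state, reward, done, info = env.step(env.action_space.sample())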

genrl.environments.atari_wrappers module

class genrl.environments.atari_wrappers.FireReset(env: gym.core.Env)[source]

Bases: gym.core.Wrapper

Some Atari environments do not actually start until a specific action (the fire action) is taken, so this wrapper takes that action on reset before training begins

Parameters:env (Gym Environment) – Atari environment
reset() → numpy.ndarray[source]

Resets state of environment and takes the fire action so that the episode actually starts

Returns:Initial state
Return type:NumPy array
class genrl.environments.atari_wrappers.NoopReset(env: gym.core.Env, max_noops: int = 30)[source]

Bases: gym.core.Wrapper

Some Atari environments always reset to the same state, so we perform a random number of empty (noop) actions on reset to introduce some stochasticity.

Parameters:
  • env (Gym Environment) – Atari environment
  • max_noops (int) – Maximum number of Noops to be taken
reset() → numpy.ndarray[source]

Resets state of environment. Performs the noop action a random number of times to introduce stochasticity

Returns:Initial state
Return type:NumPy array
step(action: numpy.ndarray) → numpy.ndarray[source]

Step through underlying Atari environment for given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Current state, reward, done, info
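
These wrappers are typically stacked on top of an Atari env; a minimal sketch (the env id is illustrative):

    import gym
    from genrl.environments.atari_wrappers import NoopReset, FireReset

    env = gym.make("BreakoutNoFrameskip-v4")
    env = NoopReset(env, max_noops=30)   # random number of no-ops on reset
    env = FireReset(env)                 # take the FIRE action so the game actually starts

    state = env.reset()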

genrl.environments.base_wrapper module

class genrl.environments.base_wrapper.BaseWrapper(env: Any, batch_size: int = None)[source]

Bases: abc.ABC

Base class for all wrappers

batch_size

The number of batches trained per update

close() → None[source]

Closes environment and performs any other cleanup

Must be overridden by subclasses

render() → None[source]

Render the environment

reset() → None[source]

Resets state of environment

Must be overridden by subclasses

Returns:Initial state
seed(seed: int = None) → None[source]

Set seed for environment

step(action: numpy.ndarray) → None[source]

Step through the environment

Must be overridden by subclasses
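
Since BaseWrapper is abstract, a subclass has to provide at least close(), reset() and step(). A minimal sketch, assuming the constructor signature shown above; the pass-through wrapper and the env-like object it forwards to are hypothetical:

    from typing import Any, Tuple

    import numpy as np

    from genrl.environments.base_wrapper import BaseWrapper


    class PassthroughWrapper(BaseWrapper):
        """Hypothetical wrapper that simply forwards calls to the wrapped env."""

        def __init__(self, env: Any, batch_size: int = None):
            super().__init__(env, batch_size)
            self.env = env

        def reset(self) -> np.ndarray:
            return self.env.reset()

        def step(self, action: np.ndarray) -> Tuple:
            return self.env.step(action)

        def close(self) -> None:
            self.env.close()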

genrl.environments.frame_stack module

class genrl.environments.frame_stack.FrameStack(env: gym.core.Env, framestack: int = 4, compress: bool = True)[source]

Bases: gym.core.Wrapper

Wrapper to efficiently stack the last few (4 by default) observations of the agent

Parameters:
  • env (Gym Environment) – Environment to be wrapped
  • framestack (int) – Number of frames to be stacked
  • compress (bool) – True if we want to use LZ4 compression to conserve memory usage
reset() → numpy.ndarray[source]

Resets environment

Returns:Initial state of environment
Return type:NumPy Array
step(action: numpy.ndarray) → numpy.ndarray[source]

Steps through environment

Parameters:action (NumPy Array) – Action taken by agent
Returns:Next state, reward, done, info
Return type:NumPy Array, float, boolean, dict
class genrl.environments.frame_stack.LazyFrames(frames: List[T], compress: bool = False)[source]

Bases: object

Efficient data structure to save each frame only once. Can use LZ4 compression to optimize memory usage.

Parameters:
  • frames (collections.deque) – List of frames that need to be converted to a LazyFrames data structure
  • compress (boolean) – True if we want to use LZ4 compression to conserve memory usage
shape

Returns the dimensions of the stacked frames
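
A sketch of FrameStack in use; LazyFrames is what reset()/step() may hand back when compression is on, and np.asarray is one way to materialise it (the env id and the exact output shape are assumptions):

    import gym
    import numpy as np

    from genrl.environments.atari_preprocessing import AtariPreprocessing
    from genrl.environments.frame_stack import FrameStack

    env = FrameStack(
        AtariPreprocessing(gym.make("PongNoFrameskip-v4")), framestack=4, compress=True
    )

    state = env.reset()          # stacked observation (possibly a LazyFrames object)
    frames = np.asarray(state)   # materialise into a regular NumPy array
    print(frames.shape)          # e.g. 4 stacked 84x84 frames under the defaults above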

genrl.environments.gym_wrapper module

class genrl.environments.gym_wrapper.GymWrapper(env: gym.core.Env)[source]

Bases: gym.core.Wrapper

Wrapper class for all Gym Environments

Parameters:
  • env (Gym Environment) – The Gym environment to be wrapped
  • n_envs (None, int) – Number of environments. None if not vectorised
  • parallel (boolean) – If vectorised, whether environments are run serially or in parallel
action_shape

Shape of the environment’s action space

close() → None[source]

Closes environment

obs_shape

Shape of the environment’s observation space

render(mode: str = 'human') → None[source]

Renders all envs in a tiled format similar to OpenAI Baselines.

Parameters:mode (string) – Can either be ‘human’ or ‘rgb_array’. Displays tiled images in ‘human’ and returns tiled images in ‘rgb_array’
reset() → numpy.ndarray[source]

Resets environment

Returns:Initial state
sample() → numpy.ndarray[source]

Shortcut method to directly sample from environment’s action space

Returns:Random action from action space
Return type:NumPy Array
seed(seed: int = None) → None[source]

Set environment seed

Parameters:seed (int) – Value of seed
step(action: numpy.ndarray) → numpy.ndarray[source]

Steps the env through the given action

Parameters:action (NumPy array) – Action taken by agent
Returns:Next observation, reward, game status and debugging info
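
A minimal sketch of GymWrapper around a standard Gym env (the env id is illustrative):

    import gym
    from genrl.environments.gym_wrapper import GymWrapper

    env = GymWrapper(gym.make("CartPole-v1"))
    env.seed(0)

    state = env.reset()
    action = env.sample()                             # shortcut for sampling the action space
    next_state, reward, done, info = env.step(action)

    env.render()
    env.close()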

genrl.environments.suite module

genrl.environments.suite.AtariEnv(env_id: str, wrapper_list: List[T] = [<class 'genrl.environments.atari_preprocessing.AtariPreprocessing'>, <class 'genrl.environments.atari_wrappers.NoopReset'>, <class 'genrl.environments.atari_wrappers.FireReset'>, <class 'genrl.environments.time_limit.AtariTimeLimit'>, <class 'genrl.environments.frame_stack.FrameStack'>]) → gym.core.Env[source]

Function used by the Trainer class to create and apply wrappers to Atari environments

Parameters:
  • env_id (string) – Environment Name
  • wrapper_list (list or tuple) – List of wrappers to use
Returns:Gym Atari Environment
Return type:object
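
A sketch of AtariEnv with the default wrapper stack, and with a hypothetical trimmed-down wrapper list (the env id is illustrative):

    from genrl.environments.suite import AtariEnv
    from genrl.environments.atari_preprocessing import AtariPreprocessing
    from genrl.environments.frame_stack import FrameStack

    # Default wrapper stack listed in the signature above
    env = AtariEnv("BreakoutNoFrameskip-v4")

    # Hypothetical reduced wrapper list
    env = AtariEnv("BreakoutNoFrameskip-v4", wrapper_list=[AtariPreprocessing, FrameStack])

    state = env.reset()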

genrl.environments.suite.GymEnv(env_id: str) → gym.core.Env[source]

Function used by the Trainer class to create and apply wrappers to regular Gym environments

Parameters:env_id (string) – Environment Name
Returns:Gym Environment
Return type:object
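
A short usage sketch for GymEnv (the env id is illustrative):

    from genrl.environments.suite import GymEnv

    env = GymEnv("CartPole-v1")
    state = env.reset()
    state, reward, done, info = env.step(env.sample())
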
genrl.environments.suite.VectorEnv(env_id: str, n_envs: int = 2, parallel: int = False, env_type: str = 'gym') → genrl.environments.vec_env.vector_envs.VecEnv[source]

Chooses the kind of Vector Environment that is required

Parameters:
  • env_id (string) – Gym environment to be vectorised
  • n_envs (int) – Number of environments
  • parallel (boolean) – True if we want environments to run in parallel in subprocesses, False if we want them to run serially one after the other
  • env_type (string) – Type of environment. Currently, we support [“gym”, “atari”]
Returns:Vector Environment
Return type:object
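
A minimal sketch of creating a vectorised environment (the env id is illustrative):

    from genrl.environments.suite import VectorEnv

    # Two serial copies of CartPole; parallel=True would run them in subprocesses
    envs = VectorEnv("CartPole-v1", n_envs=2, parallel=False, env_type="gym")

    states = envs.reset()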

genrl.environments.time_limit module

class genrl.environments.time_limit.AtariTimeLimit(env, max_episode_len=None)[source]

Bases: gym.core.Wrapper

reset(**kwargs)[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple
class genrl.environments.time_limit.TimeLimit(env, max_episode_len=None)[source]

Bases: gym.core.Wrapper

reset(**kwargs)[source]

Resets the state of the environment and returns an initial observation.

Returns:the initial observation.
Return type:observation (object)
step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters:action (object) – an action provided by the agent
Returns:
  • observation (object) – agent’s observation of the current environment
  • reward (float) – amount of reward returned after previous action
  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
Return type:tuple
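
A sketch of TimeLimit in use, assuming it terminates episodes once max_episode_len steps have elapsed (as the name and signature suggest); the env id is illustrative:

    import gym
    from genrl.environments.time_limit import TimeLimit

    # Cut episodes off after at most 500 steps
    env = TimeLimit(gym.make("CartPole-v1"), max_episode_len=500)

    obs = env.reset()
    done = False
    while not done:
        obs, reward, done, info = env.step(env.action_space.sample())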

Module contents