Environments¶
Subpackages¶
Submodules¶
genrl.environments.action_wrappers module¶
class genrl.environments.action_wrappers.ClipAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv])[source]¶
Bases: gym.core.ActionWrapper
Action wrapper to clip actions
Parameters: env (object) – The environment whose actions need to be clipped
class genrl.environments.action_wrappers.RescaleAction(env: Union[gym.core.Env, genrl.environments.vec_env.vector_envs.VecEnv], low: int, high: int)[source]¶
Bases: gym.core.ActionWrapper
Action wrapper to rescale actions
Parameters: - env (object) – The environment whose actions need to be rescaled
- low (int) – Lower limit of the action
- high (int) – Upper limit of the action
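The effect of these two wrappers can be sketched in plain Python. The functions below are illustrative stand-ins, not genrl's actual implementation; in particular, the sketch assumes RescaleAction maps an action from [-1, 1] to [low, high], which is the usual convention for such a wrapper.

```python
def clip_action(action, low, high):
    # Clamp each component of an action into [low, high], as a
    # ClipAction-style wrapper does before the action reaches the env.
    if isinstance(action, (list, tuple)):
        return type(action)(clip_action(a, low, high) for a in action)
    return max(low, min(high, action))


def rescale_action(action, low, high):
    # Linearly map an action from [-1, 1] to [low, high] -- the usual
    # convention for a RescaleAction-style wrapper (an assumption here).
    return low + (high - low) * (action + 1.0) / 2.0
```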
genrl.environments.atari_preprocessing module¶
class genrl.environments.atari_preprocessing.AtariPreprocessing(env: gym.core.Env, frameskip: Union[Tuple, int] = (2, 5), grayscale: bool = True, screen_size: int = 84)[source]¶
Bases: gym.core.Wrapper
Implements image preprocessing for Gym Atari environments: 1) frameskip, 2) grayscale conversion, 3) downsampling to a square image.
Parameters: - env (Gym Environment) – Atari environment
- frameskip (tuple or int) – Number of frames between actions, e.g. frameskip=4 means one action is taken for every 4 frames. If a tuple such as (2, 5) is given, the skip is non-deterministic and a random number is chosen from that range at each step
- grayscale (boolean) – Whether or not the output should be converted to grayscale
- screen_size (int) – Size of the output screen (square output)
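Two of these steps are easy to sketch in isolation. The helpers below are illustrative only; the exact sampling range for a tuple frameskip (inclusive vs. exclusive endpoint) and the grayscale coefficients used internally are assumptions, not confirmed details of AtariPreprocessing.

```python
import random


def choose_skip(frameskip):
    # Deterministic if an int; if a tuple (lo, hi), sample a random skip
    # from that range each step (the endpoint convention is an assumption).
    if isinstance(frameskip, tuple):
        lo, hi = frameskip
        return random.randrange(lo, hi)
    return frameskip


def to_grayscale(r, g, b):
    # Standard luminance conversion for a single RGB pixel; real Atari
    # preprocessing applies an equivalent transform to whole frames.
    return 0.299 * r + 0.587 * g + 0.114 * b
```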
genrl.environments.atari_wrappers module¶
class genrl.environments.atari_wrappers.FireReset(env: gym.core.Env)[source]¶
Bases: gym.core.Wrapper
Some Atari environments do nothing until a specific action (the fire action) is taken, so this wrapper takes that action on reset, before training starts
Parameters: env (Gym Environment) – Atari environment
class genrl.environments.atari_wrappers.NoopReset(env: gym.core.Env, max_noops: int = 30)[source]¶
Bases: gym.core.Wrapper
Some Atari environments always reset to the same state, so this wrapper takes a random number of no-op (empty) actions on reset to introduce stochasticity
Parameters: - env (Gym Environment) – Atari environment
- max_noops (int) – Maximum number of no-ops to be taken
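The no-op reset idea can be sketched as a minimal wrapper class. This is a hypothetical stand-in, not genrl's NoopReset: the no-op action index (0) and the exact sampling of the no-op count are assumptions.

```python
import random


class NoopResetSketch:
    # Illustrative stand-in: on reset, take a random number of no-op
    # (action 0) steps before handing control back to the agent.
    def __init__(self, env, max_noops=30):
        self.env = env
        self.max_noops = max_noops

    def reset(self):
        obs = self.env.reset()
        for _ in range(random.randint(1, self.max_noops)):
            obs, _, done, _ = self.env.step(0)
            if done:
                # If the env happens to finish during the no-ops, start over.
                obs = self.env.reset()
        return obs
```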
genrl.environments.base_wrapper module¶
class genrl.environments.base_wrapper.BaseWrapper(env: Any, batch_size: int = None)[source]¶
Bases: abc.ABC
Base class for all wrappers
batch_size¶ The number of batches trained per update
close() → None[source]¶ Closes the environment and performs any other cleanup. Must be overridden by subclasses.
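A concrete wrapper built on such a base class would override close(). The sketch below shows the pattern under stated assumptions; everything except the close/batch_size names is illustrative and not genrl's actual class.

```python
from abc import ABC, abstractmethod


class BaseWrapperSketch(ABC):
    # Illustrative stand-in for an abstract wrapper base: holds the
    # wrapped env and an optional batch_size, and forces subclasses
    # to implement close().
    def __init__(self, env, batch_size=None):
        self.env = env
        self.batch_size = batch_size

    @abstractmethod
    def close(self):
        ...


class DummyWrapper(BaseWrapperSketch):
    def close(self):
        # Real wrappers would release resources (e.g. close the env) here.
        return "closed"
```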
genrl.environments.frame_stack module¶
class genrl.environments.frame_stack.FrameStack(env: gym.core.Env, framestack: int = 4, compress: bool = True)[source]¶
Bases: gym.core.Wrapper
Wrapper to efficiently stack the last few (4 by default) observations of the agent
Parameters: - env (Gym Environment) – Environment to be wrapped
- framestack (int) – Number of frames to be stacked
- compress (bool) – True if LZ4 compression should be used to conserve memory
class genrl.environments.frame_stack.LazyFrames(frames: List[T], compress: bool = False)[source]¶
Bases: object
Efficient data structure that saves each frame only once. Can use LZ4 compression to optimize memory usage.
Parameters: - frames (collections.deque) – Frames that need to be converted to a LazyFrames data structure
- compress (boolean) – True if LZ4 compression should be used to conserve memory
shape¶ Returns the dimensions of the stored frames
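The core of frame stacking is a bounded buffer of the most recent observations. The sketch below is illustrative, not genrl's FrameStack: LZ4 compression (the compress option) is omitted, and padding by repeating the oldest frame until the buffer fills is an assumption about the initial-reset behaviour.

```python
from collections import deque


class FrameStackSketch:
    # Illustrative stand-in: keep only the last `framestack` observations,
    # oldest first, so the agent always sees a fixed-size history.
    def __init__(self, framestack=4):
        self.frames = deque(maxlen=framestack)

    def observe(self, frame):
        self.frames.append(frame)
        # Pad by repeating the oldest frame until the buffer is full, so
        # the stacked observation always has `framestack` entries.
        while len(self.frames) < self.frames.maxlen:
            self.frames.appendleft(self.frames[0])
        return tuple(self.frames)
```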
genrl.environments.gym_wrapper module¶
class genrl.environments.gym_wrapper.GymWrapper(env: gym.core.Env)[source]¶
Bases: gym.core.Wrapper
Wrapper class for all Gym environments
Parameters: - env (string) – Gym environment name
- n_envs (None, int) – Number of environments. None if not vectorised
- parallel (boolean) – If vectorised, whether environments run serially or in parallel
action_shape¶
obs_shape¶
render(mode: str = 'human') → None[source]¶ Renders all envs in a tiled format similar to baselines.
Parameters: mode (string) – Can either be ‘human’ or ‘rgb_array’. Displays tiled images in ‘human’ mode and returns tiled images in ‘rgb_array’ mode
genrl.environments.suite module¶
genrl.environments.suite.AtariEnv(env_id: str, wrapper_list: List[T] = [<class 'genrl.environments.atari_preprocessing.AtariPreprocessing'>, <class 'genrl.environments.atari_wrappers.NoopReset'>, <class 'genrl.environments.atari_wrappers.FireReset'>, <class 'genrl.environments.time_limit.AtariTimeLimit'>, <class 'genrl.environments.frame_stack.FrameStack'>]) → gym.core.Env[source]¶ Function to apply wrappers to all Atari envs used by the Trainer class
Parameters: - env_id (string) – Environment name
- wrapper_list (list or tuple) – List of wrappers to use
Returns: Gym Atari Environment
Return type: object
genrl.environments.suite.GymEnv(env_id: str) → gym.core.Env[source]¶ Function to apply wrappers to all regular Gym envs used by the Trainer class
Parameters: env_id (string) – Environment name
Returns: Gym Environment
Return type: object
genrl.environments.suite.VectorEnv(env_id: str, n_envs: int = 2, parallel: int = False, env_type: str = 'gym') → genrl.environments.vec_env.vector_envs.VecEnv[source]¶ Chooses the kind of vector environment that is required
Parameters: - env_id (string) – Gym environment to be vectorised
- n_envs (int) – Number of environments
- parallel (bool) – True if environments should run in parallel (in subprocesses), False if they should run serially one after the other
- env_type (string) – Type of environment. Currently, we support [“gym”, “atari”]
Returns: Vector Environment
Return type: object
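The dispatch VectorEnv performs can be sketched as a small pure function. The returned names are illustrative labels for "serial" versus "subprocess-based" vectorised environments, not necessarily genrl's actual class names.

```python
def pick_vec_env(env_type, parallel):
    # Mirror VectorEnv's choice: validate env_type, then pick a serial
    # or subprocess-based vectorised environment. The returned strings
    # are labels for illustration only.
    if env_type not in ("gym", "atari"):
        raise ValueError("env_type must be one of ['gym', 'atari']")
    return "SubProcessVecEnv" if parallel else "SerialVecEnv"
```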
genrl.environments.time_limit module¶
class genrl.environments.time_limit.AtariTimeLimit(env, max_episode_len=None)[source]¶
Bases: gym.core.Wrapper
reset(**kwargs)[source]¶ Resets the state of the environment and returns an initial observation.
Returns: the initial observation
Return type: observation (object)
step(action)[source]¶ Run one timestep of the environment’s dynamics. When the end of the episode is reached, you are responsible for calling reset() to reset the environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
Parameters: action (object) – an action provided by the agent
Returns: - observation (object) – agent’s observation of the current environment
- reward (float) – amount of reward returned after the previous action
- done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
- info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
class
genrl.environments.time_limit.
TimeLimit
(env, max_episode_len=None)[source]¶ Bases:
gym.core.Wrapper
-
reset
(**kwargs)[source]¶ Resets the state of the environment and returns an initial observation.
Returns: the initial observation. Return type: observation (object)
-
step
(action)[source]¶ Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
Parameters: action (object) – an action provided by the agent Returns: agent’s observation of the current environment reward (float) : amount of reward returned after previous action done (bool): whether the episode has ended, in which case further step() calls will return undefined results info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning) Return type: observation (object)
-
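A time-limit wrapper is a small step counter layered over step() and reset(). The sketch below shows the idea under stated assumptions (it is not genrl's TimeLimit; whether the episode is cut off at exactly max_episode_len steps, and whether info is annotated when that happens, are assumptions).

```python
class TimeLimitSketch:
    # Illustrative stand-in: force done=True once max_episode_len
    # steps have elapsed since the last reset.
    def __init__(self, env, max_episode_len=None):
        self.env = env
        self.max_episode_len = max_episode_len
        self._elapsed = 0

    def reset(self, **kwargs):
        self._elapsed = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self.max_episode_len is not None and self._elapsed >= self.max_episode_len:
            done = True
        return obs, reward, done, info
```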