Vectorized Environments

Submodules

genrl.environments.vec_env.monitor module

class genrl.environments.vec_env.monitor.VecMonitor(venv: genrl.environments.vec_env.vector_envs.VecEnv, history_length: int = 0, info_keys: Tuple = ())[source]

Bases: genrl.environments.vec_env.wrappers.VecEnvWrapper

Monitor class for VecEnvs. Saves important variables into the info dictionary

Parameters:
  • venv (object) – Vectorized Environment
  • history_length (int) – Length of history for episode rewards and episode lengths
  • info_keys (tuple or list) – Important variables to save
reset() → numpy.ndarray[source]

Resets Vectorized Environment

Returns:Initial observations
Return type:Numpy Array
step(actions: numpy.ndarray) → Tuple[source]

Steps through all the environments and records important information

Parameters:actions (Numpy Array) – Actions to be taken for the Vectorized Environment
Returns:States, rewards, dones, infos
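
The monitoring pattern above can be sketched without genrl's classes. The `Monitor` and `StubVecEnv` names below are hypothetical stand-ins, not the library's API; the sketch assumes the common convention of accumulating per-env reward and length and writing them into the `info` dictionary when an episode ends:

```python
from collections import deque

class StubVecEnv:
    """Toy stand-in for a VecEnv: two envs whose episodes last 3 steps."""
    def __init__(self, n_envs=2, episode_len=3):
        self.n_envs = n_envs
        self.episode_len = episode_len
        self._t = [0] * n_envs

    def reset(self):
        self._t = [0] * self.n_envs
        return [0.0] * self.n_envs

    def step(self, actions):
        rewards, dones = [], []
        for i in range(self.n_envs):
            self._t[i] += 1
            rewards.append(1.0)
            done = self._t[i] >= self.episode_len
            if done:
                self._t[i] = 0
            dones.append(done)
        return [0.0] * self.n_envs, rewards, dones, [{} for _ in range(self.n_envs)]

class Monitor:
    """Tracks per-env episode rewards/lengths; keeps a bounded history."""
    def __init__(self, venv, history_length=10):
        self.venv = venv
        self.episode_rewards = deque(maxlen=history_length)
        self.episode_lengths = deque(maxlen=history_length)
        self._rew = [0.0] * venv.n_envs
        self._len = [0] * venv.n_envs

    def reset(self):
        self._rew = [0.0] * self.venv.n_envs
        self._len = [0] * self.venv.n_envs
        return self.venv.reset()

    def step(self, actions):
        obs, rewards, dones, infos = self.venv.step(actions)
        for i, (r, d) in enumerate(zip(rewards, dones)):
            self._rew[i] += r
            self._len[i] += 1
            if d:
                # Episode finished: record stats in info and in the history.
                infos[i]["episode_reward"] = self._rew[i]
                infos[i]["episode_length"] = self._len[i]
                self.episode_rewards.append(self._rew[i])
                self.episode_lengths.append(self._len[i])
                self._rew[i], self._len[i] = 0.0, 0
        return obs, rewards, dones, infos
```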

genrl.environments.vec_env.normalize module

class genrl.environments.vec_env.normalize.VecNormalize(venv: genrl.environments.vec_env.vector_envs.VecEnv, norm_obs: bool = True, norm_reward: bool = True, clip_reward: float = 20.0)[source]

Bases: genrl.environments.vec_env.wrappers.VecEnvWrapper

Wrapper that normalizes observations and rewards for VecEnvs

Parameters:
  • venv (Vectorized Environment) – The Vectorized environment
  • norm_obs (bool) – True if observations should be normalized, else False
  • norm_reward (bool) – True if rewards should be normalized, else False
  • clip_reward (float) – Maximum absolute value for rewards
close()[source]

Closes all individual environments in the Vectorized Environment

reset() → numpy.ndarray[source]

Resets Vectorized Environment

Returns:Initial observations
Return type:Numpy Array
step(actions: numpy.ndarray) → Tuple[source]

Steps through all the environments and normalizes the observations and rewards (if enabled)

Parameters:actions (Numpy Array) – Actions to be taken for the Vectorized Environment
Returns:States, rewards, dones, infos
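
A minimal sketch of the normalization step, assuming the usual scheme: standardize observations with running statistics, and scale rewards by their running standard deviation (without subtracting the mean) before clipping. `NormalizeWrapper`, `RunningStat`, and `StubVecEnv` are illustrative names, not genrl's API:

```python
import numpy as np

class RunningStat:
    """Minimal running mean/variance (parallel-variance update)."""
    def __init__(self, shape=()):
        self.mean, self.var, self.count = np.zeros(shape), np.ones(shape), 1e-4

    def update(self, x):
        b_mean, b_var, n = x.mean(0), x.var(0), x.shape[0]
        delta, tot = b_mean - self.mean, self.count + n
        self.mean = self.mean + delta * n / tot
        self.var = (self.var * self.count + b_var * n
                    + delta ** 2 * self.count * n / tot) / tot
        self.count = tot

class NormalizeWrapper:
    """Wraps a venv; standardizes observations and scales/clips rewards."""
    def __init__(self, venv, norm_obs=True, norm_reward=True, clip_reward=20.0):
        self.venv = venv
        self.norm_obs, self.norm_reward = norm_obs, norm_reward
        self.clip_reward = clip_reward
        self.obs_rms = None  # created lazily from the first observation shape
        self.ret_rms = RunningStat(())

    def step(self, actions):
        obs, rewards, dones, infos = self.venv.step(actions)
        obs = np.asarray(obs, dtype=float)
        rewards = np.asarray(rewards, dtype=float)
        if self.norm_obs:
            if self.obs_rms is None:
                self.obs_rms = RunningStat(obs.shape[1:])
            self.obs_rms.update(obs)
            obs = (obs - self.obs_rms.mean) / np.sqrt(self.obs_rms.var + 1e-8)
        if self.norm_reward:
            self.ret_rms.update(rewards)
            rewards = np.clip(rewards / np.sqrt(self.ret_rms.var + 1e-8),
                              -self.clip_reward, self.clip_reward)
        return obs, rewards, dones, infos

class StubVecEnv:
    """Two toy envs emitting constant observations and rewards."""
    n_envs = 2
    def step(self, actions):
        return [[1.0], [3.0]], [1.0, 2.0], [False, False], [{}, {}]
```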

genrl.environments.vec_env.utils module

class genrl.environments.vec_env.utils.RunningMeanStd(epsilon: float = 0.0001, shape: Tuple = ())[source]

Bases: object

Utility class that maintains a running mean and variance over batches of data

Parameters:
  • epsilon (float) – Small number to prevent division by zero for calculations
  • shape (Tuple) – Shape of the RMS object
update(batch: torch.Tensor)[source]

Updates the running mean and variance with a new batch of samples
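
Running statistics can be combined batch-by-batch with the parallel (Chan et al.) update rule. The sketch below uses NumPy rather than torch for self-containment, and is an assumption about the update scheme, not genrl's exact code:

```python
import numpy as np

class RunningMeanStd:
    """Running mean/variance via the parallel-variance update rule,
    the scheme commonly used for observation normalization in RL."""
    def __init__(self, epsilon=1e-4, shape=()):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = epsilon  # epsilon avoids division by zero before any data

    def update(self, batch):
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        # Combine the two (mean, var, count) summaries exactly.
        new_mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        m2 = m_a + m_b + delta ** 2 * self.count * batch_count / total
        self.mean, self.var, self.count = new_mean, m2 / total, total
```

Because the combination is exact, feeding the data in chunks recovers (up to the epsilon prior) the same mean and variance as computing them over the full dataset at once.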

genrl.environments.vec_env.vector_envs module

class genrl.environments.vec_env.vector_envs.SerialVecEnv(*args, **kwargs)[source]

Bases: genrl.environments.vec_env.vector_envs.VecEnv

Constructs a wrapper for serial execution of multiple environments.

close()[source]

Closes all envs

get_spaces()[source]
images() → List[T][source]

Returns a list of rendered frames, one from each environment

render(mode='human')[source]

Renders all envs in a tiled format, similar to OpenAI Baselines

Parameters:mode (string) – Either ‘human’ or ‘rgb_array’. Displays the tiled image in ‘human’ mode and returns it as an array in ‘rgb_array’ mode
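
Tiling per-env frames into one near-square grid is straightforward with array reshaping. `tile_images` below is a hypothetical helper illustrating the idea, not the library's implementation:

```python
import numpy as np

def tile_images(images):
    """Tile a list of equally-sized HxWxC frames into one near-square grid,
    padding with blank frames so the grid is full."""
    n = len(images)
    h, w, c = images[0].shape
    cols = int(np.ceil(np.sqrt(n)))
    rows = int(np.ceil(n / cols))
    blank = np.zeros((h, w, c), dtype=images[0].dtype)
    frames = list(images) + [blank] * (rows * cols - n)
    grid = np.array(frames).reshape(rows, cols, h, w, c)
    # (rows, cols, h, w, c) -> (rows, h, cols, w, c) -> (rows*h, cols*w, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)
```
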
reset() → torch.Tensor[source]

Resets all envs

reset_single_env(i: int) → torch.Tensor[source]

Resets a single environment

Parameters:i (int) – Index of the environment to reset

step(actions: torch.Tensor) → Tuple[source]

Steps through all envs serially

Parameters:actions (Iterable of ints/floats) – Actions from the model
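Serial stepping reduces to a loop over the wrapped environments. The `SerialVec` and `CountingEnv` names below are illustrative, not genrl's API; the sketch also assumes the common VecEnv convention of auto-resetting an environment as soon as its episode ends:

```python
class CountingEnv:
    """Toy Gym-like env: observation counts steps; episode ends at step 3."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return self.t, 1.0, done, {}

class SerialVec:
    """Steps a list of envs one after another in the current process,
    auto-resetting any env whose episode has finished."""
    def __init__(self, envs):
        self.envs = envs
    def reset(self):
        return [env.reset() for env in self.envs]
    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d, info = env.step(action)
            if d:
                o = env.reset()  # assumed auto-reset convention on done
            obs.append(o)
            rewards.append(r)
            dones.append(d)
            infos.append(info)
        return obs, rewards, dones, infos
```
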
class genrl.environments.vec_env.vector_envs.SubProcessVecEnv(*args, **kwargs)[source]

Bases: genrl.environments.vec_env.vector_envs.VecEnv

Constructs a wrapper for parallel execution of multiple environments, with each environment in its own subprocess.

close()[source]

Closes all environments and processes

get_spaces() → Tuple[source]

Returns state and action spaces of environments

reset() → torch.Tensor[source]

Resets environments

Returns:States after environment reset
seed(seed: int = None)[source]

Sets seed for reproducibility

step(actions: torch.Tensor) → Tuple[source]

Steps through all environments in parallel

Parameters:actions (Iterable of ints/floats) – Actions from the model
class genrl.environments.vec_env.vector_envs.VecEnv(envs: List[T], n_envs: int = 2)[source]

Bases: abc.ABC

Base class for multiple environments.

Parameters:
  • envs (list of Gym Environments) – Gym environments to be vectorized
  • n_envs (int) – Number of environments
action_shape
action_spaces
close()[source]
n_envs
obs_shape
observation_spaces
reset()[source]
sample() → List[T][source]

Return samples of actions from each environment

seed(seed: int)[source]

Set seed for reproducibility in all environments

step(actions)[source]
genrl.environments.vec_env.vector_envs.worker(parent_conn: multiprocessing.context.BaseContext.Pipe, child_conn: multiprocessing.context.BaseContext.Pipe, env: gym.core.Env)[source]

Worker function to facilitate multiprocessing

Parameters:
  • parent_conn (Multiprocessing Pipe Connection) – Parent connection of Pipe
  • child_conn (Multiprocessing Pipe Connection) – Child connection of Pipe
  • env (Gym Environment) – Gym environment we need multiprocessing for
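
The worker pattern can be sketched as a command loop over a Pipe connection. `ToyEnv` and the exact message format are assumptions for illustration; in the real class each worker runs in its own `multiprocessing.Process`, though a `threading.Thread` over the same Pipe suffices to demonstrate the protocol:

```python
import multiprocessing as mp
import threading

class ToyEnv:
    """Minimal Gym-like env for the demo."""
    def reset(self):
        return 0
    def step(self, action):
        return action + 1, 1.0, False, {}

def worker(conn, env):
    """Service loop: receive (command, data) messages, reply with results."""
    while True:
        cmd, data = conn.recv()
        if cmd == "reset":
            conn.send(env.reset())
        elif cmd == "step":
            conn.send(env.step(data))
        elif cmd == "close":
            conn.close()
            break

if __name__ == "__main__":
    # Demonstrate the protocol with a thread standing in for a subprocess.
    parent, child = mp.Pipe()
    t = threading.Thread(target=worker, args=(child, ToyEnv()))
    t.start()
    parent.send(("reset", None))
    print(parent.recv())   # initial observation
    parent.send(("step", 4))
    print(parent.recv())   # (obs, reward, done, info) tuple
    parent.send(("close", None))
    t.join()
```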

genrl.environments.vec_env.wrappers module

class genrl.environments.vec_env.wrappers.VecEnvWrapper(venv)[source]

Bases: genrl.environments.vec_env.vector_envs.VecEnv

close()[source]
render(mode='human')[source]
reset()[source]
step(actions)[source]
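
VecEnvWrapper's role is pure delegation: forward every call to the wrapped venv so that subclasses (such as VecMonitor and VecNormalize above) override only the methods they change. A minimal sketch of the pattern, with a hypothetical `DoubleRewardWrapper` subclass and `StubVecEnv`, neither of which is genrl API:

```python
class StubVecEnv:
    """Toy venv used only to exercise the wrapper."""
    def reset(self):
        return [0]
    def step(self, actions):
        return [1], [1.0], [False], [{}]
    def close(self):
        pass
    def render(self, mode="human"):
        return None

class Wrapper:
    """Base wrapper: forwards everything to the wrapped venv."""
    def __init__(self, venv):
        self.venv = venv
    def reset(self):
        return self.venv.reset()
    def step(self, actions):
        return self.venv.step(actions)
    def close(self):
        return self.venv.close()
    def render(self, mode="human"):
        return self.venv.render(mode=mode)

class DoubleRewardWrapper(Wrapper):
    """Example subclass: overrides step() only, inherits the rest."""
    def step(self, actions):
        obs, rewards, dones, infos = self.venv.step(actions)
        return obs, [2 * r for r in rewards], dones, infos
```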

Module contents