Q-Learning
genrl.agents.classical.qlearning.qlearning module
class genrl.agents.classical.qlearning.qlearning.QLearning(env: gym.core.Env, epsilon: float = 0.9, gamma: float = 0.95, lr: float = 0.01)
Bases: object
Q-Learning Algorithm.
Paper: https://link.springer.com/article/10.1007/BF00992698
env
Environment with which the agent interacts.
Type: gym.Env
epsilon
Exploration coefficient for epsilon-greedy exploration.
Type: float, optional
gamma
Discount factor.
Type: float, optional
lr
Learning rate (step size) for the Q-value update.
Type: float, optional
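To illustrate how gamma and lr typically enter tabular Q-learning, here is a minimal sketch of the standard update rule from the paper cited above. It is not taken from the genrl source: the function name `q_learning_update`, the `q_table` array layout (states by actions), and the handling of the `done` flag are all assumptions made for illustration.

```python
import numpy as np


def q_learning_update(q_table, state, action, reward, next_state, done,
                      gamma=0.95, lr=0.01):
    """Apply one tabular Q-learning update in place and return the TD error.

    Hypothetical helper, not part of genrl. q_table is assumed to be a
    2-D array indexed as q_table[state, action].
    """
    # Bootstrapped target: reward plus the discounted value of the best
    # next action. Terminal transitions are assumed to have no future value.
    best_next = 0.0 if done else np.max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state, action]
    # Move the current estimate a fraction lr of the way toward the target.
    q_table[state, action] += lr * td_error
    return td_error


# Tiny example: 3 states, 2 actions, one observed transition.
q = np.zeros((3, 2))
q[1] = [0.5, 1.0]
td = q_learning_update(q, state=0, action=1, reward=1.0, next_state=1, done=False)
```

With the default values documented above, a larger gamma weights future reward more heavily, while lr controls how far each update moves the stored estimate.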
get_action(state: numpy.ndarray, explore: bool = True) → numpy.ndarray
Epsilon-greedy selection of an action during the exploration phase.
Parameters:
- state (np.ndarray) – Environment state.
- explore (bool, optional) – True if exploration is required, False otherwise.
Returns: action.
Return type: np.ndarray
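As a rough illustration of epsilon-greedy selection, the sketch below picks a random action with probability epsilon and the greedy action otherwise. This is the common textbook convention and is an assumption here: the text above does not say whether genrl treats epsilon as the probability of exploring or of exploiting. The function `epsilon_greedy`, the `q_table` layout, and the `rng` argument are hypothetical, and the sketch returns a plain int rather than the np.ndarray documented above.

```python
import numpy as np


def epsilon_greedy(q_table, state, epsilon=0.9, explore=True, rng=None):
    """Return an action index for `state` using epsilon-greedy selection.

    Hypothetical helper, not part of genrl. q_table is assumed to be a
    2-D array indexed as q_table[state, action].
    """
    rng = np.random.default_rng() if rng is None else rng
    n_actions = q_table.shape[1]
    # Assumed convention: explore (uniform random action) with probability epsilon.
    if explore and rng.random() < epsilon:
        return int(rng.integers(n_actions))
    # Otherwise exploit: choose the action with the highest estimated value.
    return int(np.argmax(q_table[state]))


q = np.array([[0.1, 0.7, 0.2]])
greedy_action = epsilon_greedy(q, state=0, explore=False)
```

Passing explore=False bypasses the random branch entirely, which is the usual choice when evaluating a trained agent.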