coach/docs_raw/docs/contributing/add_agent.md at 42a9ec132d39d0b381a8dbdcc1c1be4caf3a90a8


mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00


itaicaspi-intel 5d5562bf62 moving the docs to github

2018-04-23 09:14:20 +03:00

2.4 KiB


Coach's modularity makes adding an agent a simple and clean task that involves the following steps:

Implement your algorithm in a new file under the agents directory. The agent can inherit from base classes such as ValueOptimizationAgent or ActorCriticAgent, or from the more generic Agent base class.

ValueOptimizationAgent, PolicyOptimizationAgent and Agent are abstract classes. learn_from_batch() should be overridden with the desired behavior for the algorithm being implemented. If you inherit directly from Agent, choose_action() should be overridden as well.

  def learn_from_batch(self, batch):
      """
      Given a batch of transitions, calculate their target values and update the network.
      :param batch: a list of transitions
      :return: the training loss
      """
      pass

  def choose_action(self, curr_state, phase=RunPhase.TRAIN):
      """
      Choose an action to act with in the current episode being played. Different behavior
      might be exhibited when training or testing.

      :param curr_state: the current state to act upon
      :param phase: the current phase: training or testing
      :return: the chosen action, and some action value describing it (q-value, probability, etc.)
      """
      pass

Make sure to add your new agent to agents/__init__.py.
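The two overrides described above can be sketched as follows. This is a minimal, self-contained illustration and not Coach's actual code: `Agent` and `RunPhase` are simplified stand-ins defined inline so the example runs on its own, and `MeanRewardAgent`, its constructor arguments, and the dictionary-based transitions are hypothetical.

```python
import random
from enum import Enum


class RunPhase(Enum):
    # Stand-in for Coach's RunPhase; only the two phases used below.
    TRAIN = "train"
    TEST = "test"


class Agent:
    # Simplified stand-in for Coach's generic Agent base class.
    def learn_from_batch(self, batch):
        raise NotImplementedError

    def choose_action(self, curr_state, phase=RunPhase.TRAIN):
        raise NotImplementedError


class MeanRewardAgent(Agent):
    """Hypothetical agent that tracks a running mean reward per action."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.epsilon = epsilon
        self.values = [0.0] * num_actions
        self.counts = [0] * num_actions
        self.rng = random.Random(seed)

    def learn_from_batch(self, batch):
        # Assumption: each transition is a dict with 'action' and 'reward'.
        # Returns the mean squared error measured before each update.
        total_loss = 0.0
        for transition in batch:
            action, reward = transition["action"], transition["reward"]
            error = reward - self.values[action]
            self.counts[action] += 1
            self.values[action] += error / self.counts[action]
            total_loss += error ** 2
        return total_loss / max(len(batch), 1)

    def choose_action(self, curr_state, phase=RunPhase.TRAIN):
        # Explore only while training; act greedily when testing.
        if phase == RunPhase.TRAIN and self.rng.random() < self.epsilon:
            action = self.rng.randrange(self.num_actions)
        else:
            action = max(range(self.num_actions), key=lambda a: self.values[a])
        return action, self.values[action]
```

In Coach itself, a class like this would live in its own file under the agents directory, inherit from the real base class, and be imported from agents/__init__.py.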

If your agent needs a specific network head, implement it in the heads module of the framework of your choice, for example architectures/neon_components/heads.py. The head should inherit from the generic base class Head. Add a new output type to configurations.py, and define the mapping between the new head and its output type in the get_output_head() function in architectures/neon_components/general_network.py.

Define a new configuration class in configurations.py that sets the type field to the new agent's name, lists the new output type in the output_types field, and assigns default values to the hyperparameters.

(Optional) Define a preset that pairs the new agent type with a given environment and specifies the hyperparameters to use when training on that environment.
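The remaining steps could look roughly like the sketch below. Everything in it is an assumption made for illustration: `OutputTypes`, `Head`, `AgentParameters`, and this `get_output_head()` are simplified stand-ins rather than Coach's real definitions or signatures, and the `MeanReward*` names, the hyperparameter values, and the choice of `CartPole-v0` as the environment are hypothetical.

```python
from enum import Enum


class OutputTypes(Enum):
    # Stand-in for the output types defined in configurations.py.
    Q = 1
    MeanReward = 2  # hypothetical new output type


class Head:
    # Stand-in for the generic Head base class.
    def __init__(self, num_actions):
        self.num_actions = num_actions


class QHead(Head):
    pass


class MeanRewardHead(Head):
    # Hypothetical head for the new agent.
    pass


def get_output_head(output_type, num_actions):
    # Maps an output type to its head class. This mirrors the role of
    # get_output_head() described above, not its real signature.
    heads = {
        OutputTypes.Q: QHead,
        OutputTypes.MeanReward: MeanRewardHead,
    }
    return heads[output_type](num_actions)


class AgentParameters:
    # Stand-in base for agent configuration classes.
    type = "Agent"
    output_types = []
    learning_rate = 0.00025
    discount = 0.99


class MeanRewardParameters(AgentParameters):
    # Hypothetical configuration: agent name, output type, default hyperparameters.
    type = "MeanRewardAgent"
    output_types = [OutputTypes.MeanReward]
    learning_rate = 0.001


class MeanRewardCartPolePreset:
    # Hypothetical preset: the new agent type on one environment, with a
    # hyperparameter overridden for that environment.
    def __init__(self):
        self.agent = MeanRewardParameters()
        self.env_level = "CartPole-v0"
        self.agent.learning_rate = 0.0005
```

Keeping the defaults on the configuration class and overriding them per preset means a change made for one environment does not leak into the agent's defaults.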
