Add documentation on distributed Coach. (#158)
* Added documentation on distributed Coach.
committed by Gal Novik
parent e3ecf445e2, commit d06197f663
@@ -85,6 +85,7 @@
 <p class="caption"><span class="caption-text">Intro</span></p>
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="usage.html">Usage</a></li>
+<li class="toctree-l1"><a class="reference internal" href="dist_usage.html">Usage - Distributed Coach</a></li>
 <li class="toctree-l1"><a class="reference internal" href="features/index.html">Features</a></li>
 <li class="toctree-l1"><a class="reference internal" href="selecting_an_algorithm.html">Selecting an Algorithm</a></li>
 <li class="toctree-l1"><a class="reference internal" href="dashboard.html">Coach Dashboard</a></li>
@@ -93,6 +94,7 @@
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="design/control_flow.html">Control Flow</a></li>
 <li class="toctree-l1"><a class="reference internal" href="design/network.html">Network Design</a></li>
+<li class="toctree-l1"><a class="reference internal" href="design/horizontal_scaling.html">Distributed Coach - Horizontal Scale-Out</a></li>
 </ul>
 <p class="caption"><span class="caption-text">Contributing</span></p>
 <ul>
@@ -103,10 +105,13 @@
 <ul>
 <li class="toctree-l1"><a class="reference internal" href="components/agents/index.html">Agents</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/architectures/index.html">Architectures</a></li>
+<li class="toctree-l1"><a class="reference internal" href="components/data_stores/index.html">Data Stores</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/environments/index.html">Environments</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/exploration_policies/index.html">Exploration Policies</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/filters/index.html">Filters</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/memories/index.html">Memories</a></li>
+<li class="toctree-l1"><a class="reference internal" href="components/memory_backends/index.html">Memory Backends</a></li>
+<li class="toctree-l1"><a class="reference internal" href="components/orchestrators/index.html">Orchestrators</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/core_types.html">Core Types</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/spaces.html">Spaces</a></li>
 <li class="toctree-l1"><a class="reference internal" href="components/additional_parameters.html">Additional Parameters</a></li>
@@ -237,6 +242,23 @@ training or testing.</p>
 </table>
 </dd></dl>
 
+<dl class="method">
+<dt id="rl_coach.agents.dqn_agent.DQNAgent.collect_savers">
+<code class="descname">collect_savers</code><span class="sig-paren">(</span><em>parent_path_suffix: str</em><span class="sig-paren">)</span> → rl_coach.saver.SaverCollection<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.collect_savers" title="Permalink to this definition">¶</a></dt>
+<dd><p>Collect all of agent’s network savers
+:param parent_path_suffix: path suffix of the parent of the agent</p>
+<blockquote>
+<div>(could be name of level manager or composite agent)</div></blockquote>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">collection of all agent savers</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
 <dl class="method">
 <dt id="rl_coach.agents.dqn_agent.DQNAgent.create_networks">
 <code class="descname">create_networks</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → Dict[str, rl_coach.architectures.network_wrapper.NetworkWrapper]<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.create_networks" title="Permalink to this definition">¶</a></dt>
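The hunk above documents the new DQNAgent.collect_savers method (it returns a rl_coach.saver.SaverCollection keyed by a parent path suffix). As a minimal, hypothetical usage sketch, not taken from the commit itself: the gather_savers helper and the "level_0" suffix below are illustrative placeholders, and the call assumes an already-constructed agent.

    from rl_coach.agents.dqn_agent import DQNAgent
    from rl_coach.saver import SaverCollection

    def gather_savers(agent: DQNAgent, parent_path_suffix: str = "level_0") -> SaverCollection:
        # collect_savers walks the agent's networks and returns their savers as one
        # SaverCollection; the suffix names the parent (level manager or composite agent).
        return agent.collect_savers(parent_path_suffix=parent_path_suffix)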
@@ -253,6 +275,26 @@ for creating the network.</p>
 </table>
 </dd></dl>
 
+<dl class="method">
+<dt id="rl_coach.agents.dqn_agent.DQNAgent.emulate_act_on_trainer">
+<code class="descname">emulate_act_on_trainer</code><span class="sig-paren">(</span><em>transition: rl_coach.core_types.Transition</em><span class="sig-paren">)</span> → rl_coach.core_types.ActionInfo<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.emulate_act_on_trainer" title="Permalink to this definition">¶</a></dt>
+<dd><p>This emulates the act using the transition obtained from the rollout worker on the training worker
+in case of distributed training.
+Given the agents current knowledge, decide on the next action to apply to the environment
+:return: an action and a dictionary containing any additional info from the action decision process</p>
+</dd></dl>
+
+<dl class="method">
+<dt id="rl_coach.agents.dqn_agent.DQNAgent.emulate_observe_on_trainer">
+<code class="descname">emulate_observe_on_trainer</code><span class="sig-paren">(</span><em>transition: rl_coach.core_types.Transition</em><span class="sig-paren">)</span> → bool<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.emulate_observe_on_trainer" title="Permalink to this definition">¶</a></dt>
+<dd><p>This emulates the observe using the transition obtained from the rollout worker on the training worker
+in case of distributed training.
+Given a response from the environment, distill the observation from it and store it for later use.
+The response should be a dictionary containing the performed action, the new observation and measurements,
+the reward, a game over flag and any additional information necessary.
+:return:</p>
+</dd></dl>
+
 <dl class="method">
 <dt id="rl_coach.agents.dqn_agent.DQNAgent.get_predictions">
 <code class="descname">get_predictions</code><span class="sig-paren">(</span><em>states: List[Dict[str, numpy.ndarray]], prediction_type: rl_coach.core_types.PredictionType</em><span class="sig-paren">)</span><a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.get_predictions" title="Permalink to this definition">¶</a></dt>
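The two emulate_* methods added above are what a training worker uses to replay transitions produced by rollout workers in distributed Coach. A rough sketch under stated assumptions: the loop, its name, and the act-then-observe ordering are illustrative and not prescribed by the commit; only the method signatures come from the documentation.

    from typing import Iterable
    from rl_coach.agents.dqn_agent import DQNAgent
    from rl_coach.core_types import Transition

    def replay_rollout_transitions(agent: DQNAgent, transitions: Iterable[Transition]) -> None:
        for transition in transitions:
            # Re-create the action decision the rollout worker took for this transition...
            agent.emulate_act_on_trainer(transition)
            # ...then distill and store the resulting observation so the trainer can learn from it.
            agent.emulate_observe_on_trainer(transition)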
@@ -492,6 +534,22 @@ by val, and by the current phase set in self.phase.</p>
 </table>
 </dd></dl>
 
+<dl class="method">
+<dt id="rl_coach.agents.dqn_agent.DQNAgent.restore_checkpoint">
+<code class="descname">restore_checkpoint</code><span class="sig-paren">(</span><em>checkpoint_dir: str</em><span class="sig-paren">)</span> → None<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.restore_checkpoint" title="Permalink to this definition">¶</a></dt>
+<dd><p>Allows agents to store additional information when saving checkpoints.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>checkpoint_dir</strong> – The checkpoint dir to restore from</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body">None</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
 <dl class="method">
 <dt id="rl_coach.agents.dqn_agent.DQNAgent.run_pre_network_filter_for_inference">
 <code class="descname">run_pre_network_filter_for_inference</code><span class="sig-paren">(</span><em>state: Dict[str, numpy.ndarray]</em><span class="sig-paren">)</span> → Dict[str, numpy.ndarray]<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.run_pre_network_filter_for_inference" title="Permalink to this definition">¶</a></dt>
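restore_checkpoint, added above, takes the directory to restore from and returns None. A minimal sketch, assuming an already-initialized agent; the helper name and the "./checkpoints" path are placeholders, not values from the commit.

    from rl_coach.agents.dqn_agent import DQNAgent

    def restore_agent_state(agent: DQNAgent, checkpoint_dir: str = "./checkpoints") -> None:
        # Loads the agent's previously saved state from checkpoint_dir.
        agent.restore_checkpoint(checkpoint_dir)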
@@ -510,13 +568,13 @@ by val, and by the current phase set in self.phase.</p>
 
 <dl class="method">
 <dt id="rl_coach.agents.dqn_agent.DQNAgent.save_checkpoint">
-<code class="descname">save_checkpoint</code><span class="sig-paren">(</span><em>checkpoint_id: int</em><span class="sig-paren">)</span> → None<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.save_checkpoint" title="Permalink to this definition">¶</a></dt>
+<code class="descname">save_checkpoint</code><span class="sig-paren">(</span><em>checkpoint_prefix: str</em><span class="sig-paren">)</span> → None<a class="headerlink" href="#rl_coach.agents.dqn_agent.DQNAgent.save_checkpoint" title="Permalink to this definition">¶</a></dt>
 <dd><p>Allows agents to store additional information when saving checkpoints.</p>
 <table class="docutils field-list" frame="void" rules="none">
 <col class="field-name" />
 <col class="field-body" />
 <tbody valign="top">
-<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>checkpoint_id</strong> – the id of the checkpoint</td>
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>checkpoint_prefix</strong> – The prefix of the checkpoint file to save</td>
 </tr>
 <tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body">None</td>
 </tr>
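The last hunk changes save_checkpoint from a numeric checkpoint_id to a string checkpoint_prefix that names the checkpoint files. A hedged sketch of the updated call; the helper and the "ckpt-0" prefix are illustrative placeholders only.

    from rl_coach.agents.dqn_agent import DQNAgent

    def write_checkpoint(agent: DQNAgent, checkpoint_prefix: str = "ckpt-0") -> None:
        # Previously this method took an integer checkpoint_id; after this change the
        # checkpoint is keyed by a filename prefix instead.
        agent.save_checkpoint(checkpoint_prefix)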