mirror of
https://github.com/gryf/coach.git
synced 2025-12-18 03:30:19 +01:00
Enabling Coach Documentation to be run even when environments are not installed (#326)
@@ -8,7 +8,7 @@
|
||||
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
|
||||
<title>Adding a New Agent — Reinforcement Learning Coach 0.11.0 documentation</title>
|
||||
<title>Adding a New Agent — Reinforcement Learning Coach 0.12.1 documentation</title>
|
||||
|
||||
|
||||
|
||||
@@ -17,13 +17,21 @@
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript" src="../_static/js/modernizr.min.js"></script>
|
||||
|
||||
|
||||
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
||||
<script type="text/javascript" src="../_static/jquery.js"></script>
|
||||
<script type="text/javascript" src="../_static/underscore.js"></script>
|
||||
<script type="text/javascript" src="../_static/doctools.js"></script>
|
||||
<script type="text/javascript" src="../_static/language_data.js"></script>
|
||||
<script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
|
||||
|
||||
<script type="text/javascript" src="../_static/js/theme.js"></script>
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
|
||||
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
|
||||
<link rel="stylesheet" href="../_static/css/custom.css" type="text/css" />
|
||||
@@ -33,21 +41,16 @@
|
||||
<link rel="prev" title="Distributed Coach - Horizontal Scale-Out" href="../design/horizontal_scaling.html" />
|
||||
<link href="../_static/css/custom.css" rel="stylesheet" type="text/css">
|
||||
|
||||
|
||||
|
||||
<script src="../_static/js/modernizr.min.js"></script>
|
||||
|
||||
</head>
|
||||
|
||||
<body class="wy-body-for-nav">
|
||||
|
||||
|
||||
<div class="wy-grid-for-nav">
|
||||
|
||||
|
||||
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||
<div class="wy-side-scroll">
|
||||
<div class="wy-side-nav-search">
|
||||
<div class="wy-side-nav-search" >
|
||||
|
||||
|
||||
|
||||
@@ -188,11 +191,11 @@ We suggest using the following
|
||||
<a class="reference external" href="https://github.com/NervanaSystems/coach/blob/master/tutorials/1.%20Implementing%20an%20Algorithm.ipynb">Jupyter notebook tutorial</a>
|
||||
to ramp up on this process. In general, it involves the following steps:</p>
|
||||
<ol class="arabic">
|
||||
<li><p class="first">Implement your algorithm in a new file. The agent can inherit base classes such as <strong>ValueOptimizationAgent</strong> or
|
||||
<li><p>Implement your algorithm in a new file. The agent can inherit base classes such as <strong>ValueOptimizationAgent</strong> or
|
||||
<strong>ActorCriticAgent</strong>, or the more generic <strong>Agent</strong> base class.</p>
|
||||
<div class="admonition note">
|
||||
<p class="first admonition-title">Note</p>
|
||||
<p class="last"><strong>ValueOptimizationAgent</strong>, <strong>PolicyOptimizationAgent</strong> and <strong>Agent</strong> are abstract classes.
|
||||
<p class="admonition-title">Note</p>
|
||||
<p><strong>ValueOptimizationAgent</strong>, <strong>PolicyOptimizationAgent</strong> and <strong>Agent</strong> are abstract classes.
|
||||
<code class="code docutils literal notranslate"><span class="pre">learn_from_batch()</span></code> should be overridden with the desired behavior for the algorithm being implemented.
|
||||
If you inherit directly from <strong>Agent</strong>, <code class="code docutils literal notranslate"><span class="pre">choose_action()</span></code> should also be overridden.</p>
|
||||
</div>
|
||||
@@ -214,25 +217,24 @@ If deciding to inherit from <strong>Agent</strong>, also <code class="code docut
|
||||
</pre></div>
|
||||
</div>
|
||||
</li>
|
||||
<li><p class="first">If needed, implement your agent’s specific network head in the implementation for the framework of your choice.
|
||||
<li><p>If needed, implement your agent’s specific network head in the implementation for the framework of your choice.
|
||||
For example, see <strong>architectures/neon_components/heads.py</strong>. The head inherits from the generic base class Head.
|
||||
A new output type should be added to configurations.py, and a mapping between the new head and output type should
|
||||
be defined in the get_output_head() function at <strong>architectures/neon_components/general_network.py</strong></p>
|
||||
</li>
|
||||
<li><p class="first">Define a new parameters class that inherits AgentParameters.
|
||||
be defined in the get_output_head() function at <strong>architectures/neon_components/general_network.py</strong></p></li>
|
||||
<li><p>Define a new parameters class that inherits AgentParameters.
|
||||
The parameters class defines all the hyperparameters for the agent, and is initialized with 4 main components:</p>
|
||||
<ul class="simple">
|
||||
<li><strong>algorithm</strong>: A class inheriting AlgorithmParameters, which defines any algorithm-specific parameters.</li>
|
||||
<li><strong>exploration</strong>: A class inheriting ExplorationParameters which defines the exploration policy parameters.
|
||||
<li><p><strong>algorithm</strong>: A class inheriting AlgorithmParameters, which defines any algorithm-specific parameters.</p></li>
|
||||
<li><p><strong>exploration</strong>: A class inheriting ExplorationParameters which defines the exploration policy parameters.
|
||||
There are several common exploration policies built-in which you can use, and are defined under
|
||||
the exploration sub directory. You can also define your own custom exploration policy.</li>
|
||||
<li><strong>memory</strong>: A class inheriting MemoryParameters which defines the memory parameters.
|
||||
the exploration sub directory. You can also define your own custom exploration policy.</p></li>
|
||||
<li><p><strong>memory</strong>: A class inheriting MemoryParameters which defines the memory parameters.
|
||||
There are several common memory types built-in which you can use, and are defined under the memories
|
||||
sub directory. You can also define your own custom memory.</li>
|
||||
<li><strong>networks</strong>: A dictionary defining all the networks that will be used by the agent. The keys of the dictionary
|
||||
sub directory. You can also define your own custom memory.</p></li>
|
||||
<li><p><strong>networks</strong>: A dictionary defining all the networks that will be used by the agent. The keys of the dictionary
|
||||
define the network name and will be used to access each network through the agent class.
|
||||
The dictionary values are a class inheriting NetworkParameters, which define the network structure
|
||||
and parameters.</li>
|
||||
and parameters.</p></li>
|
||||
</ul>
|
||||
<p>Additionally, set the path property to return the path to your agent class in the following format:</p>
|
||||
<p><code class="code docutils literal notranslate"><span class="pre"><path</span> <span class="pre">to</span> <span class="pre">python</span> <span class="pre">module>:<name</span> <span class="pre">of</span> <span class="pre">agent</span> <span class="pre">class></span></code></p>
|
||||
@@ -250,9 +252,8 @@ and parameters.</li>
|
||||
</pre></div>
|
||||
</div>
|
||||
</li>
|
||||
<li><p class="first">(Optional) Define a preset using the new agent type with a given environment, and the hyper-parameters that should
|
||||
be used for training on that environment.</p>
|
||||
</li>
|
||||
<li><p>(Optional) Define a preset using the new agent type with a given environment, and the hyper-parameters that should
|
||||
be used for training on that environment.</p></li>
|
||||
</ol>
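The path format described in step 3, a Python module path and a class name separated by a colon, can be illustrated with a minimal sketch. This is not Coach's own loading code: the helper name resolve_class_path is hypothetical, and a standard-library class stands in for an agent class so that the snippet runs without Coach installed.

```python
import importlib


def resolve_class_path(path: str) -> type:
    """Split a '<module path>:<class name>' string and import the named class.

    Hypothetical helper for illustration only; it is not part of Coach.
    """
    module_name, sep, class_name = path.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected '<module>:<class>', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard-library module and class stand in for an agent module and class.
cls = resolve_class_path("collections:OrderedDict")
print(cls.__name__)  # prints "OrderedDict"
```

Keeping the class reference as a string lets a parameters object name its agent class without importing the agent module until it is actually needed.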
|
||||
</div>
|
||||
|
||||
@@ -267,7 +268,7 @@ be used for training on that environment.</p>
|
||||
<a href="add_env.html" class="btn btn-neutral float-right" title="Adding a New Environment" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
|
||||
|
||||
|
||||
<a href="../design/horizontal_scaling.html" class="btn btn-neutral" title="Distributed Coach - Horizontal Scale-Out" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
|
||||
<a href="../design/horizontal_scaling.html" class="btn btn-neutral float-left" title="Distributed Coach - Horizontal Scale-Out" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
|
||||
|
||||
</div>
|
||||
|
||||
@@ -276,7 +277,7 @@ be used for training on that environment.</p>
|
||||
|
||||
<div role="contentinfo">
|
||||
<p>
|
||||
© Copyright 2018, Intel AI Lab
|
||||
© Copyright 2018-2019, Intel AI Lab
|
||||
|
||||
</p>
|
||||
</div>
|
||||
@@ -293,27 +294,16 @@ be used for training on that environment.</p>
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
||||
<script type="text/javascript" src="../_static/jquery.js"></script>
|
||||
<script type="text/javascript" src="../_static/underscore.js"></script>
|
||||
<script type="text/javascript" src="../_static/doctools.js"></script>
|
||||
<script type="text/javascript" src="../_static/language_data.js"></script>
|
||||
<script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
|
||||
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript" src="../_static/js/theme.js"></script>
|
||||
|
||||
<script type="text/javascript">
|
||||
jQuery(function () {
|
||||
SphinxRtdTheme.Navigation.enable(true);
|
||||
});
|
||||
</script>
|
||||
</script>
</body>
|
||||
</html>
|
||||
@@ -8,7 +8,7 @@
|
||||
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
|
||||
<title>Adding a New Environment — Reinforcement Learning Coach 0.11.0 documentation</title>
|
||||
<title>Adding a New Environment — Reinforcement Learning Coach 0.12.1 documentation</title>
|
||||
|
||||
|
||||
|
||||
@@ -17,13 +17,21 @@
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript" src="../_static/js/modernizr.min.js"></script>
|
||||
|
||||
|
||||
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
||||
<script type="text/javascript" src="../_static/jquery.js"></script>
|
||||
<script type="text/javascript" src="../_static/underscore.js"></script>
|
||||
<script type="text/javascript" src="../_static/doctools.js"></script>
|
||||
<script type="text/javascript" src="../_static/language_data.js"></script>
|
||||
<script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
|
||||
|
||||
<script type="text/javascript" src="../_static/js/theme.js"></script>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
|
||||
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
|
||||
<link rel="stylesheet" href="../_static/css/custom.css" type="text/css" />
|
||||
@@ -33,21 +41,16 @@
|
||||
<link rel="prev" title="Adding a New Agent" href="add_agent.html" />
|
||||
<link href="../_static/css/custom.css" rel="stylesheet" type="text/css">
|
||||
|
||||
|
||||
|
||||
<script src="../_static/js/modernizr.min.js"></script>
|
||||
|
||||
</head>
|
||||
|
||||
<body class="wy-body-for-nav">
|
||||
|
||||
|
||||
<div class="wy-grid-for-nav">
|
||||
|
||||
|
||||
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||
<div class="wy-side-scroll">
|
||||
<div class="wy-side-nav-search">
|
||||
<div class="wy-side-nav-search" >
|
||||
|
||||
|
||||
|
||||
@@ -209,9 +212,8 @@ As an alternative, we highly recommend following the corresponding
|
||||
<a class="reference external" href="https://github.com/NervanaSystems/coach/blob/master/tutorials/2.%20Adding%20an%20Environment.ipynb">tutorial</a>
|
||||
in the GitHub repo.</p>
|
||||
<ol class="arabic">
|
||||
<li><p class="first">Create a new class for your environment, and inherit the Environment class.</p>
|
||||
</li>
|
||||
<li><p class="first">Coach defines a simple API for implementing a new environment, which is defined in environment/environment.py.
|
||||
<li><p>Create a new class for your environment, and inherit the Environment class.</p></li>
|
||||
<li><p>Coach defines a simple API for implementing a new environment, which is defined in environment/environment.py.
|
||||
There are several functions to implement, but only some of them are mandatory.</p>
|
||||
<p>Here are the important ones:</p>
|
||||
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">_take_action</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">action_idx</span><span class="p">:</span> <span class="n">ActionType</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
|
||||
@@ -250,7 +252,7 @@ There are several functions to implement, but only some of them are mandatory.</
|
||||
</pre></div>
|
||||
</div>
|
||||
</li>
|
||||
<li><p class="first">Create a new parameters class for your environment, which inherits the EnvironmentParameters class.
|
||||
<li><p>Create a new parameters class for your environment, which inherits the EnvironmentParameters class.
|
||||
In the __init__ of your class, define all the parameters you used in your Environment class.
|
||||
Additionally, fill the path property of the class with the path to your Environment class.
|
||||
For example, take a look at the EnvironmentParameters class used for Doom:</p>
|
||||
@@ -269,8 +271,7 @@ For example, take a look at the EnvironmentParameters class used for Doom:</p>
|
||||
</div>
|
||||
</div></blockquote>
|
||||
</li>
|
||||
<li><p class="first">And that’s it, you’re done. Now just add a new preset with your newly created environment, and start training an agent on top of it.</p>
|
||||
</li>
|
||||
<li><p>And that’s it, you’re done. Now just add a new preset with your newly created environment, and start training an agent on top of it.</p></li>
|
||||
</ol>
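The subclassing pattern in steps 1 and 2 can be sketched with a self-contained toy. Everything here is an assumption made for illustration: EnvironmentSketch is a hypothetical stand-in and not Coach's Environment class, and only the _take_action hook name is taken from the text above. The real base class in environment/environment.py defines the full set of functions to implement.

```python
from abc import ABC, abstractmethod


class EnvironmentSketch(ABC):
    """Hypothetical stand-in for a base environment class (not Coach's Environment)."""

    def __init__(self) -> None:
        self.state = 0
        self.reward = 0.0
        self.done = False

    def step(self, action_idx: int) -> None:
        # The base class drives the loop; the subclass supplies the hook.
        self._take_action(action_idx)

    @abstractmethod
    def _take_action(self, action_idx: int) -> None:
        """Apply the chosen action to the underlying simulator."""


class CounterEnvironment(EnvironmentSketch):
    """Toy environment: action 1 increments a counter, action 0 leaves it alone."""

    def __init__(self, target: int = 3) -> None:
        super().__init__()
        self.target = target

    def _take_action(self, action_idx: int) -> None:
        if action_idx == 1:
            self.state += 1
        # The episode ends once the counter reaches the target.
        self.done = self.state >= self.target
        self.reward = 1.0 if self.done else 0.0


env = CounterEnvironment(target=2)
for action in (1, 0, 1):
    env.step(action)
print(env.state, env.reward, env.done)  # prints "2 1.0 True"
```

Marking the hook as abstract means a subclass that forgets to implement it fails at construction time rather than midway through training.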
|
||||
</div>
|
||||
</div>
|
||||
@@ -286,7 +287,7 @@ For example, take a look at the EnvironmentParameters class used for Doom:</p>
|
||||
<a href="../components/agents/index.html" class="btn btn-neutral float-right" title="Agents" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
|
||||
|
||||
|
||||
<a href="add_agent.html" class="btn btn-neutral" title="Adding a New Agent" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
|
||||
<a href="add_agent.html" class="btn btn-neutral float-left" title="Adding a New Agent" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a>
|
||||
|
||||
</div>
|
||||
|
||||
@@ -295,7 +296,7 @@ For example, take a look at the EnvironmentParameters class used for Doom:</p>
|
||||
|
||||
<div role="contentinfo">
|
||||
<p>
|
||||
© Copyright 2018, Intel AI Lab
|
||||
© Copyright 2018-2019, Intel AI Lab
|
||||
|
||||
</p>
|
||||
</div>
|
||||
@@ -312,27 +313,16 @@ For example, take a look at the EnvironmentParameters class used for Doom:</p>
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
||||
<script type="text/javascript" src="../_static/jquery.js"></script>
|
||||
<script type="text/javascript" src="../_static/underscore.js"></script>
|
||||
<script type="text/javascript" src="../_static/doctools.js"></script>
|
||||
<script type="text/javascript" src="../_static/language_data.js"></script>
|
||||
<script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
|
||||
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript" src="../_static/js/theme.js"></script>
|
||||
|
||||
<script type="text/javascript">
|
||||
jQuery(function () {
|
||||
SphinxRtdTheme.Navigation.enable(true);
|
||||
});
|
||||
</script>
|
||||
</script>
</body>
|
||||
</html>
|
||||