mirror of https://github.com/gryf/coach.git synced 2025-12-18 11:40:18 +01:00

Add documentation on distributed Coach. (#158)

* Added documentation on distributed Coach.
This commit is contained in:
Balaji Subramaniam
2018-11-27 02:26:15 -08:00
committed by Gal Novik
parent e3ecf445e2
commit d06197f663
151 changed files with 5302 additions and 643 deletions

@@ -1,7 +1,7 @@
Usage
=====
One of the mechanism Coach uses for running experiments is the **Preset** mechanism.
One of the mechanisms Coach uses for running experiments is the **Preset** mechanism.
As its name implies, a preset defines a set of predefined experiment parameters.
This allows defining a *complex* agent-environment interaction, with multiple parameters, and later running it through
a very *simple* command line.
@@ -29,7 +29,7 @@ To list the available presets, use the `-l` flag.
Multi-threaded Algorithms
+++++++++++++++++++++++++
Multi-threaded algorithms are very common this days.
Multi-threaded algorithms are very common these days.
They typically achieve the best results, and scale gracefully with the number of threads.
In Coach, running such algorithms is done by selecting a suitable preset, and choosing the number of threads to run using the :code:`-n` flag.
@@ -39,6 +39,20 @@ In Coach, running such algorithms is done by selecting a suitable preset, and ch
coach -p CartPole_A3C -n 8
Multi-Node Algorithms
+++++++++++++++++++++++++
Coach supports multi-node runs in distributed mode. Specifically, the horizontal scale-out of rollout workers is implemented.
In Coach, running such algorithms is done by selecting a suitable preset, enabling distributed Coach using the :code:`-dc` flag,
passing distributed Coach parameters using the :code:`-dcp` flag, and choosing the number of rollout workers to run using the :code:`-n` flag.
For more details and instructions on how to use distributed Coach, see :ref:`dist-coach-usage`.
*Example:*

.. code-block:: bash

   coach -p CartPole_ClippedPPO -dc -dcp <path-to-config-file> -n 8

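
When launching many runs from a script, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of Coach itself: ``build_coach_command`` is a hypothetical helper, and ``path/to/config`` is a placeholder for the distributed Coach configuration file. It uses only the flags described above (:code:`-p`, :code:`-dc`, :code:`-dcp` and :code:`-n`).

```python
import shlex
from typing import List, Optional


def build_coach_command(preset: str, num_workers: int,
                        dist_config: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build a coach argument list from the documented flags."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    command = ["coach", "-p", preset]
    if dist_config is not None:
        # -dc enables distributed Coach; -dcp passes its configuration file.
        command += ["-dc", "-dcp", dist_config]
    # -n selects the number of workers to run.
    command += ["-n", str(num_workers)]
    return command


if __name__ == "__main__":
    # "path/to/config" is a placeholder, not a real file shipped with Coach.
    args = build_coach_command("CartPole_ClippedPPO", 8, "path/to/config")
    print(shlex.join(args))
```

The resulting list can be passed to ``subprocess.run`` unchanged. Omitting ``dist_config`` produces the plain multi-threaded form of the command.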
Evaluating an Agent
-------------------
@@ -155,4 +169,4 @@ The most up to date description can be found by using the :code:`-h` flag.
.. argparse::
:module: rl_coach.coach
:func: create_argument_parser
:prog: coach
:prog: coach