update of api docstrings across coach and tutorials [WIP] (#91)

* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2026-03-03 15:25:49 +01:00 · 2018-11-15 15:00:13 +02:00
parent 524f8436a2
commit 6d40ad1650
517 changed files with 71034 additions and 12834 deletions
--- a/docs/_sources/components/exploration_policies/index.rst.txt
+++ b/docs/_sources/components/exploration_policies/index.rst.txt
@@ -0,0 +1,87 @@
+Exploration Policies
+====================
+
+Exploration policies are components that allow the agent to trade off exploration and exploitation according to a
+predefined policy. This is one of the most important aspects of reinforcement learning agents, and it can require some
+tuning to get right. Coach supports several predefined exploration policies and can easily be extended with
+custom policies. Note that not all exploration policies are expected to work for both discrete and continuous (Box)
+action spaces, as summarized in the table below.
+
+.. role:: green
+.. role:: red
+
++----------------------+-----------------------+------------------+
+| Exploration Policy   | Discrete Action Space | Box Action Space |
++======================+=======================+==================+
+| AdditiveNoise        | :red:`X`              | :green:`V`       |
++----------------------+-----------------------+------------------+
+| Boltzmann            | :green:`V`            | :red:`X`         |
++----------------------+-----------------------+------------------+
+| Bootstrapped         | :green:`V`            | :red:`X`         |
++----------------------+-----------------------+------------------+
+| Categorical          | :green:`V`            | :red:`X`         |
++----------------------+-----------------------+------------------+
+| ContinuousEntropy    | :red:`X`              | :green:`V`       |
++----------------------+-----------------------+------------------+
+| EGreedy              | :green:`V`            | :green:`V`       |
++----------------------+-----------------------+------------------+
+| Greedy               | :green:`V`            | :green:`V`       |
++----------------------+-----------------------+------------------+
+| OUProcess            | :red:`X`              | :green:`V`       |
++----------------------+-----------------------+------------------+
+| ParameterNoise       | :green:`V`            | :green:`V`       |
++----------------------+-----------------------+------------------+
+| TruncatedNormal      | :red:`X`              | :green:`V`       |
++----------------------+-----------------------+------------------+
+| UCB                  | :green:`V`            | :red:`X`         |
++----------------------+-----------------------+------------------+
+
+ExplorationPolicy
+-----------------
+.. autoclass:: rl_coach.exploration_policies.ExplorationPolicy
+   :members:
+   :inherited-members:
+
+AdditiveNoise
+-------------
+.. autoclass:: rl_coach.exploration_policies.AdditiveNoise
+
+Boltzmann
+---------
+.. autoclass:: rl_coach.exploration_policies.Boltzmann
+
+Bootstrapped
+------------
+.. autoclass:: rl_coach.exploration_policies.Bootstrapped
+
+Categorical
+-----------
+.. autoclass:: rl_coach.exploration_policies.Categorical
+
+ContinuousEntropy
+-----------------
+.. autoclass:: rl_coach.exploration_policies.ContinuousEntropy
+
+EGreedy
+-------
+.. autoclass:: rl_coach.exploration_policies.EGreedy
+
+Greedy
+------
+.. autoclass:: rl_coach.exploration_policies.Greedy
+
+OUProcess
+---------
+.. autoclass:: rl_coach.exploration_policies.OUProcess
+
+ParameterNoise
+--------------
+.. autoclass:: rl_coach.exploration_policies.ParameterNoise
+
+TruncatedNormal
+---------------
+.. autoclass:: rl_coach.exploration_policies.TruncatedNormal
+
+UCB
+---
+.. autoclass:: rl_coach.exploration_policies.UCB
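
Reviewer note: the introduction in this file describes trading off exploration and exploitation but shows no concrete case. The sketch below illustrates the idea with a generic epsilon-greedy rule for a discrete action space, using only the Python standard library. It is not Coach's `EGreedy` implementation and does not use the `rl_coach` API; the names `epsilon_greedy`, `q_values`, `epsilon`, and `rng` are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    Illustrative only: this is a generic epsilon-greedy rule, not the
    rl_coach EGreedy class. With probability `epsilon` a uniformly random
    action is returned (exploration); otherwise the highest-valued action
    is returned (exploitation). Ties go to the lowest index.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    if rng.random() < epsilon:
        # Explore: ignore the value estimates entirely.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the largest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    values = [0.1, 0.9, 0.3]  # made-up value estimates for three actions
    print("greedy pick:", epsilon_greedy(values, epsilon=0.0))
    print("exploring pick:", epsilon_greedy(values, epsilon=1.0))
```

Setting `epsilon` to 0 reduces this rule to purely greedy selection, while setting it to 1 makes every choice uniformly random; values in between are where the tuning mentioned in the introduction comes in.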