1
0
mirror of https://github.com/gryf/coach.git synced 2025-12-18 19:50:17 +01:00
Files
coach/docs/_sources/components/exploration_policies/index.rst.txt
Gal Leibovich f12857a8c7 Docs changes - fixing blogpost links, removing importing all exploration policies (#139)
* updated docs

* removing imports for all exploration policies in __init__ + setting the right blog-post link

* small cleanups
2018-12-05 16:16:16 -05:00

87 lines
3.3 KiB
ReStructuredText

Exploration Policies
====================
Exploration policies are a component that allow the agent to tradeoff exploration and exploitation according to a
predefined policy. This is one of the most important aspects of reinforcement learning agents, and can require some
tuning to get it right. Coach supports several pre-defined exploration policies, and it can be easily extended with
custom policies. Note that not all exploration policies are expected to work for both discrete and continuous action
spaces.
.. role:: green
.. role:: red
+----------------------+-----------------------+------------------+
| Exploration Policy | Discrete Action Space | Box Action Space |
+======================+=======================+==================+
| AdditiveNoise | :red:`X` | :green:`V` |
+----------------------+-----------------------+------------------+
| Boltzmann | :green:`V` | :red:`X` |
+----------------------+-----------------------+------------------+
| Bootstrapped | :green:`V` | :red:`X` |
+----------------------+-----------------------+------------------+
| Categorical | :green:`V` | :red:`X` |
+----------------------+-----------------------+------------------+
| ContinuousEntropy | :red:`X` | :green:`V` |
+----------------------+-----------------------+------------------+
| EGreedy | :green:`V` | :green:`V` |
+----------------------+-----------------------+------------------+
| Greedy | :green:`V` | :green:`V` |
+----------------------+-----------------------+------------------+
| OUProcess | :red:`X` | :green:`V` |
+----------------------+-----------------------+------------------+
| ParameterNoise | :green:`V` | :green:`V` |
+----------------------+-----------------------+------------------+
| TruncatedNormal | :red:`X` | :green:`V` |
+----------------------+-----------------------+------------------+
| UCB | :green:`V` | :red:`X` |
+----------------------+-----------------------+------------------+
ExplorationPolicy
-----------------
.. autoclass:: rl_coach.exploration_policies.exploration_policy.ExplorationPolicy
:members:
:inherited-members:
AdditiveNoise
-------------
.. autoclass:: rl_coach.exploration_policies.additive_noise.AdditiveNoise
Boltzmann
---------
.. autoclass:: rl_coach.exploration_policies.boltzmann.Boltzmann
Bootstrapped
------------
.. autoclass:: rl_coach.exploration_policies.bootstrapped.Bootstrapped
Categorical
-----------
.. autoclass:: rl_coach.exploration_policies.categorical.Categorical
ContinuousEntropy
-----------------
.. autoclass:: rl_coach.exploration_policies.continuous_entropy.ContinuousEntropy
EGreedy
-------
.. autoclass:: rl_coach.exploration_policies.e_greedy.EGreedy
Greedy
------
.. autoclass:: rl_coach.exploration_policies.greedy.Greedy
OUProcess
---------
.. autoclass:: rl_coach.exploration_policies.ou_process.OUProcess
ParameterNoise
--------------
.. autoclass:: rl_coach.exploration_policies.parameter_noise.ParameterNoise
TruncatedNormal
---------------
.. autoclass:: rl_coach.exploration_policies.truncated_normal.TruncatedNormal
UCB
---
.. autoclass:: rl_coach.exploration_policies.ucb.UCB