Hierarchical Actor Critic
=========================
**Action space:** Continuous

**References:** `Hierarchical Reinforcement Learning with Hindsight <https://arxiv.org/abs/1805.08180>`_

Network Structure
-----------------
.. image:: /_static/img/design_imgs/ddpg.png
   :align: center

Algorithm Description
---------------------
Choosing an action
++++++++++++++++++
Pass the current state through the actor network to obtain an action mean vector :math:`\mu`.
During training, add exploration noise to the action using a continuous exploration policy,
such as the Ornstein-Uhlenbeck process. During testing, use the mean vector :math:`\mu` as-is.
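
The action selection rule above can be sketched in a few lines of NumPy. This is a minimal
illustration and not Coach's own implementation: the ``OrnsteinUhlenbeckNoise`` class, the
``choose_action`` helper, the ``actor`` callable, and the ``theta``, ``sigma`` and ``dt`` defaults
are all hypothetical names and values chosen for the example.

.. code-block:: python

   import numpy as np


   class OrnsteinUhlenbeckNoise:
       """Temporally correlated noise (hypothetical helper, not part of Coach).

       Discretized update: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
       """

       def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
           self.mu = np.full(size, mu, dtype=np.float64)
           self.theta = theta
           self.sigma = sigma
           self.dt = dt
           self.rng = np.random.default_rng(seed)
           self.reset()

       def reset(self):
           # Restart the process at its long-run mean (e.g. at episode start).
           self.state = self.mu.copy()

       def sample(self):
           drift = self.theta * (self.mu - self.state) * self.dt
           diffusion = (self.sigma * np.sqrt(self.dt)
                        * self.rng.standard_normal(self.state.shape))
           self.state = self.state + drift + diffusion
           return self.state.copy()


   def choose_action(actor, state, noise, training):
       """Return the actor's mean action, plus exploration noise while training."""
       mu = np.asarray(actor(state), dtype=np.float64)
       if training:
           return mu + noise.sample()
       # At test time the mean vector is used as-is.
       return mu

Here ``actor`` stands for any callable that maps a state to an action mean vector, for example
``choose_action(lambda s: np.tanh(s[:2]), state, OrnsteinUhlenbeckNoise(size=2), training=True)``.
Calling ``reset()`` at the start of each episode is a common convention, assumed here rather than
taken from the text above.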

Training the network
++++++++++++++++++++