mirror of https://github.com/gryf/coach.git synced 2025-12-18 19:50:17 +01:00
coach/docs_raw/source/components/agents/policy_optimization/hac.rst
Itai Caspi 6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00


Action space: Continuous

References: Hierarchical Reinforcement Learning with Hindsight

Network Structure

Figure: network structure diagram (/_static/img/design_imgs/ddpg.png)

Algorithm Description

Choosing an action

Pass the current states through the actor network to obtain an action mean vector μ. During training, add exploration noise to the action using a continuous exploration policy, such as the Ornstein-Uhlenbeck process. During testing, use the mean vector μ as-is.
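
The action selection rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not Coach's implementation: `toy_actor` is a hypothetical stand-in for the actor network, `OrnsteinUhlenbeckNoise` and `choose_action` are illustrative names, the noise parameters (`theta`, `sigma`, `dt`) are common defaults chosen for the example, and clipping the noisy action to `[low, high]` is an assumption that the text does not state.

```python
import random


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise (illustrative, not Coach's class).

    Update rule: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start from the long-run mean of the process.
        self.state = [self.mu] * self.size

    def sample(self):
        # Advance the process one step and return a copy of the new state.
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def toy_actor(state):
    # Hypothetical stand-in for the actor network: returns the mean vector mu.
    return [0.5 * s for s in state]


def choose_action(state, actor, noise, training, low=-1.0, high=1.0):
    mu = actor(state)
    if not training:
        # Testing: use the mean vector mu as-is.
        return mu
    eps = noise.sample()
    # Training: add exploration noise. Clipping to the action bounds
    # is an assumption made for this sketch.
    return [min(high, max(low, m + e)) for m, e in zip(mu, eps)]


noise = OrnsteinUhlenbeckNoise(size=2, seed=0)
state = [0.2, -0.4]
test_action = choose_action(state, toy_actor, noise, training=False)
train_action = choose_action(state, toy_actor, noise, training=True)
```

Because the Ornstein-Uhlenbeck state carries over between calls, consecutive training actions are perturbed in a correlated way rather than independently; calling `reset()` at the start of an episode returns the process to its mean.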

Training the network