Action space: Continuous
References: Hierarchical Reinforcement Learning with Hindsight
Network Structure
Algorithm Description
Choosing an action
Pass the current state through the actor network to obtain an action mean vector μ. During training, use a continuous exploration policy, such as the Ornstein-Uhlenbeck process, to add exploration noise to the action. During testing, use the mean vector μ as-is.
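The action selection step above can be sketched in a few lines of NumPy. This is a minimal illustration, not Coach's implementation: the names `OrnsteinUhlenbeckNoise`, `choose_action`, and `toy_actor` are hypothetical, the `theta`, `sigma`, and `dt` defaults are common illustrative values rather than values taken from this page, and clipping the noisy action to `[low, high]` is an added assumption.

```python
import numpy as np


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise (hypothetical helper, not a Coach class).

    Discretized update: x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = np.full(size, mu, dtype=np.float64)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process from its long-run mean, e.g. at episode start.
        self.state = self.mu.copy()

    def sample(self):
        # Mean-reverting drift plus Gaussian diffusion.
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = (
            self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.state.shape)
        )
        self.state = self.state + drift + diffusion
        return self.state


def choose_action(actor, state, noise, training, low=-1.0, high=1.0):
    """Return the actor's mean action, with exploration noise added when training."""
    mu = np.asarray(actor(state), dtype=np.float64)
    if not training:
        # Testing: use the mean vector as-is.
        return mu
    # Training: add exploration noise; clipping to bounds is an assumption here.
    return np.clip(mu + noise.sample(), low, high)


def toy_actor(state):
    # Stand-in for the actor network: maps a 3-dim state to a 2-dim action mean.
    return np.tanh(state[:2])


state = np.array([0.5, -0.3, 0.1])
noise = OrnsteinUhlenbeckNoise(size=2, seed=0)
test_action = choose_action(toy_actor, state, noise, training=False)
train_action = choose_action(toy_actor, state, noise, training=True)
```

Because the noise state carries over between calls, consecutive training actions are correlated in time; calling `noise.reset()` at the start of each episode restarts the process from its mean.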