SAC algorithm (#282)

* SAC algorithm * SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix problem in gradient calculation. Now SAC agents is able to train. gym_environment - fixing an error in access to gym.spaces * Soft Actor Critic - code cleanup * code cleanup * V-head initialization fix * SAC benchmarks * SAC Documentation * typo fix * documentation fixes * documentation and version update * README typo
2026-02-13 04:15:45 +01:00 · 2019-05-01 18:37:49 +03:00
parent 33dc29ee99
commit 74db141d5e
92 changed files with 2812 additions and 402 deletions
--- a/docs_raw/source/components/agents/index.rst
+++ b/docs_raw/source/components/agents/index.rst
@@ -21,6 +21,7 @@ A detailed description of those algorithms can be found by navigating to each of
   imitation/cil
   policy_optimization/cppo
   policy_optimization/ddpg
+   policy_optimization/sac
   other/dfp
   value_optimization/double_dqn
   value_optimization/dqn