mirror of
https://github.com/gryf/coach.git
synced 2026-02-20 08:45:55 +01:00
* Fixed a DDPG bug: the policy network update used the sum of the critic's Q predictions over the batch instead of their mean, so the update magnitude scaled with the batch size
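
To illustrate why summing instead of averaging matters, here is a minimal sketch in plain Python. It is not the Coach implementation: the linear actor `a = w * s`, the quadratic critic `Q(s, a) = -(a - s)^2`, and all function names are hypothetical, chosen only so the gradients can be written by hand. It shows that the summed gradient equals the mean gradient multiplied by the batch size, which effectively scales the actor's learning rate with the batch size.

```python
def per_sample_gradients(w, states):
    """Per-sample dQ/dw for a toy actor-critic pair (both are assumptions).

    Actor:  a = w * s
    Critic: Q(s, a) = -(a - s)^2
    Chain rule: dQ/dw = dQ/da * da/dw = -2 * (w*s - s) * s
    """
    return [-2.0 * (w * s - s) * s for s in states]


def policy_gradient_sum(w, states):
    # Behaviour described as buggy: gradients are summed over the batch.
    return sum(per_sample_gradients(w, states))


def policy_gradient_mean(w, states):
    # Behaviour described as fixed: gradients are averaged over the batch.
    grads = per_sample_gradients(w, states)
    return sum(grads) / len(grads)


states = [0.5, 1.0, 1.5, 2.0]  # a toy batch of 4 scalar states
w = 0.0                        # initial actor weight

g_sum = policy_gradient_sum(w, states)
g_mean = policy_gradient_mean(w, states)
ratio = g_sum / g_mean         # equals the batch size

print(f"sum-based gradient:  {g_sum}")
print(f"mean-based gradient: {g_mean}")
print(f"ratio (batch size):  {ratio}")
```

With the mean, changing the batch size leaves the expected update magnitude unchanged. With the sum, doubling the batch size doubles the step taken on the policy network for the same learning rate.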