Gal Leibovich
a1bb8eef89
DDPG Critic Head Bug Fix (#344)
* Fixes a bug in DDPG where the update to the policy network was based on the sum of the critic's Q predictions over the batch instead of their mean
2019-06-05 17:47:56 +03:00
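
Why the sum-versus-mean distinction matters can be sketched with a toy example. This is not the code from the commit: the linear policy `a = theta * s`, the fixed linear critic `Q(s, a) = w * a`, and the helper `policy_gradient` are all hypothetical, chosen only so the gradient can be written by hand. Under those assumptions, reducing with a sum makes the policy gradient grow with batch size, while reducing with a mean keeps it independent of batch size.

```python
# Toy setup (hypothetical, not the repository's code): scalar state s,
# linear policy a = theta * s, and a fixed linear critic Q(s, a) = w * a.
# For one sample, dQ/dtheta = w * s.

def policy_gradient(states, w, reduce):
    """Gradient of the batch-reduced critic output with respect to theta."""
    per_sample = [w * s for s in states]
    if reduce == "sum":
        return sum(per_sample)
    if reduce == "mean":
        return sum(per_sample) / len(per_sample)
    raise ValueError(f"unknown reduction: {reduce}")

w = 0.5
small_batch = [1.0, 2.0, 3.0, 4.0]
large_batch = small_batch * 16  # same samples repeated, 16x the batch size

# Sum reduction: the gradient scales linearly with the batch size.
grad_sum_small = policy_gradient(small_batch, w, "sum")
grad_sum_large = policy_gradient(large_batch, w, "sum")

# Mean reduction: the gradient is the same for both batch sizes.
grad_mean_small = policy_gradient(small_batch, w, "mean")
grad_mean_large = policy_gradient(large_batch, w, "mean")

print(grad_sum_small, grad_sum_large)    # 5.0 80.0
print(grad_mean_small, grad_mean_large)  # 1.25 1.25
```

With a sum, the effective learning rate of the policy is silently multiplied by the batch size, so a learning rate tuned for one batch size would not transfer to another; averaging removes that coupling.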