Mirror of https://github.com/gryf/coach.git
RL in Large Discrete Action Spaces - Wolpertinger Agent (#394)
* Currently this is specific to the case of discretizing a continuous action space. It can easily be adapted to other cases by feeding the kNN a different candidate-action set and removing the discretizing output action filter (see the sketch below).
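The Wolpertinger architecture (Dulac-Arnold et al., "Deep Reinforcement Learning in Large Discrete Action Spaces") refines the actor's continuous proto-action by looking up its k nearest neighbours among the discrete actions and letting the critic pick the highest-valued one. Below is a minimal sketch of that selection step only, assuming a discretized 1-D continuous action space; the function names and the scikit-learn kNN are illustrative assumptions and do not reflect Coach's actual agent or filter classes.

```python
# Minimal sketch of Wolpertinger action selection over a discretized
# continuous action space. All names here are hypothetical, not Coach's API.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def build_action_grid(low, high, bins):
    """Discretize a 1-D continuous action range into `bins` candidate actions."""
    return np.linspace(low, high, bins).reshape(-1, 1)


def wolpertinger_select(proto_action, actions, knn, q_values_fn, k=5):
    """1) Find the k discrete actions nearest to the actor's proto-action.
    2) Return the neighbour that the critic scores highest."""
    _, idx = knn.kneighbors(proto_action.reshape(1, -1), n_neighbors=k)
    candidates = actions[idx[0]]          # the k nearest discrete actions
    q_values = q_values_fn(candidates)    # critic evaluates each candidate
    return candidates[np.argmax(q_values)]


# Usage with a dummy critic that prefers actions close to 0.3:
actions = build_action_grid(-1.0, 1.0, bins=21)
knn = NearestNeighbors().fit(actions)     # fit once, reuse at every step
best = wolpertinger_select(np.array([0.27]), actions, knn,
                           q_values_fn=lambda a: -np.abs(a - 0.3).ravel())
print(best)   # the discrete action closest to 0.3 among the k neighbours
```

Feeding `build_action_grid` with an arbitrary discrete action set instead of a discretized range is the adaptation mentioned in the commit message.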
@@ -21,8 +21,6 @@ A detailed description of those algorithms can be found by navigating to each of
   imitation/cil
   policy_optimization/cppo
   policy_optimization/ddpg
   policy_optimization/td3
   policy_optimization/sac
   other/dfp
   value_optimization/double_dqn
   value_optimization/dqn
@@ -36,6 +34,10 @@ A detailed description of those algorithms can be found by navigating to each of
   policy_optimization/ppo
   value_optimization/rainbow
   value_optimization/qr_dqn
   policy_optimization/sac
   policy_optimization/td3
   policy_optimization/wolpertinger
.. autoclass:: rl_coach.base_parameters.AgentParameters