gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-02-14 04:45:50 +01:00

Files

Gal Leibovich 310d31c227 integration test changes to reach the train part (#254 )

* integration test changes to override heatup to 1000 steps +  run each preset for 30 sec (to make sure we reach the train part)

* fixes to failing presets uncovered with this change + changes in the golden testing to properly test BatchRL

* fix for rainbow dqn

* fix to gym_environment (due to a change in Gym 0.12.1) + fix for rainbow DQN + some bug-fix in utils.squeeze_list

* fix for NEC agent

2019-03-27 21:14:19 +02:00

__init__.py

pre-release 0.10.0

2018-08-13 17:11:34 +03:00

Atari_A3C_LSTM.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Atari_A3C.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Atari_ACER.py

benchmark update (#250 )

2019-03-17 15:33:28 +02:00

Atari_Bootstrapped_DQN.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_C51.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_DDQN_with_PER.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_DDQN.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_DQN_with_PER.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_DQN.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_Dueling_DDQN_with_PER_OpenAI.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Atari_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Atari_NEC.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_NStepQ.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

Atari_QR_DQN.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_Rainbow.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Atari_UCB_with_Q_Ensembles.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

BitFlip_DQN_HER.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

BitFlip_DQN.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

CARLA_3_Cameras_DDPG.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CARLA_CIL.py

Batch RL (#238 )

2019-03-19 18:07:09 +02:00

CARLA_DDPG.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CARLA_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

CartPole_A3C.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_ACER.py

ACER algorithm (#184 )

2019-02-20 23:52:34 +02:00

CartPole_ClippedPPO.py

Save filters' internal state (#127 )

2018-11-20 17:21:48 +02:00

CartPole_DFP.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_DQN_BatchRL.py

integration test changes to reach the train part (#254 )

2019-03-27 21:14:19 +02:00

CartPole_DQN.py

Enable distributed SharedRunningStats (#81 )

2018-11-13 19:17:38 +02:00

CartPole_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

CartPole_NEC.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_NStepQ.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_PAL.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_PG.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

CartPole_QR_DQN.py

Enabling-more-agents-for-Batch-RL-and-cleanup (#258 )

2019-03-21 16:10:29 +02:00

CartPole_Rainbow.py

fixes to rainbow dqn + a cartpole based golden test (#253 )

2019-03-21 12:57:56 +02:00

ControlSuite_DDPG.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

Doom_Basic_A3C.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Doom_Basic_ACER.py

ACER algorithm (#184 )

2019-02-20 23:52:34 +02:00

Doom_Basic_BC.py

Batch RL (#238 )

2019-03-19 18:07:09 +02:00

Doom_Basic_DFP.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Doom_Basic_DQN.py

Setup basic CI flow (#38 )

2018-10-24 18:27:58 -07:00

Doom_Basic_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Doom_Battle_DFP.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Doom_Health_DFP.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Doom_Health_MMC.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Doom_Health_Supreme_DFP.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

ExplorationChain_Bootstrapped_DQN.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

ExplorationChain_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

ExplorationChain_UCB_Q_ensembles.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Fetch_DDPG_HER_baselines.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

InvertedPendulum_PG.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

MontezumaRevenge_BC.py

Batch RL (#238 )

2019-03-19 18:07:09 +02:00

Mujoco_A3C_LSTM.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

Mujoco_A3C.py

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Mujoco_ClippedPPO.py

Adding framework for multinode tests (#149 )

2019-02-26 13:53:12 -08:00

Mujoco_DDPG.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

Mujoco_NAF.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

Mujoco_PPO.py

Enable distributed SharedRunningStats (#81 )

2018-11-13 19:17:38 +02:00

Pendulum_HAC.py

Removed tensorflow specific code in presets (#59 )

2018-11-06 17:39:29 +02:00

README.md

network_imporvements branch merge

2018-10-02 13:43:36 +03:00

Starcraft_CollectMinerals_A3C.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

Starcraft_CollectMinerals_Dueling_DDQN.py

Move embedder, middleware, and head parameters to framework agnostic modules. (#45 )

2018-10-29 14:46:40 -07:00

README.md

Defining Presets

In Coach, we use presets to define reproducible experiments. A preset defines all the parameters of an experiment in a single file and can be run from the command line by its file name. A preset can be very simple, relying on the default parameters of the algorithm and environment, or it can be explicit and set every parameter in order to avoid hidden logic. Either way, the outcome of a preset is a GraphManager.

Let's start with the simplest preset possible. We will define a preset for training a Clipped PPO agent on the CartPole environment. At a minimum, each preset must define three things: the agent, the environment, and a schedule.

from rl_coach.agents.clipped_ppo_agent import ClippedPPOAgentParameters
from rl_coach.environments.gym_environment import GymVectorEnvironment
from rl_coach.graph_managers.basic_rl_graph_manager import BasicRLGraphManager
from rl_coach.graph_managers.graph_manager import SimpleSchedule

# The preset's outcome: a module-level graph manager built from the three required parts.
graph_manager = BasicRLGraphManager(
    agent_params=ClippedPPOAgentParameters(),              # agent: Clipped PPO with its default parameters
    env_params=GymVectorEnvironment(level='CartPole-v0'),  # environment: the Gym CartPole-v0 level
    schedule_params=SimpleSchedule()                       # schedule: the default simple schedule
)
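
A preset such as the one above is an ordinary Python file that leaves a module-level `graph_manager` behind. The sketch below shows one way a launcher could turn a preset file into that object using only the standard library. It is not Coach's actual loader: `load_graph_manager`, `DEMO_PRESET`, and `demo` are hypothetical names, and a plain dict stands in for a real GraphManager so that the example runs without Coach installed.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_graph_manager(preset_path):
    """Import a preset file and return its module-level `graph_manager`.

    Hypothetical helper for illustration; not part of the Coach API.
    """
    path = Path(preset_path)
    # Build an import spec directly from the file location, named after the file.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    # Executing the module runs the preset and defines `graph_manager`.
    spec.loader.exec_module(module)
    if not hasattr(module, "graph_manager"):
        raise AttributeError(f"{path.name} does not define 'graph_manager'")
    return module.graph_manager


# Stand-in preset source: a dict takes the place of a real GraphManager (assumption).
DEMO_PRESET = "graph_manager = {'agent': 'ClippedPPO', 'level': 'CartPole-v0'}\n"


def demo():
    """Write the stand-in preset to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        preset = Path(tmp) / "CartPole_ClippedPPO_demo.py"
        preset.write_text(DEMO_PRESET)
        return load_graph_manager(preset)
```

Calling `demo()` returns whatever object the preset file bound to `graph_manager`, which is the contract a preset has to satisfy; a file that omits the name fails with a clear error instead of silently doing nothing.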

Most presets in Coach are much more explicit than this one. The goal is to be as transparent as possible about every change made relative to the basic parameters defined in the algorithm paper.