Mirror of https://github.com/gryf/coach.git, synced 2025-12-17 19:20:19 +01:00
* Multiple improvements and bug fixes:
* Using lazy stacking to save on memory when using a replay buffer
* Remove step counting for evaluation episodes
* Reset game between heatup and training
* Major bug fixes in NEC (it now reproduces the paper results for Pong)
* Image input rescaling to 0-1 is now optional
* Change the terminal title to be the experiment name
* Observation cropping for Atari is now optional
* Added a random number of no-op actions for Gym to match the DQN paper
* Fixed a bug where evaluation episodes did not start with the maximum possible number of ALE lives
* Added a script for plotting the results of an experiment over all the Atari games
Coach Benchmarks
The following figures are training curves for some of the presets available through Coach. The X axis in every figure is the total number of steps (for multi-threaded runs, the number of steps accumulated over all the workers). The Y axis in every figure is the average episode reward, computed with an averaging window of 11 episodes. These are the results you can expect when running the predefined presets in Coach.
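To make the Y axis concrete, here is a minimal sketch of averaging episode rewards over a window of 11 episodes. It assumes a trailing window that averages over fewer episodes until 11 are available; Coach's own plotting code may align or pad the window differently, and the name `windowed_average` is hypothetical rather than part of Coach.

```python
from collections import deque


def windowed_average(rewards, window=11):
    """Return the trailing mean of `rewards` over at most `window` episodes.

    Assumption: the window trails the current episode, and the first few
    points average over however many episodes exist so far.
    """
    recent = deque(maxlen=window)
    running_sum = 0.0
    averages = []
    for reward in rewards:
        if len(recent) == window:
            # The oldest reward is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(reward)
        running_sum += reward
        averages.append(running_sum / len(recent))
    return averages


# Small window so the arithmetic is easy to follow by hand.
print(windowed_average([1, 2, 3], window=2))  # [1.0, 1.5, 2.5]
```

With the default `window=11`, each plotted point would summarize up to the 11 most recent episodes, which smooths the episode-to-episode noise in the curves.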
A3C
Breakout_A3C with 16 workers
python3 coach.py -p Breakout_A3C -n 16 -r
InvertedPendulum_A3C with 16 workers
python3 coach.py -p InvertedPendulum_A3C -n 16 -r
Hopper_A3C with 16 workers
python3 coach.py -p Hopper_A3C -n 16 -r
Ant_A3C with 16 workers
python3 coach.py -p Ant_A3C -n 16 -r
Clipped PPO
InvertedPendulum_ClippedPPO with 16 workers
python3 coach.py -p InvertedPendulum_ClippedPPO -n 16 -r
Hopper_ClippedPPO with 16 workers
python3 coach.py -p Hopper_ClippedPPO -n 16 -r
Humanoid_ClippedPPO with 16 workers
python3 coach.py -p Humanoid_ClippedPPO -n 16 -r
DQN
Pong_DQN
python3 coach.py -p Pong_DQN -r
Doom_Basic_DQN
python3 coach.py -p Doom_Basic_DQN -r
Dueling DDQN
Doom_Basic_Dueling_DDQN
python3 coach.py -p Doom_Basic_Dueling_DDQN -r
DFP
Doom_Health_DFP
python3 coach.py -p Doom_Health_DFP -r
MMC
Doom_Health_MMC
python3 coach.py -p Doom_Health_MMC -r
NEC
Pong_NEC
python3 coach.py -p Pong_NEC -r
Doom_Basic_NEC
python3 coach.py -p Doom_Basic_NEC -r
PG
CartPole_PG
python3 coach.py -p CartPole_PG -r
DDPG
Pendulum_DDPG
python3 coach.py -p Pendulum_DDPG -r
NAF
InvertedPendulum_NAF
python3 coach.py -p InvertedPendulum_NAF -r
Pendulum_NAF
python3 coach.py -p Pendulum_NAF -r
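The commands above all follow one pattern: a preset name passed with `-p`, an optional worker count passed with `-n`, and a trailing `-r`. The sketch below only builds and prints such command lines from a list of presets; it does not run Coach. The helper name `coach_command` and the idea of batching presets this way are assumptions for illustration, and only flags that appear in the commands above are used.

```python
import shlex


def coach_command(preset, num_workers=None, script="coach.py"):
    """Build the argument list for one benchmark run.

    Hypothetical helper: it mirrors the command lines listed in this
    document and adds no flags beyond -p, -n and -r.
    """
    cmd = ["python3", script, "-p", preset]
    if num_workers is not None:
        # Only the multi-worker presets in this document pass -n.
        cmd += ["-n", str(num_workers)]
    cmd.append("-r")
    return cmd


# (preset, workers) pairs taken from the list above.
presets = [("Breakout_A3C", 16), ("Hopper_ClippedPPO", 16), ("Pong_DQN", None)]

for preset, workers in presets:
    print(shlex.join(coach_command(preset, workers)))
```

Returning an argument list rather than a single string means the result could be passed to `subprocess.run` without shell quoting concerns, should you decide to launch the runs from a script.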