ACER
Each experiment uses 3 seeds. The ACER parameters match those described in the original paper, except for the optimizer (changed to Adam) and the learning rate (set to 1e-4).
Breakout ACER - 16 workers
coach -p Atari_ACER -lvl breakout -n 16
Space Invaders ACER - 16 workers
coach -p Atari_ACER -lvl space_invaders -n 16
Pong ACER - 16 workers
coach -p Atari_ACER -lvl pong -n 16
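The three commands above differ only in the level name, so they can be generated from one loop. The following is a minimal sketch, not part of Coach: the `build_cmd` helper and the `DRY_RUN` variable are hypothetical names introduced here, and only the `-p`, `-lvl` and `-n` flags shown above are used. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or optionally launch) the three ACER benchmark commands.
set -eu

PRESET="Atari_ACER"   # preset name, taken from the commands above
WORKERS=16            # worker count, taken from the commands above

# build_cmd is a hypothetical helper (not a Coach feature).
# It prints the coach command line for the level given as $1.
build_cmd() {
    echo "coach -p ${PRESET} -lvl $1 -n ${WORKERS}"
}

for level in breakout space_invaders pong; do
    cmd="$(build_cmd "${level}")"
    echo "${cmd}"
    # DRY_RUN is an assumption of this sketch: the default (1) only prints.
    # Set DRY_RUN=0 to actually launch each run, one after another.
    if [ "${DRY_RUN:-1}" = "0" ]; then
        ${cmd}
    fi
done
```

With `DRY_RUN=0` the runs execute sequentially, so each level starts only after the previous training run exits. The sketch does not handle the 3 seeds per experiment; repeating the runs per seed is left to the reader.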