DQN
Each experiment uses 3 seeds. The DQN parameters match those described in the original paper, except for the optimizer (changed to Adam) and the learning rate (set to 1e-4).
Breakout DQN - single worker
coach -p Atari_DQN -lvl breakout
Pong DQN - single worker
coach -p Atari_DQN -lvl pong
Space Invaders DQN - single worker
coach -p Atari_DQN -lvl space_invaders
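Since each experiment uses 3 seeds, the commands above can be expanded into one run per level and seed. The following is a minimal sketch, not part of Coach itself: the seed values `0, 1, 2` are hypothetical (the text does not say which seeds were used), and the `--seed` flag is an assumption that should be checked against `coach --help`. The script only builds and prints the command lines; it does not launch any training.

```python
import shlex

# Preset and level names are taken from the commands above.
PRESET = "Atari_DQN"
LEVELS = ["breakout", "pong", "space_invaders"]

# Hypothetical seed values: the text says 3 seeds but not which ones.
SEEDS = [0, 1, 2]


def build_commands(levels=LEVELS, seeds=SEEDS, preset=PRESET):
    """Return one coach argument list per (level, seed) pair."""
    commands = []
    for level in levels:
        for seed in seeds:
            # --seed is an assumed flag; confirm it with `coach --help`.
            commands.append(
                ["coach", "-p", preset, "-lvl", level, "--seed", str(seed)]
            )
    return commands


if __name__ == "__main__":
    # Print the commands as shell-quoted lines instead of running them.
    for argv in build_commands():
        print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` to launch the runs one after another.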