# Quantile Regression DQN
Each experiment uses 3 seeds and is trained for 10k environment steps.
The QR-DQN parameters are the same as those described in the [original paper](https://arxiv.org/abs/1710.10044.pdf).
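This README does not say which seeds were used or how they were set. As a rough sketch, assuming the installed Coach version accepts a `--seed` option (check `coach --help`) and using placeholder seed values 0, 1 and 2, the three runs per level could be generated like this. The helper name `seed_commands` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the seed values and the --seed flag are assumptions,
# not taken from this README. Verify the flag with `coach --help`.

# Print one coach command per seed for the given level.
seed_commands() {
  local level="$1"
  local seed
  for seed in 0 1 2; do
    echo "coach -p Atari_QR_DQN -lvl ${level} --seed ${seed}"
  done
}

# Dry run: show the commands without launching any training.
seed_commands breakout
seed_commands pong
```

The script only prints the commands, so they can be reviewed before any of them is run.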
### Breakout QR-DQN - single worker
```bash
coach -p Atari_QR_DQN -lvl breakout
```
<img src="breakout_qr_dqn.png" alt="Breakout QR-DQN" width="800"/>
### Pong QR-DQN - single worker
```bash
coach -p Atari_QR_DQN -lvl pong
```
<img src="pong_qr_dqn.png" alt="Pong QR-DQN" width="800"/>