Coach Benchmarks
The following table summarizes the current status of the algorithms implemented in Coach relative to the results reported in their original papers. Detailed results for each algorithm can be seen by clicking on its name.
In all figures, the X axis is the total number of steps (for multi-threaded runs, the number of steps per worker). The Y axis is the average episode reward, smoothed with an averaging window of 100 timesteps.
For each graph, a command line is provided for reproducing its results. These are the results you can expect to get when running the pre-defined presets in Coach.
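The smoothing described above can be sketched with a simple sliding-window average. This is a minimal illustration, not Coach's actual plotting code, and the reward values below are made up:

```python
import numpy as np

def moving_average(rewards, window=100):
    """Average rewards over a sliding window of up to `window` entries.

    For the first `window - 1` points, the average is taken over all
    values seen so far, so the smoothed curve starts at step 0.
    """
    rewards = np.asarray(rewards, dtype=float)
    cumsum = np.cumsum(rewards)
    out = np.empty_like(rewards)
    for i in range(len(rewards)):
        lo = max(0, i - window + 1)
        total = cumsum[i] - (cumsum[lo - 1] if lo > 0 else 0.0)
        out[i] = total / (i - lo + 1)
    return out

# Example: a slowly improving (synthetic) reward signal
rewards = [i * 0.1 for i in range(300)]
smoothed = moving_average(rewards, window=100)
print(round(smoothed[-1], 2))  # average of the last 100 rewards -> 24.95
```

The raw per-episode signal in RL training is typically very noisy, which is why all graphs in this page report the windowed average rather than individual rewards.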
The environments that were used for testing include:
- Atari - Breakout, Pong and Space Invaders
- Mujoco - Inverted Pendulum, Inverted Double Pendulum, Reacher, Hopper, Half Cheetah, Walker 2D, Ant, Swimmer and Humanoid.
- Doom - Basic, Health Gathering (D1: Basic), Health Gathering Supreme (D2: Navigation), Battle (D3: Battle)
- Fetch - Reach, Slide, Push, Pick-and-Place
Summary

Status legend:
- Reproducing the paper's results for some of the environments
- Training, but not reproducing the paper's results
| Algorithm | Status | Environments | Comments |
|---|---|---|---|
| DQN | | Atari | |
| Dueling DDQN | | Atari | |
| Dueling DDQN with PER | | Atari | |
| Bootstrapped DQN | | Atari | |
| QR-DQN | | Atari | |
| A3C | | Atari, Mujoco | |
| ACER | | Atari | |
| Clipped PPO | | Mujoco | |
| DDPG | | Mujoco | |
| NEC | | Atari | |
| HER | | Fetch | |
| DFP | | Doom | Doom Battle was not verified |
Click on each algorithm to see its detailed benchmarking results.