Mirror of https://github.com/gryf/coach.git, synced 2026-01-27 18:45:45 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
| 1 | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min | Q Values/Mean | Q Values/Stdev | Q Values/Max | Q Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 486.0 | 1.0 | 486.0 | 486.0 | 0.5 | 0.0 | |||||||||||||||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 87.0 | 1.0 | 87.0 | 573.0 | 0.5 | 0.0 | |||||||||||||||||||||||||||||||||
| 4 | 3 | 0.0 | 1.0 | 149.0 | 1.0 | 149.0 | 722.0 | 0.5 | 0.0 | |||||||||||||||||||||||||||||||||
| 5 | 4 | 0.0 | 1.0 | 335.0 | 1.0 | 335.0 | 1057.0 | 0.5 | 0.0 | |||||||||||||||||||||||||||||||||
| 6 | 5 | 30.0 | 0.0 | 152.0 | 1.0 | 152.0 | 1209.0 | 0.4985103999999994 | 2.0 | 15.0 | 0.0 | 0.01806015 | 0.031154681 | 0.11256089999999999 | -0.03153646 | 0.040528998 | 0.15495184 | 0.7766681 | 7.3362275999999995e-06 | |||||||||||||||||||||||
| 7 | 6 | 84.0 | 0.0 | 270.0 | 1.0 | 270.0 | 1479.0 | 0.4958643999999982 | 8.0 | 120.0 | 0.0 | 0.04832644 | 0.029433122000000003 | 0.116728835 | -0.013950496000000001 | 0.054705366 | 0.14291170000000003 | 0.70854425 | 0.00038360796 | |||||||||||||||||||||||
| 8 | 7 | 149.0 | 0.0 | 324.0 | 1.0 | 324.0 | 1803.0 | 0.4926891999999968 | 9.0 | 120.0 | 0.0 | 0.06519938 | 0.036996773999999996 | 0.20431875 | -0.0006479138 | 0.09192595 | 0.23194770000000003 | 0.8836251 | 9.27005e-05 | |||||||||||||||||||||||
| 9 | 8 | 197.0 | 0.0 | 237.0 | 1.0 | 237.0 | 2040.0 | 0.4903665999999958 | 6.0 | 70.0 | 0.0 | 0.081993505 | 0.031750474 | 0.17067611 | 0.01893028 | 0.06609978 | 0.17276828 | 0.84518427 | 0.00049587624 | |||||||||||||||||||||||
| 10 | 9 | 231.0 | 0.0 | 171.0 | 1.0 | 171.0 | 2211.0 | 0.4886907999999951 | 0.0 | 0.0 | 0.0 | 0.06561219 | 0.027578448999999998 | 0.16583520000000002 | 0.0023947426000000003 | 0.0045568603 | 0.0040728809999999996 | 0.019272441 | 0.0012047348 | |||||||||||||||||||||||
| 11 | 10 | 352.0 | 0.0 | 604.0 | 1.0 | 604.0 | 2815.0 | 0.4827715999999925 | 16.0 | 240.0 | 0.0 | 0.054065555 | 0.029770117000000002 | 0.14167584 | -0.025165185 | 0.06984026 | 0.19431692 | 0.8947495999999999 | 1.5888494e-05 | |||||||||||||||||||||||
| 12 | 11 | 399.0 | 0.0 | 232.0 | 1.0 | 232.0 | 3047.0 | 0.4804979999999915 | 4.0 | 25.0 | 0.0 | 0.09317397 | 0.037268302999999996 | 0.1879414 | 0.017247636 | 0.04507253 | 0.13425689999999998 | 0.7788928 | 0.0016535529999999999 | |||||||||||||||||||||||
| 13 | 12 | 430.0 | 0.0 | 154.0 | 1.0 | 154.0 | 3201.0 | 0.4789887999999909 | 2.0 | 15.0 | 0.0 | 0.060374584 | 0.026983725 | 0.1327358 | 0.00017325346999999997 | 0.03700112 | 0.15603235 | 0.8584324000000001 | 9.365492e-05 | |||||||||||||||||||||||
| 14 | 13 | 464.0 | 0.0 | 169.0 | 1.0 | 169.0 | 3370.0 | 0.4773325999999902 | 3.0 | 60.0 | 0.0 | 0.07076912 | 0.024960317000000003 | 0.171489 | 0.022848563 | 0.07708849 | 0.23528506 | 0.8999446 | 0.0009268887 | |||||||||||||||||||||||
| 15 | 14 | 502.0 | 0.0 | 189.0 | 1.0 | 189.0 | 3559.0 | 0.4754803999999894 | 4.0 | 50.0 | 0.0 | 0.08175371599999999 | 0.059707563 | 0.23806223 | -0.0022388997 | 0.080376275 | 0.21826938 | 0.9014337 | 0.000156738 | |||||||||||||||||||||||
| 16 | 15 | 530.0 | 0.0 | 138.0 | 1.0 | 138.0 | 3697.0 | 0.4741279999999888 | 1.0 | 25.0 | 0.0 | 0.06588258599999999 | 0.031772457000000004 | 0.16913122 | -0.0033009246000000004 | 0.037671477 | 0.17314273 | 0.92035943 | 0.0008084267599999999 | |||||||||||||||||||||||
| 17 | 16 | 549.0 | 0.0 | 95.0 | 1.0 | 95.0 | 3792.0 | 0.4731969999999884 | 1.0 | 30.0 | 0.0 | 0.08419551 | 0.021721134 | 0.13456126 | 0.042178745999999996 | 0.022773635 | 0.08062003 | 0.35514408 | 0.00152435 | |||||||||||||||||||||||
| 18 | 17 | 630.0 | 0.0 | 404.0 | 1.0 | 404.0 | 4196.0 | 0.4692377999999866 | 9.0 | 75.0 | 0.0 | 0.06739886 | 0.03467486 | 0.14949478 | -0.03615028 | 0.063086614 | 0.17489205 | 0.7053564 | 0.00030611295 | |||||||||||||||||||||||
| 19 | 18 | 714.0 | 0.0 | 420.0 | 1.0 | 420.0 | 4616.0 | 0.4651217999999849 | 10.0 | 160.0 | 0.0 | 0.06760059 | 0.022386358999999998 | 0.1480442 | 0.0114746755 | 0.065966256 | 0.18684588 | 0.90956885 | 0.00010415676 | |||||||||||||||||||||||
| 20 | 19 | 809.0 | 0.0 | 473.0 | 1.0 | 473.0 | 5089.0 | 0.4604863999999829 | 7.0 | 135.0 | 0.0 | 0.057970256 | 0.020290807 | 0.13571687 | 0.012275728 | 0.04331644 | 0.16250839999999997 | 0.8834267 | 0.00020417363000000002 | |||||||||||||||||||||||
| 21 | 20 | 850.0 | 0.0 | 204.0 | 1.0 | 204.0 | 5293.0 | 0.45848719999998205 | 3.0 | 20.0 | 0.0 | 0.07100958 | 0.030987637000000002 | 0.14825977 | 0.011810873000000001 | 0.047851723 | 0.16509958 | 0.8592968000000001 | 0.00032094717999999996 |
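
Two regularities can be read off the columns above: `Total steps` looks like the running sum of `Episode Length`, and `Epsilon` appears to stay at 0.5 during heatup and then fall linearly with the number of steps taken after heatup. The sketch below checks both against a few rows copied from the table. The per-step decrement of `9.8e-6` and the helper names are assumptions inferred from these numbers, not values taken from the repository's configuration.

```python
import math

# (episode_length, total_steps, epsilon, in_heatup) copied from the first table rows
rows = [
    (486, 486, 0.5, 1),
    (87, 573, 0.5, 1),
    (149, 722, 0.5, 1),
    (335, 1057, 0.5, 1),
    (152, 1209, 0.4985103999999994, 0),
    (270, 1479, 0.4958643999999982, 0),
    (324, 1803, 0.4926891999999968, 0),
]


def running_totals(lengths):
    """Return the cumulative sum of the given episode lengths."""
    totals, total = [], 0
    for length in lengths:
        total += length
        totals.append(total)
    return totals


def linear_epsilon(total_steps, heatup_steps, start=0.5, step=9.8e-6):
    """Hypothetical schedule: constant during heatup, then a fixed decrement per step.

    `step` is inferred from the table and is an assumption, not a documented setting.
    """
    return start - step * max(0, total_steps - heatup_steps)


# Heatup ends at the total step count of the last row flagged as heatup.
heatup_steps = max(total for _, total, _, heatup in rows if heatup)

totals_match = running_totals([r[0] for r in rows]) == [r[1] for r in rows]
epsilon_match = all(
    math.isclose(linear_epsilon(total, heatup_steps), eps, abs_tol=1e-9)
    for _, total, eps, _ in rows
)
print(totals_match, epsilon_match)
```

If either check fails on a regenerated trace, the reset ordering or the exploration schedule has probably changed, which is the kind of difference a trace comparison is meant to surface.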