Mirror of https://github.com/gryf/coach.git, synced 2026-02-24 11:15:45 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
6.3 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 486.0 | 486.0 | 486.0 | 486.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 2 | 0.0 | 1.0 | 573.0 | 573.0 | 87.0 | 573.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 3 | 0.0 | 1.0 | 722.0 | 722.0 | 149.0 | 722.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 4 | 0.0 | 1.0 | 1057.0 | 1057.0 | 335.0 | 1057.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 5 | 51.0 | 0.0 | 1260.0 | 1260.0 | 203.0 | 1260.0 | 0.9997990300000044 | 5.0 | 55.0 | 0.0 | | | | 0.011159177685519406 | 0.014670889632016437 | 0.05586982890963554 | 0.00014776474563404918 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.06980593 | 0.04550845 | 0.23672238 | 0.012701003 | | | | |
| 6 | 70.0 | 0.0 | 1335.0 | 1335.0 | 75.0 | 1335.0 | 0.999724780000006 | 2.0 | 15.0 | 0.0 | | | | 0.011363721369937258 | 0.01113743869358625 | 0.02980226650834084 | 0.00037189509021118283 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.061391924 | 0.02929353 | 0.110355645 | 0.014809295 | | | | |
| 7 | 91.0 | 0.0 | 1422.0 | 1422.0 | 87.0 | 1422.0 | 0.9996386500000078 | 1.0 | 15.0 | 0.0 | | | | 0.011624419426966813 | 0.013655546794886331 | 0.04332234337925911 | 0.0002809247234836221 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.065822564 | 0.03866646 | 0.16192491 | 0.021505926 | | | | |
| 8 | 159.0 | 0.0 | 1693.0 | 1693.0 | 271.0 | 1693.0 | 0.9993703600000136 | 5.0 | 55.0 | 0.0 | | | | 0.008774163406046885 | 0.01056874345135505 | 0.042830634862184525 | 0.0001717623672448099 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.05377827 | 0.032895934 | 0.16196515 | 0.00941183 | | | | |
| 9 | 201.0 | 0.0 | 1861.0 | 1861.0 | 168.0 | 1861.0 | 0.9992040400000172 | 3.0 | 50.0 | 0.0 | | | | 0.005785983745660079 | 0.00916921106544553 | 0.04197276383638382 | 0.00021147351071704182 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.04083436 | 0.028236978 | 0.14960715 | 0.012452139499999999 | | | | |
| 10 | 279.0 | 0.0 | 2172.0 | 2172.0 | 311.0 | 2172.0 | 0.998896150000024 | 4.0 | 65.0 | 0.0 | | | | 0.009643888652461987 | 0.012217253908083524 | 0.056894369423389435 | 0.00016667474119458348 | 0.00010000000000000003 | 4.0657581468206416e-20 | 0.0001 | 0.0001 | 0.0560394 | 0.039682023 | 0.1913731 | 0.008496345 | | | | |
| 11 | 440.0 | 0.0 | 2815.0 | 2815.0 | 643.0 | 2815.0 | 0.9982595800000378 | 10.0 | 335.0 | 0.0 | | | | 0.00861433394022231 | 0.010348496102126023 | 0.04432229697704315 | 0.0001522430102340877 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.05220816 | 0.03156275 | 0.16941099999999998 | 0.009055335999999999 | 0.023975812000000003 | 0.012025122 | 0.040999293 | 0.0039341394 |
| 12 | 458.0 | 0.0 | 2888.0 | 2888.0 | 73.0 | 2888.0 | 0.9981873100000394 | 2.0 | 45.0 | 0.0 | | | | 0.009530787179957971 | 0.010701133845478595 | 0.03999783843755722 | 0.00031216014758683736 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.062311728 | 0.026771976 | 0.14921491 | 0.02184285 | | | | |
| 13 | 478.0 | 0.0 | 2969.0 | 2969.0 | 81.0 | 2969.0 | 0.9981071200000412 | 0.0 | 0.0 | 0.0 | | | | 0.006778308925277089 | 0.00818221383451154 | 0.02816874161362648 | 0.0002485926379449665 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.044908725 | 0.026523566000000002 | 0.117372274 | 0.016029207 | -0.012888752 | 0.011372567 | -0.0004688017 | -0.031455092000000004 |
| 14 | 532.0 | 0.0 | 3183.0 | 3183.0 | 214.0 | 3183.0 | 0.9978952600000456 | 4.0 | 50.0 | 0.0 | | | | 0.0069169835669863795 | 0.009669208516406907 | 0.030461043119430545 | 0.0002085747983073816 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.047045134 | 0.03206875 | 0.123922676 | 0.010909475 | | | | |
| 15 | 551.0 | 0.0 | 3262.0 | 3262.0 | 79.0 | 3262.0 | 0.9978170500000474 | 2.0 | 15.0 | 0.0 | | | | 0.008465479291963243 | 0.01222064269286522 | 0.04136095941066742 | 0.00024427170865237713 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.051463115999999996 | 0.036649507000000005 | 0.14754368 | 0.013367546000000001 | | | | |
| 16 | 626.0 | 0.0 | 3560.0 | 3560.0 | 298.0 | 3560.0 | 0.9975220300000538 | 6.0 | 145.0 | 0.0 | | | | 0.00915405042571758 | 0.00990793648084188 | 0.038579311221838 | 0.0002719838812481612 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.055197 | 0.029597567 | 0.14903925 | 0.010181868 | -0.011365135 | 0.013168923 | 0.013799908 | -0.025668386 |
| 17 | 916.0 | 0.0 | 4719.0 | 4719.0 | 1159.0 | 4719.0 | 0.9963746200000788 | 22.0 | 340.0 | 0.0 | | | | 0.00762058505652962 | 0.009529140287309196 | 0.04469184204936028 | 0.00013019611651543528 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.049977854 | 0.031034742999999997 | 0.15861663 | 0.008030025 | 0.016384887 | 0.018671772 | 0.056820348 | -0.0064531965 |
| 18 | 943.0 | 0.0 | 4830.0 | 4830.0 | 111.0 | 4830.0 | 0.9962647300000812 | 3.0 | 45.0 | 0.0 | | | | 0.009809370868390907 | 0.011517441918305206 | 0.04567621275782585 | 0.0002561989240348339 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.058472212 | 0.027904749 | 0.13957499 | 0.021250565 | | | | |
| 19 | 1006.0 | 0.0 | 5081.0 | 5081.0 | 251.0 | 5081.0 | 0.9960162400000864 | 0.0 | 0.0 | 0.0 | | | | 0.011435095637646171 | 0.011009362492248348 | 0.04369494318962097 | 0.0002374518953729421 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.06413252 | 0.03206017 | 0.17680877 | 0.015071165 | | | | |
| 20 | 1062.0 | 0.0 | 5304.0 | 5304.0 | 223.0 | 5304.0 | 0.9957954700000912 | 6.0 | 105.0 | 0.0 | | | | 0.008425145343997948 | 0.010476737255273365 | 0.04278689250349999 | 0.0002318604965694249 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.05528172 | 0.033389688 | 0.16662869 | 0.013080816 | 0.02990371 | 0.01963076 | 0.059646500000000005 | 0.005463021 |
| 21 | 1081.0 | 0.0 | 5379.0 | 5379.0 | 75.0 | 5379.0 | 0.9957212200000928 | 2.0 | 15.0 | 0.0 | | | | 0.007715215652225245 | 0.010669677149893777 | 0.03837666660547257 | 0.00035348522942513233 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.055993587000000004 | 0.030775022000000003 | 0.13761143 | 0.023046900000000002 | | | | |
| 22 | 1125.0 | 0.0 | 5556.0 | 5556.0 | 177.0 | 5556.0 | 0.9955459900000968 | 2.0 | 35.0 | 0.0 | | | | 0.007088124197418272 | 0.007553918383053855 | 0.02735380455851555 | 0.00016635317297186702 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.046250745999999995 | 0.027780753 | 0.12917177 | 0.009924813000000001 | | | | |
| 23 | 1169.0 | 0.0 | 5733.0 | 5733.0 | 177.0 | 5733.0 | 0.9953707600001004 | 3.0 | 30.0 | 0.0 | | | | 0.007776185804686975 | 0.009135069977912141 | 0.03013703227043152 | 0.0002250532270409167 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.051641665 | 0.030256712999999998 | 0.13535246 | 0.012026708 | | | | |
| 24 | 1190.0 | 0.0 | 5815.0 | 5815.0 | 82.0 | 5815.0 | 0.9952895800001024 | 1.0 | 5.0 | 0.0 | | | | 0.011108649816984931 | 0.010548835692936798 | 0.042301006615161896 | 0.00035882263910025364 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.06612081 | 0.03200974 | 0.15909933 | 0.027230294 | | | | |
| 25 | 1212.0 | 0.0 | 5904.0 | 5904.0 | 89.0 | 5904.0 | 0.9952014700001042 | 1.0 | 10.0 | 0.0 | | | | 0.004195408909502227 | 0.007376560022200305 | 0.025709044188261032 | 0.00019116624025627968 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.03334951 | 0.028961857999999997 | 0.10913753 | 0.008333857 | | | | |