Mirror of https://github.com/gryf/coach.git, synced 2026-01-30 12:15:49 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
5.0 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 486.0 | 486.0 | 486.0 | 486.0 | 1.0 |  |  | 0.0 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| 2 | 0.0 | 1.0 | 573.0 | 573.0 | 87.0 | 573.0 | 1.0 |  |  | 0.0 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| 3 | 0.0 | 1.0 | 722.0 | 722.0 | 149.0 | 722.0 | 1.0 |  |  | 0.0 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| 4 | 0.0 | 1.0 | 1057.0 | 1057.0 | 335.0 | 1057.0 | 1.0 |  |  | 0.0 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| 5 | 51.0 | 0.0 | 1260.0 | 1260.0 | 203.0 | 1260.0 | 0.999817300000006 | 5.0 | 55.0 | 0.0 |  |  |  | 0.011153367854907023 | 0.015027035375515356 | 0.0589878261089325 | 8.134254312608391e-05 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.04883443 | 0.041394312 | 0.19476563 | 0.006623836700000001 |  |  |  |  |
| 6 | 70.0 | 0.0 | 1335.0 | 1335.0 | 75.0 | 1335.0 | 0.9997498000000082 | 2.0 | 15.0 | 0.0 |  |  |  | 0.01147703626683276 | 0.011325970191239728 | 0.030307751148939133 | 0.0004829707322642207 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.056145836 | 0.026853915 | 0.10600656 | 0.024553476 |  |  |  |  |
| 7 | 91.0 | 0.0 | 1422.0 | 1422.0 | 87.0 | 1422.0 | 0.9996715000000108 | 1.0 | 15.0 | 0.0 |  |  |  | 0.011548659179381849 | 0.013730809899124057 | 0.043242335319519036 | 0.0003053410327993333 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.059877775999999994 | 0.03687902 | 0.13717306 | 0.020375967 |  |  |  |  |
| 8 | 159.0 | 0.0 | 1693.0 | 1693.0 | 271.0 | 1693.0 | 0.9994276000000188 | 5.0 | 55.0 | 0.0 |  |  |  | 0.008716323934434634 | 0.010617909178954029 | 0.0430009663105011 | 0.00019797885033767668 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.04758673 | 0.03171387 | 0.13769212 | 0.011469088999999998 |  |  |  |  |
| 9 | 201.0 | 0.0 | 1861.0 | 1861.0 | 168.0 | 1861.0 | 0.9992764000000238 | 3.0 | 50.0 | 0.0 |  |  |  | 0.005772166230252921 | 0.009313748713790294 | 0.04269432276487351 | 0.0001282592274947092 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.034678657 | 0.027940277000000003 | 0.1290576 | 0.008978493 |  |  |  |  |
| 10 | 279.0 | 0.0 | 2172.0 | 2172.0 | 311.0 | 2172.0 | 0.9989965000000329 | 4.0 | 65.0 | 0.0 |  |  |  | 0.00966654376334657 | 0.012455670221574441 | 0.05754233151674271 | 6.935953570064156e-05 | 0.00010000000000000003 | 4.0657581468206416e-20 | 0.0001 | 0.0001 | 0.048503987 | 0.037041757 | 0.1568195 | 0.006075088 |  |  |  |  |
| 11 | 406.0 | 0.0 | 2681.0 | 2681.0 | 509.0 | 2681.0 | 0.998538400000048 | 9.0 | 320.0 | 0.0 |  |  |  | 0.008286195846002249 | 0.010238034143861922 | 0.044422760605812066 | 0.00014633702812716365 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.04592845 | 0.030503508 | 0.14186455 | 0.0110354535 | 0.03322521 | 0.015940087 | 0.056609314 | 0.0076417234 |
| 12 | 471.0 | 0.0 | 2941.0 | 2941.0 | 260.0 | 2941.0 | 0.9983044000000558 | 7.0 | 110.0 | 0.0 |  |  |  | 0.009545503650756123 | 0.012811965745234155 | 0.05728550255298615 | 0.00014954243670217693 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.04835386 | 0.03174918 | 0.15212412 | 0.012202092 |  |  |  |  |
| 13 | 506.0 | 0.0 | 3082.0 | 3082.0 | 141.0 | 3082.0 | 0.99817750000006 | 0.0 | 0.0 | 0.0 |  |  |  | 0.007986091597038987 | 0.010227653459912332 | 0.029147621244192123 | 0.00018407590687274933 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.043641995999999995 | 0.03191385 | 0.11431264 | 0.012298575 |  |  |  |  |
| 14 | 569.0 | 0.0 | 3331.0 | 3331.0 | 249.0 | 3331.0 | 0.9979534000000674 | 7.0 | 110.0 | 0.0 |  |  |  | 0.007928447468833427 | 0.009549152818479286 | 0.04228781163692474 | 0.00013459150795824826 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.04425544 | 0.03010041 | 0.13962811 | 0.0104794 |  |  |  |  |
| 15 | 655.0 | 0.0 | 3677.0 | 3677.0 | 346.0 | 3677.0 | 0.9976420000000776 | 0.0 | 0.0 | 0.0 |  |  |  | 0.009116127458434866 | 0.010233341520914018 | 0.03874828293919563 | 0.00016095259343273938 | 9.999999999999998e-05 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.05015296 | 0.030124526000000002 | 0.13494284 | 0.012318888 | 0.036954846 | 0.016541163 | 0.06137629 | 0.0047242693 |
| 16 | 674.0 | 0.0 | 3753.0 | 3753.0 | 76.0 | 3753.0 | 0.99757360000008 | 1.0 | 10.0 | 0.0 |  |  |  | 0.005457259488665793 | 0.006725242317819629 | 0.015414755791425703 | 0.00018984619237016886 | 0.0001 | 0.0 | 0.0001 | 0.0001 | 0.036532152000000005 | 0.022897648 | 0.06911852 | 0.013080008999999998 |  |  |  |  |
| 17 | 723.0 | 0.0 | 3948.0 | 3948.0 | 195.0 | 3948.0 | 0.9973981000000856 | 0.0 | 0.0 | 0.0 |  |  |  | 0.006345583730150366 | 0.01018264769961162 | 0.0397307351231575 | 9.957021393347532e-05 | 0.00010000000000000002 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.03886488 | 0.038184277999999995 | 0.14176919 | 0.006970413000000001 |  |  |  |  |
| 18 | 754.0 | 0.0 | 4073.0 | 4073.0 | 125.0 | 4073.0 | 0.9972856000000894 | 2.0 | 15.0 | 0.0 |  |  |  | 0.006026781925503465 | 0.007350232007724398 | 0.026275455951690674 | 0.00017769451369531453 | 9.999999999999996e-05 | 4.0657581468206416e-20 | 0.0001 | 0.0001 | 0.043348733 | 0.027256972999999997 | 0.12586276 | 0.011062483 |  |  |  |  |
| 19 | 831.0 | 0.0 | 4381.0 | 4381.0 | 308.0 | 4381.0 | 0.9970084000000984 | 4.0 | 90.0 | 0.0 |  |  |  | 0.006534161396810706 | 0.009079722555863916 | 0.039741791784763336 | 7.733125676168129e-05 | 0.00010000000000000003 | 2.7105054312137605e-20 | 0.0001 | 0.0001 | 0.042101186 | 0.03250519 | 0.15925613 | 0.0062198965000000005 |  |  |  |  |
| 20 | 933.0 | 0.0 | 4789.0 | 4789.0 | 408.0 | 4789.0 | 0.9966412000001106 | 5.0 | 35.0 | 0.0 |  |  |  | 0.007308699939826244 | 0.009906021263269732 | 0.05427439138293266 | 0.00011632483074208723 | 0.0001 | 1.3552527156068802e-20 | 0.0001 | 0.0001 | 0.04225955 | 0.031486627 | 0.17579086 | 0.0077312537 | 0.026419535 | 0.01425859 | 0.05756532 | 0.014857713 |