mirror of https://github.com/gryf/coach.git (synced 2026-01-29 19:55:56 +01:00)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
6.3 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 486.0 | 486.0 | 486.0 | 486.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 2 | 0.0 | 1.0 | 573.0 | 573.0 | 87.0 | 573.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 3 | 0.0 | 1.0 | 722.0 | 722.0 | 149.0 | 722.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 4 | 0.0 | 1.0 | 1057.0 | 1057.0 | 335.0 | 1057.0 | 1.0 | 0.0 | | | | | | | | | | | | | | | | | | | | | |
| 5 | 51.0 | 0.0 | 1260.0 | 1260.0 | 203.0 | 1260.0 | 0.9997990300000044 | 5.0 | 55.0 | 0.0 | | | | 0.011007826140734787 | 0.014603481057153478 | 0.05728723481297492 | 0.00017679380835033953 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.05001448 | 0.03673725 | 0.18864875 | 0.012782556 | | | | |
| 6 | 70.0 | 0.0 | 1335.0 | 1335.0 | 75.0 | 1335.0 | 0.999724780000006 | 2.0 | 15.0 | 0.0 | | | | 0.011499390747447154 | 0.010961000063645872 | 0.0315467044711113 | 0.0005441936664283277 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.05759323400000001 | 0.024992667000000003 | 0.11028934 | 0.022131458 | | | | |
| 7 | 91.0 | 0.0 | 1422.0 | 1422.0 | 87.0 | 1422.0 | 0.9996386500000078 | 1.0 | 15.0 | 0.0 | | | | 0.01132884354696476 | 0.013429386203264964 | 0.04364541172981262 | 0.000340703729307279 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.055466358 | 0.03408065 | 0.13973257 | 0.016833907 | | | | |
| 8 | 159.0 | 0.0 | 1693.0 | 1693.0 | 271.0 | 1693.0 | 0.9993703600000136 | 5.0 | 55.0 | 0.0 | | | | 0.008705972996627245 | 0.01046594152738398 | 0.04259247332811356 | 0.00026405457174405456 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.04516968 | 0.027806934 | 0.12673344 | 0.0123512 | | | | |
| 9 | 201.0 | 0.0 | 1861.0 | 1861.0 | 168.0 | 1861.0 | 0.9992040400000172 | 3.0 | 50.0 | 0.0 | | | | 0.005878299186449675 | 0.009437852632040231 | 0.04434124007821083 | 0.00013726821634918449 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.031320102999999995 | 0.02600094 | 0.118447885 | 0.008186511 | | | | |
| 10 | 279.0 | 0.0 | 2172.0 | 2172.0 | 311.0 | 2172.0 | 0.998896150000024 | 4.0 | 65.0 | 0.0 | | | | 0.009727142530699404 | 0.012268936533653787 | 0.05482503771781922 | 0.00013224119902588427 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.045818377 | 0.03322848 | 0.13469987 | 0.006579738000000001 | | | | |
| 11 | 440.0 | 0.0 | 2815.0 | 2815.0 | 643.0 | 2815.0 | 0.9982595800000378 | 10.0 | 335.0 | 0.0 | | | | 0.008651261723684205 | 0.01035218743544492 | 0.0460522323846817 | 0.00012582635099533943 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.042565465 | 0.027683503999999998 | 0.13492820000000003 | 0.0055641527 | 0.010471878 | 0.016772145 | 0.03140713 | -0.011288109 |
| 12 | 458.0 | 0.0 | 2888.0 | 2888.0 | 73.0 | 2888.0 | 0.9981873100000394 | 2.0 | 45.0 | 0.0 | | | | 0.009507577900270313 | 0.010686165551315943 | 0.03972183912992477 | 0.0004233591898810118 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.04721937 | 0.025519945 | 0.11107015 | 0.016729604 | | | | |
| 13 | 478.0 | 0.0 | 2969.0 | 2969.0 | 81.0 | 2969.0 | 0.9981071200000412 | 0.0 | 0.0 | 0.0 | | | | 0.006916543132683728 | 0.008256373671042505 | 0.02793971076607704 | 0.00034270941978320485 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.03785228 | 0.023063693 | 0.090955265 | 0.014428139 | -0.0006935441000000001 | 0.008459795 | 0.012772851000000002 | -0.010573764 |
| 14 | 532.0 | 0.0 | 3183.0 | 3183.0 | 214.0 | 3183.0 | 0.9978952600000456 | 4.0 | 50.0 | 0.0 | | | | 0.0069841391733562975 | 0.009727930525240285 | 0.030958421528339383 | 0.00019982327648904172 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.03517378 | 0.028121071 | 0.09745253599999999 | 0.007200333000000001 | | | | |
| 15 | 551.0 | 0.0 | 3262.0 | 3262.0 | 79.0 | 3262.0 | 0.9978170500000474 | 2.0 | 15.0 | 0.0 | | | | 0.008571460710587226 | 0.012479903064922505 | 0.043056368827819824 | 0.0001790421229088679 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.03861944 | 0.03349347 | 0.12367985 | 0.008007733000000001 | | | | |
| 16 | 626.0 | 0.0 | 3560.0 | 3560.0 | 298.0 | 3560.0 | 0.9975220300000538 | 6.0 | 145.0 | 0.0 | | | | 0.00916061567428138 | 0.00985262607730402 | 0.03863134235143662 | 0.00022459866886492819 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.04501683 | 0.026303627000000003 | 0.10941381 | 0.010795511 | -0.007507985 | 0.015376373999999998 | 0.009678967 | -0.030438615 |
| 17 | 885.0 | 0.0 | 4598.0 | 4598.0 | 1038.0 | 4598.0 | 0.996494410000076 | 12.0 | 210.0 | 0.0 | | | | 0.007690413830856681 | 0.009722273214587109 | 0.04484198987483978 | 0.00014540684060193598 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.040153995 | 0.028107608 | 0.13334101 | 0.007176003 | 0.01752627 | 0.018640606 | 0.05206203 | -0.006777686 |
| 18 | 903.0 | 0.0 | 4668.0 | 4668.0 | 70.0 | 4668.0 | 0.9964251100000776 | 2.0 | 15.0 | 0.0 | | | | 0.008374075151925139 | 0.00685058487549806 | 0.01695725508034229 | 0.00013643733109347522 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.041769784 | 0.025388801000000003 | 0.0693266 | 0.006554181 | | | | |
| 19 | 924.0 | 0.0 | 4754.0 | 4754.0 | 86.0 | 4754.0 | 0.9963399700000796 | 2.0 | 35.0 | 0.0 | | | | 0.006986255206616728 | 0.009138039512187723 | 0.029435122385621067 | 0.00023959197278600186 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.03765729 | 0.027741422999999998 | 0.09483457 | 0.009779892 | | | | |
| 20 | 988.0 | 0.0 | 5007.0 | 5007.0 | 253.0 | 5007.0 | 0.9960895000000848 | 1.0 | 5.0 | 0.0 | | | | 0.008958109939518932 | 0.010352110018017849 | 0.04120930656790733 | 0.00026069089653901756 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.044280123 | 0.027352182000000003 | 0.10559346 | 0.010518369 | | | | |
| 21 | 1009.0 | 0.0 | 5092.0 | 5092.0 | 85.0 | 5092.0 | 0.9960053500000868 | 0.0 | 0.0 | 0.0 | | | | 0.009729027376687596 | 0.007417915393123158 | 0.026410818099975586 | 0.0005171290249563754 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.05305024 | 0.022927448 | 0.11149109 | 0.019090652 | | | | |
| 22 | 1041.0 | 0.0 | 5221.0 | 5221.0 | 129.0 | 5221.0 | 0.9958776400000896 | 1.0 | 10.0 | 0.0 | | | | 0.0065109121378554855 | 0.009705144564042924 | 0.04230054095387459 | 0.0002464384888298809 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.03857564 | 0.029652012999999998 | 0.13166025 | 0.008339129 | 0.022667855 | 0.029822525 | 0.073469676 | -0.01822071 |
| 23 | 1101.0 | 0.0 | 5461.0 | 5461.0 | 240.0 | 5461.0 | 0.9956400400000948 | 6.0 | 85.0 | 0.0 | | | | 0.007464758036803688 | 0.008871554942636414 | 0.04159108921885489 | 9.95906739262864e-05 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.040631982999999997 | 0.027780425 | 0.1273076 | 0.0052087842 | | | | |
| 24 | 1167.0 | 0.0 | 5724.0 | 5724.0 | 263.0 | 5724.0 | 0.9953796700001004 | 1.0 | 25.0 | 0.0 | | | | 0.007049951021944059 | 0.008145597856309148 | 0.029556380584836006 | 0.00024747068528085947 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.03756183 | 0.025077598 | 0.10530927 | 0.010631736000000001 | | | | |
| 25 | 1188.0 | 0.0 | 5808.0 | 5808.0 | 84.0 | 5808.0 | 0.995296510000102 | 0.0 | 0.0 | 0.0 | | | | 0.008849835848585437 | 0.008812960520326749 | 0.028190467506647113 | 0.00018663128139451146 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.04469130599999999 | 0.028627316 | 0.10585028 | 0.008268119 | | | | |