Mirror of https://github.com/gryf/coach.git (synced 2026-01-09 07:14:19 +01:00)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
7.4 KiB
| 1 | Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 0.0 | 1.0 | 269.0 | 269.0 | 269.0 | 269.0 | 7.0 | 0.0 | |||||||||||||||||||||
| 3 | 2 | 0.0 | 1.0 | 531.0 | 531.0 | 262.0 | 531.0 | 8.0 | 0.0 | |||||||||||||||||||||
| 4 | 3 | 0.0 | 1.0 | 654.0 | 654.0 | 123.0 | 654.0 | 0.0 | 0.0 | |||||||||||||||||||||
| 5 | 4 | 0.0 | 1.0 | 1173.0 | 1173.0 | 519.0 | 1173.0 | 2.0 | 0.0 | |||||||||||||||||||||
| 6 | 5 | 100.0 | 0.0 | 1572.0 | 1572.0 | 399.0 | 1572.0 | 8.0 | 10.0 | 310.0 | 0.0 | 0.006223844418564113 | 0.008294042663087463 | 0.03155895695090294 | 9.275647607864812e-05 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.005592896 | 0.0043458864 | 0.015006336 | 0.0010827626 | |||||||
| 7 | 6 | 160.0 | 0.0 | 1812.0 | 1812.0 | 240.0 | 1812.0 | 4.0 | 7.0 | 130.0 | 0.0 | 0.008564803576276366 | 0.010985852447389552 | 0.04601012542843819 | 0.00010247386671835555 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0060127693999999995 | 0.005001401 | 0.020867711 | 0.0012939627 | |||||||
| 8 | 7 | 185.0 | 0.0 | 1914.0 | 1914.0 | 102.0 | 1914.0 | 8.0 | 2.0 | 15.0 | 0.0 | 0.005707002155832015 | 0.009575807855094808 | 0.03112555481493473 | 0.00017016606580000368 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0047362293 | 0.004079403 | 0.014057411 | 0.0019318176000000002 | |||||||
| 9 | 8 | 229.0 | 0.0 | 2090.0 | 2090.0 | 176.0 | 2090.0 | 7.0 | 2.0 | 15.0 | 0.0 | 0.00864216006702072 | 0.013155937701958298 | 0.061171818524599075 | 0.00018360336252953854 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0063004736 | 0.0053449036 | 0.02487476 | 0.0019890803 | |||||||
| 10 | 9 | 244.0 | 0.0 | 2149.0 | 2149.0 | 59.0 | 2149.0 | 5.0 | 2.0 | 15.0 | 0.0 | 0.006669382058524727 | 0.0073984037571708594 | 0.015499015338718891 | 0.0002397242496954277 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.0056597376 | 0.0035718853999999996 | 0.0101093175 | 0.0022593176 | |||||||
| 11 | 10 | 267.0 | 0.0 | 2239.0 | 2239.0 | 90.0 | 2239.0 | 3.0 | 2.0 | 35.0 | 0.0 | 0.0062163660673515714 | 0.008715943338641199 | 0.030871812254190445 | 0.00019301722932141277 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0052448297 | 0.003933181500000001 | 0.013718200000000002 | 0.0018610907000000002 | |||||||
| 12 | 11 | 314.0 | 0.0 | 2430.0 | 2430.0 | 191.0 | 2430.0 | 8.0 | 3.0 | 30.0 | 0.0 | 0.005064984451622722 | 0.009464219831212449 | 0.04566681012511253 | 0.00017265054339077324 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0042512245 | 0.0038818533000000003 | 0.016594999 | 0.0016967895000000002 | 0.027018031 | 0.010441347 | 0.0474995 | 0.016540313 | |||
| 13 | 12 | 334.0 | 0.0 | 2508.0 | 2508.0 | 78.0 | 2508.0 | 0.0 | 2.0 | 45.0 | 0.0 | 0.009793104943941887 | 0.011249256480204521 | 0.030716722831130024 | 0.0001681474968791008 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.006330653 | 0.0047096196 | 0.014098721000000002 | 0.0016581904999999999 | |||||||
| 14 | 13 | 378.0 | 0.0 | 2684.0 | 2684.0 | 176.0 | 2684.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.01011617302819187 | 0.013193202268065932 | 0.04607892408967018 | 0.00014670072414446622 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0067802290000000005 | 0.005344481 | 0.022279864 | 0.0016253429999999998 | 0.027286835 | 0.0073864055 | 0.03329719 | 0.016612418 | |||
| 15 | 14 | 425.0 | 0.0 | 2872.0 | 2872.0 | 188.0 | 2872.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00757620267153896 | 0.010181129044846313 | 0.03055970929563045 | 0.00019950376008637252 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.005839123000000001 | 0.004426101 | 0.01861591 | 0.0021889664000000002 | |||||||
| 16 | 15 | 449.0 | 0.0 | 2967.0 | 2967.0 | 95.0 | 2967.0 | 6.0 | 3.0 | 30.0 | 0.0 | 0.0070000348653896545 | 0.011226164764985254 | 0.04456117376685143 | 0.00021107358043082056 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005659159 | 0.0046324306 | 0.021839907000000002 | 0.0021402768 | |||||||
| 17 | 16 | 469.0 | 0.0 | 3049.0 | 3049.0 | 82.0 | 3049.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.012940272329433357 | 0.009718707321816513 | 0.030684769153594967 | 0.0003000017604790628 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.008273507 | 0.0035852243 | 0.013593527 | 0.0028905305 | |||||||
| 18 | 17 | 533.0 | 0.0 | 3306.0 | 3306.0 | 257.0 | 3306.0 | 1.0 | 5.0 | 55.0 | 0.0 | 0.007539691670444881 | 0.009121751516234046 | 0.03060857020318508 | 0.00022119178902357817 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.00609803 | 0.0035867486 | 0.013320565 | 0.002192039 | |||||||
| 19 | 18 | 585.0 | 0.0 | 3511.0 | 3511.0 | 205.0 | 3511.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.005838182990213253 | 0.007284520898622485 | 0.01573643647134304 | 0.00017755883163772523 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0050161253 | 0.0035143814 | 0.010200171999999999 | 0.0018448817 | |||||||
| 20 | 19 | 632.0 | 0.0 | 3701.0 | 3701.0 | 190.0 | 3701.0 | 5.0 | 2.0 | 25.0 | 0.0 | 0.006030775471020767 | 0.008624205468393242 | 0.030819704756140712 | 0.00014619529247283936 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0049460824 | 0.004009163499999999 | 0.014025801000000001 | 0.0014488354999999998 | |||||||
| 21 | 20 | 680.0 | 0.0 | 3891.0 | 3891.0 | 190.0 | 3891.0 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0062177468741235024 | 0.009185724365931972 | 0.030945468693971637 | 0.00016168373986147344 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0048525543 | 0.0041472296999999995 | 0.014570421 | 0.0015842235999999998 | |||||||
| 22 | 21 | 729.0 | 0.0 | 4090.0 | 4090.0 | 199.0 | 4090.0 | 9.0 | 4.0 | 50.0 | 0.0 | 0.0066686894893717525 | 0.009633275859106637 | 0.03070710971951485 | 0.00016243076242972163 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005217642 | 0.0044929385 | 0.01947369 | 0.00170155 | 0.021158978 | 0.011703089 | 0.040702187 | 0.0043722745 | |||
| 23 | 22 | 804.0 | 0.0 | 4390.0 | 4390.0 | 300.0 | 4390.0 | 9.0 | 5.0 | 60.0 | 0.0 | 0.007745159575574075 | 0.010264969595287615 | 0.0457281582057476 | 0.0001735425612423569 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.005858070999999999 | 0.004608556 | 0.02206757 | 0.0016346257 | |||||||
| 24 | 23 | 862.0 | 0.0 | 4619.0 | 4619.0 | 229.0 | 4619.0 | 6.0 | 3.0 | 30.0 | 0.0 | 0.0073957111507249795 | 0.010238454419935428 | 0.045198917388916016 | 0.00018422666471451518 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005702048 | 0.0044779684 | 0.019911936 | 0.0017970852 | |||||||
| 25 | 24 | 882.0 | 0.0 | 4699.0 | 4699.0 | 80.0 | 4699.0 | 6.0 | 0.0 | 0.0 | 0.0 | 0.007019620326900622 | 0.010097406329511898 | 0.030556553974747658 | 0.00021815481886733326 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0058081313 | 0.0048652836 | 0.019417763 | 0.0023302154 | |||||||
| 26 | 25 | 945.0 | 0.0 | 4951.0 | 4951.0 | 252.0 | 4951.0 | 5.0 | 4.0 | 50.0 | 0.0 | 0.005672615208563262 | 0.007662691003304536 | 0.02989559806883335 | 0.00017223272880073634 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.004905078 | 0.0036695688 | 0.013604815 | 0.0018130213 | 0.031090358 | 0.004259917 | 0.037605517000000005 | 0.025178626000000003 | |||
| 27 | 26 | 995.0 | 0.0 | 5152.0 | 5152.0 | 201.0 | 5152.0 | 8.0 | 2.0 | 15.0 | 0.0 | 0.006299167421530001 | 0.008046825175071413 | 0.030643418431282043 | 0.00017975828086491674 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005176335 | 0.003915025999999999 | 0.014020402 | 0.0017273452 | 0.026928194 | 0.009660842 | 0.041887067 | 0.010047999 | |||
| 28 | 27 | 1017.0 | 0.0 | 5242.0 | 5242.0 | 90.0 | 5242.0 | 1.0 | 2.0 | 35.0 | 0.0 | 0.005731220244011969 | 0.008559611287002292 | 0.03012747503817081 | 0.00019501463975757358 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0051502184999999995 | 0.004566707 | 0.01945674 | 0.0021261019 | 0.02149797 | 0.010033373 | 0.03370452 | 0.008069546 | |||
| 29 | 28 | 1072.0 | 0.0 | 5462.0 | 5462.0 | 220.0 | 5462.0 | 3.0 | 2.0 | 55.0 | 0.0 | 0.008864355713336004 | 0.011839999879392651 | 0.046134039759635925 | 0.0001610755716683343 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.0062677735 | 0.0050042396999999995 | 0.020127073 | 0.0017192016 | 0.02061991 | 0.0075885756 | 0.034549624 | 0.010614768 | |||
| 30 | 29 | 1118.0 | 0.0 | 5646.0 | 5646.0 | 184.0 | 5646.0 | 4.0 | 3.0 | 20.0 | 0.0 | 0.0072352889821761185 | 0.008737181206568531 | 0.030466778203845024 | 0.00018102813919540492 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005792571999999999 | 0.004318339 | 0.019932609 | 0.0020496019 | |||||||
| 31 | 30 | 1175.0 | 0.0 | 5874.0 | 5874.0 | 228.0 | 5874.0 | 4.0 | 5.0 | 85.0 | 0.0 | 0.008313204468972149 | 0.011042285375825469 | 0.045756932348012924 | 0.00019610222079791129 | 0.00025 | 0.0 | 0.00025 | 0.00025 | 0.005883634 | 0.0046995464 | 0.022168385 | 0.001801039 | 0.0073951124 | 0.010906539 | 0.025537572999999997 | -0.0070570237 |
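
A trace like the one above is a plain CSV in which many cells are empty, because not every signal is recorded for every episode. The following is a minimal sketch of how two such traces could be loaded and compared within a tolerance. It is not Coach's own trace test: the helper names `load_trace` and `traces_match`, the tolerance values, and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io
import math


def load_trace(text):
    """Parse CSV text into a list of dicts; empty or missing cells become None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append(
            {k: (float(v) if v not in ("", None) else None) for k, v in record.items()}
        )
    return rows


def traces_match(a, b, rel_tol=1e-6):
    """Return True when both traces have the same shape and numerically close values."""
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if row_a.keys() != row_b.keys():
            return False
        for key in row_a:
            x, y = row_a[key], row_b[key]
            if x is None or y is None:
                # A missing value only matches another missing value.
                if x is not y:
                    return False
            elif not math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12):
                return False
    return True


# Illustrative sample only (made-up values, not taken from the trace above).
SAMPLE = "Episode #,Training Iter,Loss/Mean\n1,0.0,\n2,100.0,0.0062\n"
trace = load_trace(SAMPLE)
```

Treating empty cells as `None` rather than `0.0` keeps "signal not recorded" distinct from "signal recorded as zero", which matters for columns such as the loss statistics that are blank during heatup.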