mirror of https://github.com/gryf/coach.git synced 2026-03-23 11:03:32 +01:00

Itaicaspi/episode reset refactoring (#105)

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
Itai Caspi
2018-09-04 15:07:54 +03:00
committed by GitHub
parent 7086492127
commit 72a1d9d426
92 changed files with 9803 additions and 9740 deletions

@@ -5,27 +5,22 @@ Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,To
4,0.0,1.0,187.0,1.0,187.0,646.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,0.0,1.0,86.0,1.0,86.0,732.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
6,0.0,1.0,331.0,1.0,331.0,1063.0,0.0,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
7,15.0,0.0,283.0,1.0,283.0,1346.0,0.0,4.0,60.0,0.0,,,,,,,,,,,,0.19038728,0.37413179999999996,1.085708,0.0005235527,1.7916272,5.3056545e-05,1.7917521,1.7915139,0.7594709227977354,0.7875483274351753,2.654857397079468,-0.01272570714354515,0.01685313,0.026051456,0.06775641,-0.010952383999999999,0.15764348,0.3329968,1.0256157,1.0381776e-06,0.39576545,0.7969487,2.2923334,-0.015413323999999999
8,28.0,0.0,260.0,1.0,260.0,1606.0,0.0,1.0,5.0,0.0,,,,,,,,,,,,0.04208376,0.116426505,0.44519386,0.005088003,1.7914166,9.428363000000001e-05,1.7916368000000003,1.791266,0.03226857285548017,0.1964720915841794,0.985376238822937,-0.03431656211614609,0.10270161,0.014737441,0.123610094,0.014027433,0.02014357,0.06890331,0.25883156,0.00014790757,0.050990395,0.24849062,0.9112610999999999,-0.04358855
9,41.0,0.0,258.0,1.0,258.0,1864.0,0.0,3.0,50.0,0.0,,,,,,,,,,,,0.15317139,0.29834458,0.88614666,0.0067990692000000005,1.7912786,0.00014624497,1.7915285,1.7909831000000003,0.11791496871469112,0.3335888662422885,0.9964320659637452,-0.11552873998880385,0.14228675,0.02709809,0.19294438,0.023462784,0.06762212,0.14083381,0.3970459,0.0002949245,0.21089181,0.5367635999999999,1.4676753999999999,-0.16865353
10,60.0,0.0,379.0,1.0,379.0,2243.0,0.0,2.0,15.0,0.0,,,,,,,,,,,,0.13247547,0.28383696,0.9851553,0.017279061999999998,1.7896857,0.0005040495,1.7909534,1.7888476999999998,0.03593501462428658,0.25046052313640216,0.9846256971359252,-0.20018021762371066,0.30096403,0.05548007,0.4147089,0.039941877,0.039800237999999995,0.10806426400000001,0.35698512,0.0011615867999999999,0.06293697,0.38049793,1.1996851,-0.2838259
11,68.0,0.0,155.0,1.0,155.0,2398.0,0.0,4.0,90.0,0.0,,,,,,,,,,,,0.366822,0.27297008,0.85500246,0.056615636,1.7862365000000002,0.0014941338,1.7889153999999998,1.7834325,0.07147985940459389,0.3777481010142403,0.9432406425476074,-0.4365565776824951,0.5416776,0.08939341,0.6384013000000001,0.08678746,0.120208606,0.1047605,0.29189906,0.005644179399999999,0.12151067,0.40066546,0.74686474,-0.6290473000000001
12,72.0,0.0,63.0,1.0,63.0,2461.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.380458,0.5025457,1.2508503,0.0845396,1.7842273000000002,0.00016402099999999998,1.7845758000000003,1.7839825,-0.15644778052965802,0.19377771679160247,-0.006263315677642822,-0.6633321642875671,0.66913104,0.10687629,0.72362936,0.13550685,0.043720815,0.057323013,0.143007,0.010515749,-0.31537065,0.3354341,-0.11016506,-0.8962283999999999
13,92.0,0.0,395.0,1.0,395.0,2856.0,0.0,7.0,80.0,0.0,,,,,,,,,,,,0.7341813,0.8931361999999999,3.4697776,0.08773655,1.7756039000000001,0.007125674,1.7904316999999998,1.7572613999999998,0.0531010530869581,0.35287025941667083,0.9526851177215576,-0.43425774574279785,1.2777001000000001,0.4850268,2.382248,0.14852436,0.15513632,0.17681737,0.525007,0.012881841000000002,0.07083849,0.47545427,1.1537104,-0.4574643
14,99.0,0.0,136.0,1.0,136.0,2992.0,0.0,3.0,35.0,0.0,,,,,,,,,,,,2.3274047,3.6074872000000004,11.039786,0.282328,1.7519287000000001,0.004642716,1.7601723999999999,1.7432127,-0.3301314194997152,0.7044738896059562,1.4541268348693848,-2.5556342601776123,2.4971607000000002,0.26111710000000005,2.6487439999999998,0.5862286999999999,0.53202635,0.7275212,2.252977,0.09076643,-0.6792965999999999,1.2040575,0.35570467,-3.540179
15,103.0,0.0,72.0,1.0,72.0,3064.0,0.0,2.0,40.0,0.0,,,,,,,,,,,,4.007548,3.65184,10.018075,0.15519248,1.7376648999999997,0.0027116186,1.7455993000000003,1.7320668000000001,-0.2714320342791708,0.9907989894898622,0.8220000267028809,-2.4204037189483643,2.1503487,0.2635579,2.4204037,0.5985670999999999,0.77588123,0.7991889,2.1323647,0.06468646,-0.646171,1.7660096999999997,0.74871886,-3.6078562999999995
16,116.0,0.0,258.0,1.0,258.0,3322.0,0.0,5.0,75.0,0.0,,,,,,,,,,,,1.6523731,2.4698055,8.083523,0.13003878,1.7277107,0.0052154507,1.7423832,1.7133006999999998,-0.039007020375085265,0.6065334628968307,1.6601608991622925,-2.028516292572021,2.0747602,0.20532979,2.3914117999999998,0.51776755,0.3199724,0.44089985,1.3816912000000001,0.045208815,-0.14023370000000002,1.0421013000000001,1.9057945,-2.6204767
17,126.0,0.0,187.0,1.0,187.0,3509.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.92402875,2.2525914,7.679933500000001,0.10234933,1.6913862000000002,0.00811164,1.7129078000000002,1.6787053,-0.3290705545382066,0.4870754997377705,-0.00916290283203125,-2.0534813404083248,1.8546785,0.17576595,2.0647898000000002,0.58214456,0.20426781,0.49021757,1.6748041,0.0308303,-0.6092503,0.87516916,-0.23161256,-3.2324522
18,142.0,0.0,304.0,1.0,304.0,3813.0,0.0,8.0,135.0,0.0,,,,,,,,,,,,1.0899202,1.2234477,4.604091599999999,0.09670285,1.6710208999999998,0.017265387,1.7212131000000002,1.6353879999999998,0.038843880938396386,0.4414781879982783,1.6555359363555908,-1.5750807523727417,1.5963658,0.12633687,1.7319156999999998,0.49549833,0.20850252,0.22034019,0.7669623999999999,0.024377807999999997,-0.022250907,0.6641187,1.1199272,-1.9804106
19,154.0,0.0,229.0,1.0,229.0,4042.0,0.0,3.0,30.0,0.0,,,,,,,,,,,,1.2594553,1.3803116000000002,3.5145763999999997,0.16982433,1.5893492,0.020280475,1.7202836,1.5471966000000001,0.018158039344208586,0.4246180314783785,0.9802966117858888,-1.4079413414001465,1.9309628,0.13179472,2.00963,0.55298984,0.16962524,0.19235095,0.5845739000000001,0.036804248,-0.12052882,0.73935986,1.0234607,-1.7943251
20,159.0,0.0,94.0,1.0,94.0,4136.0,0.0,2.0,45.0,0.0,,,,,,,,,,,,2.433933,2.4023757000000003,6.9298470000000005,0.361035,1.5338767,0.009024621,1.5852808,1.5249511,-0.07727590973457593,0.6763922700521872,0.9142348766326904,-2.0492594242095947,1.9556553,0.17559198,2.0944297,0.6989957,0.43858927,0.53540176,1.4796344,0.038060218,-0.40701088,1.2666613000000002,0.9898828000000002,-2.7505924999999998
21,167.0,0.0,160.0,1.0,160.0,4296.0,0.0,1.0,30.0,0.0,,,,,,,,,,,,1.1918872999999999,2.3961012000000004,7.528841000000001,0.21041411,1.5789443,0.032899573,1.6285396,1.5285223999999997,-0.3033659016931212,0.4753093769161661,0.6735560894012451,-1.957652568817139,1.7269223999999999,0.17983046,1.9576526,0.65285933,0.23161101,0.49035895,1.5254883,0.02308034,-0.5342293,0.74790174,-0.17959394,-2.5103128
22,177.0,0.0,196.0,1.0,196.0,4492.0,0.0,6.0,50.0,0.0,,,,,,,,,,,,1.9176056000000001,1.4854782,5.182706,0.1296832,1.600741,0.007108025699999999,1.6213428,1.5850583,0.13868192747235294,0.6509546583268273,0.9652938842773438,-1.4699276685714722,1.5411451999999999,0.12013785,1.6934890999999999,0.48539475,0.29114074,0.25261027,0.9247953999999999,0.02226338,0.20969613,0.9626545,1.3314689,-2.1836848
23,181.0,0.0,64.0,1.0,64.0,4556.0,0.0,2.0,15.0,0.0,,,,,,,,,,,,1.7134898,1.7741624,4.7404027,0.31171557,1.5784401000000001,0.0042786077,1.5924774,1.5698988,-0.2495364295808893,0.7244287028836092,0.8197289705276489,-1.6387592554092407,1.5626915000000001,0.14662078,1.6555089,0.5695728,0.39968159999999997,0.42479333,1.1206775,0.023818979,-0.60248417,1.3273264,0.47739142,-2.8594353
24,190.0,0.0,173.0,1.0,173.0,4729.0,0.0,2.0,35.0,0.0,,,,,,,,,,,,0.86420363,1.1505579,3.1139815,0.21101475,1.5583048000000002,0.017154397,1.6805171000000003,1.543207,-0.04846956090229314,0.3673147151981751,1.5572770833969116,-1.3467092514038086,1.4270504,0.10279469999999999,1.5661808000000002,0.55668885,0.13428058,0.21433474,0.55084944,0.018585313,-0.23329782,0.7188769,1.2898462,-1.7598825
25,205.0,0.0,287.0,1.0,287.0,5016.0,0.0,1.0,15.0,0.0,,,,,,,,,,,,0.62027776,1.1781626,5.0210605,0.19690849,1.514342,0.03201414,1.5717763,1.4700277,-0.17472102851591012,0.2584962285741155,0.7259573936462402,-1.389548659324646,1.372163,0.19543384,1.6445271000000001,0.49026695,0.07691100000000001,0.1892716,0.77725166,0.012039252,-0.2952007,0.3808548,-0.022987355,-1.7028612
26,213.0,0.0,148.0,1.0,148.0,5164.0,0.0,1.0,10.0,0.0,,,,,,,,,,,,0.87788093,1.1205411,3.2541595,0.16302086,1.6190518999999999,0.027872879,1.6517131,1.5822551,-0.02111946441689316,0.3623434091799364,0.9680911898612976,-1.0958757400512695,0.85739833,0.11593313,1.0958757,0.43242526,0.10100008,0.15961096,0.4262519,0.0066352487,-0.1389226,0.54876155,0.983246,-1.2080108999999999
27,226.0,0.0,252.0,1.0,252.0,5416.0,0.0,7.0,80.0,0.0,,,,,,,,,,,,0.5659744000000001,0.5525852,1.7838180000000001,0.114572525,1.6545971999999998,0.01972044,1.6865756999999997,1.6199006,0.12287720557182065,0.4328559747238113,1.8018977642059328,-0.7886770963668823,0.6744036999999999,0.0695462,0.78889847,0.3162281,0.11731172,0.14342302,0.47155509999999995,0.0039760494,0.15986028,0.57722294,1.304932,-1.0839018999999999
28,235.0,0.0,172.0,1.0,172.0,5588.0,0.0,3.0,30.0,0.0,,,,,,,,,,,,0.5754373,0.53583986,1.652655,0.20605606,1.6293853999999999,0.009444112,1.6664357,1.6154865,0.022745465989722758,0.37979156365269456,0.9582107663154602,-0.7535422444343567,0.760279,0.04507102,0.8029374,0.30640987,0.08912065,0.09862131,0.26244953,0.007173898000000001,-0.031205913,0.53349096,0.8757472,-1.2156468999999999
29,247.0,0.0,222.0,1.0,222.0,5810.0,0.0,5.0,50.0,0.0,,,,,,,,,,,,0.58918446,0.6646291999999999,2.1683297,0.1408544,1.6602775,0.023379487999999997,1.6966393,1.6148347,0.06360590496453745,0.4046164354742885,1.7265342473983765,-0.8571033477783203,0.60714984,0.087587655,0.85710335,0.35827366,0.09666201,0.123577625,0.3676509,0.00446005,0.06519471,0.5444627,1.2408435,-1.1376678
30,257.0,0.0,194.0,1.0,194.0,6004.0,0.0,2.0,40.0,0.0,,,,,,,,,,,,0.39171672,0.37535542,1.2922283,0.14345266,1.6632646,0.020220451,1.7691753000000001,1.6358597,0.057352830338609086,0.3057737359186611,0.9528021216392516,-0.3989834487438202,0.5558522,0.04560928,0.62815917,0.24490662,0.05019971,0.07826413,0.22374696,0.0043086787,0.024740081,0.38840258,0.7919935,-0.59568876
7,37.0,0.0,753.0,1.0,753.0,1816.0,0.0,18.0,275.0,0.0,,,,,,,,,,,,0.19276376,0.24904153,0.8257747,0.00013455142,1.7914727,0.00030057304,1.7917566000000003,1.7904335,0.2119369695934143,0.4029896256601249,1.8961015939712524,-0.038109242916107185,0.059407155999999996,0.06025871,0.20562454,-0.0059952493999999995,0.10879397,0.13543734,0.42884704,1.0928144400000001e-07,0.37946862,0.4819447,1.3135536,-0.039071497000000004
8,43.0,0.0,107.0,1.0,107.0,1923.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.015239561,0.0024996148,0.019516254,0.011843238999999998,1.7906417000000001,0.0002779353,1.7917081999999998,1.790199,-0.027609226256608964,0.013555497071717059,-0.002493098378181457,-0.054812923073768616,0.23526458,0.013103152,0.25880286,0.2149402,0.0010658596000000002,4.0590476e-05,0.0011093322,0.0010099161,-0.04934317,0.011472803,-0.039747406,-0.06998761
9,47.0,0.0,73.0,1.0,73.0,1996.0,0.0,1.0,25.0,0.0,,,,,,,,,,,,0.36924744,0.48345366,1.0529541999999998,0.027382427999999997,1.7904778000000001,0.00029410556,1.7917029999999998,1.7902383999999998,0.2182698796192805,0.4074832854303497,0.9812830686569214,-0.05661928653717041,0.29025623,0.01401644,0.3092764,0.27106556,0.12317172,0.1714658,0.36566097,0.0019109361999999999,0.39213285,0.6270856,1.2789414,-0.0569774
10,60.0,0.0,251.0,1.0,251.0,2247.0,0.0,4.0,30.0,0.0,,,,,,,,,,,,0.43863640000000004,0.85959023,2.5217259999999997,0.035312783,1.7856493000000002,0.0015894786999999999,1.7916044999999998,1.7811941999999998,0.11047004585464797,0.4511292058995869,1.8077605962753296,-0.14487385749816895,0.61056226,0.07753438,0.79911625,0.4961661,0.1471402,0.31318584,0.8826119,0.0050534373,0.1947745,0.6780483,1.7730703,-0.1461431
11,66.0,0.0,121.0,1.0,121.0,2368.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.14871785,0.02547766,0.19108213,0.12211705,1.7780668000000002,0.0027863213,1.7913043000000002,1.7759546000000002,-0.10607111632823944,0.05572927975960492,-0.009624600410461426,-0.19753050804138186,1.0220673,0.034096994,1.0672632,0.95422196,0.021255326,0.0012313497,0.023416747999999998,0.019953651,-0.18968049,0.016377756,-0.16716708,-0.21171494
12,71.0,0.0,99.0,1.0,99.0,2467.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.15555012,0.04610389,0.23511228,0.12260077,1.7795043000000001,0.0023333197,1.7912293999999997,1.7773554,-0.09951297342777253,0.055938959522401716,-0.003077983856201172,-0.1911371946334839,1.0086769,0.036691166000000004,1.0868968,0.96416116,0.02108366,0.0023025707,0.02351839,0.018744798,-0.17643596,0.02231863,-0.14677188,-0.20941082
13,98.0,0.0,534.0,1.0,534.0,3001.0,0.0,13.0,340.0,0.0,,,,,,,,,,,,0.7437658,1.1363528999999999,3.4128056000000004,0.025902914,1.7773186,0.003927723,1.7911369,1.7642651000000003,0.1266731635882304,0.41158095902457936,1.7730363607406616,-0.30260801315307617,0.9577929000000001,0.23765793,1.6504846999999998,0.7122954,0.15221074,0.20607093,0.757879,0.008616385,0.22623166,0.6171821,1.5555726,-0.30132666
14,102.0,0.0,73.0,1.0,73.0,3074.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.21133159,0.027693717000000003,0.24578243,0.17797336,1.7719979,0.004124344,1.7904103,1.7675488999999998,-0.18784045577049252,0.09607193868630864,-0.01820123195648193,-0.3348844051361084,1.8157914000000002,0.008718997,1.8383793000000002,1.7924569,0.056422543,0.0016911370000000001,0.05787156,0.054050256,-0.33337316,0.013935573,-0.3163426,-0.35047740000000005
15,110.0,0.0,156.0,1.0,156.0,3230.0,0.0,2.0,35.0,0.0,,,,,,,,,,,,0.62405765,0.9592649999999999,2.9572167,0.12969549,1.7828823,0.0024880429999999997,1.7908226999999999,1.7776691000000002,-0.06490173169544765,0.3036005258432666,1.5683243274688718,-0.3259851932525635,1.7211465,0.029713133,1.7807939999999998,1.678612,0.12291474,0.17745444,0.5574187,0.04589054,-0.112290025,0.43601707,0.9523574,-0.32423943
16,122.0,0.0,238.0,1.0,238.0,3468.0,0.0,1.0,10.0,0.0,,,,,,,,,,,,0.15554756,0.054277197,0.29741687,0.08846552,1.7868587,0.0007967119500000001,1.7910342000000001,1.7857143,-0.1573395555669611,0.12615047772660626,0.7546746730804443,-0.3632258176803589,1.6729587,0.17595315,1.9944772000000002,1.502953,0.05232538,0.035234198,0.15977861,0.03124919,-0.28081256,0.079151474,-0.07227526599999999,-0.35482806
17,132.0,0.0,185.0,1.0,185.0,3653.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.08596161,0.024157293,0.13088742,0.04359092,1.7865810000000002,0.0008728278999999999,1.7914173999999998,1.7849907999999999,-0.1145426501830419,0.06364297653654805,-0.008004307746887207,-0.26171672344207764,1.1001204,0.17392394,1.4450287,0.87743527,0.016055183,0.005272877,0.027343987,0.009870462,-0.20479403,0.033908524,-0.15814927,-0.26577490000000004
18,156.0,0.0,469.0,1.0,469.0,4122.0,0.0,7.0,300.0,0.0,,,,,,,,,,,,0.44079409999999997,0.6836350999999999,2.4870617000000004,0.025862668,1.7806060000000001,0.0046418053,1.7915317,1.7732808999999998,0.062453726452329876,0.3429879814155853,0.985063374042511,-0.20165133476257324,0.92002445,0.10120275599999999,1.1328447,0.7651458999999999,0.091498,0.13105219999999998,0.42509866,0.007523017,0.11129719,0.47185699999999997,1.4228208,-0.21573834
19,170.0,0.0,276.0,1.0,276.0,4398.0,0.0,3.0,20.0,0.0,,,,,,,,,,,,0.40178663,0.6728098000000001,2.4094729999999998,0.039225254,1.7645447,0.0042648454999999995,1.790634,1.7581539000000002,-0.00800227591624627,0.3189828889920542,1.685505986213684,-0.2216334342956543,1.1304478999999998,0.038905688,1.2024312,1.0620688999999999,0.0786909,0.1614253,0.58035713,0.012360237,-0.011329556999999999,0.4500582,1.2747773999999998,-0.24353555
20,191.0,0.0,420.0,1.0,420.0,4818.0,0.0,7.0,315.0,0.0,,,,,,,,,,,,0.36499518,0.3705313,1.1575621,0.04431529,1.7636738,0.004579304,1.7901049,1.7550941999999998,0.021820819079875944,0.3071232093063115,0.947787582874298,-0.2100141644477844,0.9154280999999999,0.09096202,1.1273426000000002,0.75912726,0.07179129,0.09326242,0.24904189999999998,0.0062419563,0.040330485,0.34031707,0.7956849,-0.22417025
21,197.0,0.0,104.0,1.0,104.0,4922.0,0.0,1.0,30.0,0.0,,,,,,,,,,,,0.5144350000000001,0.718648,1.9511981,0.12453204400000001,1.761423,0.0056095775,1.7902006999999998,1.7571856,0.05070284008979797,0.3168600697896934,0.925861954689026,-0.16295456886291504,0.88745534,0.012690918,0.93781537,0.8751633,0.06495949599999999,0.11300202,0.2909634,0.008293057,0.078482285,0.45015374,0.97864294,-0.15680452
22,203.0,0.0,113.0,1.0,113.0,5035.0,0.0,2.0,15.0,0.0,,,,,,,,,,,,0.64118826,0.86793905,2.3767917,0.18172713,1.7537029999999998,0.006172906,1.7896771,1.749237,0.08747740507125855,0.4437113729234238,1.7083609104156494,-0.18463540077209475,0.9569305,0.024823021,1.0054495,0.92341274,0.12694135,0.23306239999999998,0.5930651,0.009591491,0.15546985,0.65049607,1.4562124,-0.19140014
23,212.0,0.0,162.0,1.0,162.0,5197.0,0.0,1.0,5.0,0.0,,,,,,,,,,,,0.16368598,0.041053284,0.22597034,0.10290782,1.7606709999999999,0.004070151,1.7898455000000002,1.7559046999999999,-0.06407594718039036,0.13353017492122013,0.855490505695343,-0.17702490091323853,0.83308256,0.035228863,0.9194645,0.7807996,0.020336542,0.032399733,0.10603105,0.007155789,-0.11351252,0.08716212,0.11233470599999999,-0.18262672
24,216.0,0.0,69.0,1.0,69.0,5266.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.18022996,0.014585165,0.19933702,0.16394733,1.7644328999999999,0.005773647,1.7902433999999998,1.7603636000000003,-0.07377323607603709,0.04112314147691277,-0.0068422555923461905,-0.1422249674797058,0.7145154,0.026194045,0.7514143,0.6838747,0.006449179300000001,0.00046015754999999996,0.006775117,0.0057984180000000005,-0.12904127,0.008062066,-0.118956625,-0.13869014
25,238.0,0.0,432.0,1.0,432.0,5698.0,0.0,6.0,85.0,0.0,,,,,,,,,,,,0.33006436,0.35617912,1.3067303999999997,0.06890332,1.7717101999999998,0.004106822,1.7902195,1.7627416,0.07911585221687953,0.3249530771268533,0.9620689153671264,-0.13253813982009888,0.5261719,0.06930914,0.7088074000000001,0.44418199999999997,0.06108107,0.09490778,0.29277053,0.0023036576,0.140708,0.41082802,1.2361767,-0.12985662
Full column header of the trace CSV (46 columns):
Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Advantages/Mean,Advantages/Stdev,Advantages/Max,Advantages/Min,Values/Mean,Values/Stdev,Values/Max,Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min,Policy Loss/Mean,Policy Loss/Stdev,Policy Loss/Max,Policy Loss/Min
26 213.0 0.0 148.0 1.0 148.0 5164.0 0.0 1.0 10.0 0.0 0.87788093 1.1205411 3.2541595 0.16302086 1.6190518999999999 0.027872879 1.6517131 1.5822551 -0.02111946441689316 0.3623434091799364 0.9680911898612976 -1.0958757400512695 0.85739833 0.11593313 1.0958757 0.43242526 0.10100008 0.15961096 0.4262519 0.0066352487 -0.1389226 0.54876155 0.983246 -1.2080108999999999
27 226.0 0.0 252.0 1.0 252.0 5416.0 0.0 7.0 80.0 0.0 0.5659744000000001 0.5525852 1.7838180000000001 0.114572525 1.6545971999999998 0.01972044 1.6865756999999997 1.6199006 0.12287720557182065 0.4328559747238113 1.8018977642059328 -0.7886770963668823 0.6744036999999999 0.0695462 0.78889847 0.3162281 0.11731172 0.14342302 0.47155509999999995 0.0039760494 0.15986028 0.57722294 1.304932 -1.0839018999999999
28 235.0 0.0 172.0 1.0 172.0 5588.0 0.0 3.0 30.0 0.0 0.5754373 0.53583986 1.652655 0.20605606 1.6293853999999999 0.009444112 1.6664357 1.6154865 0.022745465989722758 0.37979156365269456 0.9582107663154602 -0.7535422444343567 0.760279 0.04507102 0.8029374 0.30640987 0.08912065 0.09862131 0.26244953 0.007173898000000001 -0.031205913 0.53349096 0.8757472 -1.2156468999999999
29 247.0 0.0 222.0 1.0 222.0 5810.0 0.0 5.0 50.0 0.0 0.58918446 0.6646291999999999 2.1683297 0.1408544 1.6602775 0.023379487999999997 1.6966393 1.6148347 0.06360590496453745 0.4046164354742885 1.7265342473983765 -0.8571033477783203 0.60714984 0.087587655 0.85710335 0.35827366 0.09666201 0.123577625 0.3676509 0.00446005 0.06519471 0.5444627 1.2408435 -1.1376678
30 257.0 0.0 194.0 1.0 194.0 6004.0 0.0 2.0 40.0 0.0 0.39171672 0.37535542 1.2922283 0.14345266 1.6632646 0.020220451 1.7691753000000001 1.6358597 0.057352830338609086 0.3057737359186611 0.9528021216392516 -0.3989834487438202 0.5558522 0.04560928 0.62815917 0.24490662 0.05019971 0.07826413 0.22374696 0.0043086787 0.024740081 0.38840258 0.7919935 -0.59568876