mirror of https://github.com/gryf/coach.git (synced 2026-04-17 21:03:32 +02:00)
Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
@@ -1,6 +1,6 @@
Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,210.0,0.0,1958.0,1958.0,841.0,1958.0,0.999167410000018,-20.0,-20.0,0.0,,,,0.011756908099604993,0.01245646310720048,0.05387234315276146,0.00010689756891224532,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.057962038,0.04616896400000001,0.26208854,0.0071766186,,,,
3,402.0,0.0,2726.0,2726.0,768.0,2726.0,0.9984070900000346,-21.0,-21.0,0.0,,,,0.012809355009267165,0.013771132011321113,0.07975033670663834,5.99101695115678e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.052051324,0.028359309,0.17658195,0.008862591,-0.017426128,0.0060299635,-0.008042792,-0.026319288
4,601.0,0.0,3519.0,3519.0,793.0,3519.0,0.9976220200000516,-21.0,-21.0,0.0,,,,0.015272312543569037,0.013672084153799915,0.05628284066915512,0.00023415754549205303,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.052314125,0.023336997,0.1473458,0.012913031,-0.031559315,0.0042713494,-0.023393027,-0.036418874
5,809.0,0.0,4352.0,4352.0,833.0,4352.0,0.9967973500000696,-21.0,-21.0,0.0,,,,0.013082799735107424,0.01255374334846098,0.06567259877920151,0.0004701522993855178,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.043265857000000005,0.014534917,0.089655906,0.017195849,-0.0053307074,0.0027605025,-0.0019208845999999999,-0.00974094
2,205.0,0.0,1937.0,1937.0,820.0,1937.0,0.9991882000000176,-21.0,-21.0,0.0,,,,0.013271789207150194,0.014381215654183937,0.08661144971847534,7.284892490133643e-05,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.09793413,0.109029554,1.2459028,0.010081228000000001,,,,
3,413.0,0.0,2768.0,2768.0,831.0,2768.0,0.9983655100000356,-21.0,-21.0,0.0,,,,0.013095782662258687,0.014563835652836424,0.09017306566238403,4.85398450109642e-05,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.06699568,0.10204898,0.9738844000000001,0.005621953000000001,-0.06337769,0.006071376999999999,-0.05691424,-0.07540042
4,667.0,0.0,3783.0,3783.0,1015.0,3783.0,0.9973606600000572,-20.0,-20.0,0.0,,,,0.014243900448040163,0.012460161619208224,0.05600857362151146,8.375291145057417e-06,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.08014218,0.05026457,0.24418142,0.0018464670999999999,-0.08484802400000001,0.007937772,-0.07532068,-0.09821871
5,867.0,0.0,4585.0,4585.0,802.0,4585.0,0.9965666800000744,-21.0,-21.0,0.0,,,,0.0149451127843804,0.012661744241431476,0.057885006070137024,2.08603323699208e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.084665276,0.07432766,0.39534,0.0034519034000000002,-0.09767585,0.029707237999999997,-0.061746947,-0.13731477