mirror of
https://github.com/gryf/coach.git
synced 2026-02-17 14:45:50 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
7 lines
1.5 KiB
CSV
Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,Q Values/Mean,Q Values/Stdev,Q Values/Max,Q Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min
1,0.0,1.0,1117.0,1.0,1117.0,1117.0,0.5,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,166.0,0.0,834.0,1.0,834.0,1951.0,0.4918267999999965,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.049309142,0.05955426,0.11067552,-0.31385273,0.10965226,0.25779134,0.96019816,1.650419e-05
3,343.0,0.0,883.0,1.0,883.0,2834.0,0.4831733999999927,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,0.00039612237,0.022137828,0.047817677,-0.057933766,0.05449706,0.15011412,0.8670572,0.0013089271000000001
4,495.0,0.0,759.0,1.0,759.0,3593.0,0.4757351999999895,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.013107545,0.014792551000000001,0.02346693,-0.051909205,0.09606385,0.22936918,0.84131515,0.00357637
5,646.0,0.0,755.0,1.0,755.0,4348.0,0.4683361999999863,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.056291025,0.024121637999999997,0.011681341000000001,-0.11741245,0.111964785,0.24955077,0.79120165,0.0038064622999999997