mirror of
https://github.com/gryf/coach.git
synced 2026-04-18 21:53:32 +02:00
Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
@@ -1,6 +1,6 @@
Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,221.0,0.0,2002.0,2002.0,885.0,2002.0,0.9992035000000262,-21.0,-21.0,0.0,,,,0.0066113567236284546,0.003946234120878863,0.016941886395215988,3.0340672310558148e-05,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.020578874,0.011285608,0.12838697,0.003849274,,,,
3,455.0,0.0,2938.0,2938.0,936.0,2938.0,0.9983611000000541,-20.0,-20.0,0.0,,,,0.007220610483817191,0.00384386883256313,0.02201320417225361,0.0004259901470504701,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.014196658000000001,0.0053113990000000005,0.040406343,0.005419724599999999,-0.012426703,0.021457887999999998,0.023741005,-0.051037904
4,659.0,0.0,3754.0,3754.0,816.0,3754.0,0.997626700000078,-21.0,-21.0,0.0,,,,0.007067595686713306,0.00349683739085928,0.016786431893706318,0.0004974190378561616,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.012732236999999999,0.0038257977000000004,0.02420173,0.00600734,,,,
5,961.0,0.0,4961.0,4961.0,1207.0,4961.0,0.996540400000114,-18.0,-18.0,0.0,,,,0.007034662726550326,0.003637364351878082,0.022078890353441242,0.0004736386181320995,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.012767965,0.0043293815,0.031500462,0.0063609104,-0.01521363,0.011859578,0.006441065,-0.04179345
2,197.0,0.0,1905.0,1905.0,788.0,1905.0,0.9992908000000232,-21.0,-21.0,0.0,,,,0.0051924175274927565,0.003918679938872439,0.04185768589377403,2.9565440854639746e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.01784605,0.03255357,0.465425,0.0038900522,,,,
3,436.0,0.0,2862.0,2862.0,957.0,2862.0,0.9984295000000516,-20.0,-20.0,0.0,,,,0.004909432677758631,0.0024521858486776424,0.012306905351579191,0.00032079339143820107,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.0113589475,0.0037933819,0.025680352000000004,0.0035025426,0.030741736000000002,0.025549445,0.07848698,-0.02225282
4,627.0,0.0,3623.0,3623.0,761.0,3623.0,0.9977446000000744,-21.0,-21.0,0.0,,,,0.0052940571797080345,0.002501595309474277,0.012016894295811651,0.0003992373822256922,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.010990373999999999,0.0038335419,0.027035048,0.005245461,,,,
5,855.0,0.0,4535.0,4535.0,912.0,4535.0,0.9969238000001012,-20.0,-20.0,0.0,,,,0.004946799854224082,0.0024341152117377785,0.013126095756888391,0.0003701391979120672,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.010130615,0.0032620803,0.022317264,0.0045093056,0.026840469,0.01787639,0.051877695999999994,-0.005629579
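
The hunk above lists rows 2 to 5 of the trace twice, with different values in most numeric columns. As a minimal sketch of how two such rows could be compared column by column, the snippet below parses CSV text and reports the columns that changed. It is not taken from the Coach repository: `parse_trace`, `diff_trace_rows`, and the `rel_tol` default are hypothetical names and values chosen for illustration, and the sample uses only a few of the trace's columns.

```python
import csv
import io
import math


def parse_trace(text):
    """Parse trace CSV text into a list of dicts; empty cells become None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({k: (float(v) if v != "" else None) for k, v in row.items()})
    return rows


def diff_trace_rows(old, new, rel_tol=1e-6):
    """Return the column names whose values differ between two parsed rows.

    Hypothetical helper: rel_tol is an assumed tolerance, not a Coach setting.
    """
    changed = []
    for key in old:
        a, b = old[key], new.get(key)
        if a is None or b is None:
            # A column counts as changed only if exactly one side is empty.
            if a is not b:
                changed.append(key)
        elif not math.isclose(a, b, rel_tol=rel_tol):
            changed.append(key)
    return changed


# A few columns of episode 2 as they appear in the two versions of the trace.
header = "Episode #,Training Iter,Episode Length,Training Reward,Q/Mean\n"
old_row = parse_trace(header + "2,221.0,885.0,-21.0,\n")[0]
new_row = parse_trace(header + "2,197.0,788.0,-21.0,\n")[0]

print(diff_trace_rows(old_row, new_row))
# ['Training Iter', 'Episode Length']
```

Treating empty cells as `None` keeps columns that are blank in both rows, such as `Q/Mean` for episode 2, from being reported as differences.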