Mirror of https://github.com/gryf/coach.git (synced 2026-04-20 15:11:24 +02:00)
Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
@@ -1,6 +1,6 @@
 Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
 1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
-2,221.0,0.0,2002.0,2002.0,885.0,2002.0,0.999123850000019,-21.0,-21.0,0.0,,,,0.006624795104714366,0.00394576811971849,0.01863841339945793,6.383289291989058e-05,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.032127135,0.014603343000000001,0.12838697,0.005512589,,,,
-3,455.0,0.0,2938.0,2938.0,936.0,2938.0,0.9981972100000392,-20.0,-20.0,0.0,,,,0.006993958523544746,0.0031627418936934102,0.01826494000852108,0.000633664894849062,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.026382675,0.010049541,0.06018944,0.009578557,-0.08102258,0.054663535,-0.0028564844,-0.15667786
-4,659.0,0.0,3754.0,3754.0,816.0,3754.0,0.9973893700000568,-21.0,-21.0,0.0,,,,0.00653242061713065,0.0030014368076197325,0.014597361907362938,3.4910688555100926e-05,6.250000000000001e-05,1.3552527156068802e-20,6.25e-05,6.25e-05,0.019908648,0.0060336159999999995,0.03786578,0.003926692,,,,
-5,906.0,0.0,4739.0,4739.0,985.0,4739.0,0.9964142200000778,-20.0,-20.0,0.0,,,,0.005325366398493989,0.00258031872854336,0.01823988556861877,6.391682836692779e-05,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.016708475,0.006444646,0.051227405999999996,0.0036940586,-0.042256642000000004,0.010646114,-0.030611286,-0.06712968
+2,197.0,0.0,1905.0,1905.0,788.0,1905.0,0.9992198800000168,-21.0,-21.0,0.0,,,,0.0065035605150511894,0.004365216942868011,0.04185768589377403,1.6300582501571625e-05,6.250000000000001e-05,1.3552527156068802e-20,6.25e-05,6.25e-05,0.04899958,0.035690054,0.465425,0.0031771401,,,,
+3,436.0,0.0,2862.0,2862.0,957.0,2862.0,0.9982724500000376,-20.0,-20.0,0.0,,,,0.006882304690776307,0.0032755384482328074,0.018768906593322757,0.00028316525276750326,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.037334877999999995,0.016123397,0.11000781,0.007852386,-0.25035575,0.057181817,-0.1695276,-0.34914327
+4,627.0,0.0,3623.0,3623.0,761.0,3623.0,0.997519060000054,-21.0,-21.0,0.0,,,,0.004881470595769075,0.0024654802506201947,0.01351994462311268,3.340750481584109e-05,6.250000000000001e-05,1.3552527156068802e-20,6.25e-05,6.25e-05,0.028977735,0.016445445,0.09510474,0.0037849140000000003,,,,
+5,855.0,0.0,4535.0,4535.0,912.0,4535.0,0.9966161800000736,-20.0,-20.0,0.0,,,,0.004249975731765612,0.0017149519969122415,0.01000758446753025,5.5568867537658655e-05,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.020409843,0.013720203,0.084716946,0.005521884,-0.11609744,0.011784006000000001,-0.10053374,-0.13682899