mirror of
https://github.com/gryf/coach.git
synced 2026-04-19 06:03:32 +02:00
Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
@@ -1,6 +1,6 @@
 Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
 1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
-2,210.0,0.0,1958.0,1958.0,841.0,1958.0,0.9992431000000248,-20.0,-20.0,0.0,,,,0.011158549779723952,0.01233800718156463,0.04892086610198021,7.747895870124921e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.07765737,0.051450502,0.27204409999999996,0.016480377,,,,
-3,402.0,0.0,2726.0,2726.0,768.0,2726.0,0.9985519000000476,-21.0,-21.0,0.0,,,,0.011682878495226607,0.013976986698806206,0.07550939172506332,3.554971408448182e-05,0.00010000000000000003,2.7105054312137605e-20,0.0001,0.0001,0.054967567,0.03760215,0.23677647,0.007137654300000001,0.059924055,0.010001821999999999,0.070588365,0.045257278
-4,601.0,0.0,3519.0,3519.0,793.0,3519.0,0.9978382000000712,-21.0,-21.0,0.0,,,,0.013331305195076387,0.013162853602752194,0.0471726730465889,9.195879101753236e-05,0.0001,0.0,0.0001,0.0001,0.05391158,0.02641614,0.14699543,0.017958568,0.038910400000000005,0.006119223000000001,0.046009037999999995,0.030036567000000004
-5,837.0,0.0,4466.0,4466.0,947.0,4466.0,0.9969859000000992,-20.0,-20.0,0.0,,,,0.011204646104627085,0.012869155071181351,0.06053701043128967,6.284505798248574e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.047131248,0.026914247999999998,0.13275696,0.010900318999999999,,,,
+2,205.0,0.0,1937.0,1937.0,820.0,1937.0,0.9992620000000244,-21.0,-21.0,0.0,,,,0.011010780938079,0.013098460400306485,0.06118807196617127,6.86898929416202e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.08733994,0.06833449,0.47135752,0.016372742,,,,
+3,413.0,0.0,2768.0,2768.0,831.0,2768.0,0.9985141000000488,-21.0,-21.0,0.0,,,,0.01163802880151147,0.013571124716079436,0.08714678883552551,3.9931001083459705e-05,0.00010000000000000003,2.7105054312137605e-20,0.0001,0.0001,0.06724033,0.035371285,0.2241408,0.011829718999999999,0.10583201,0.011610512,0.12072124,0.08555735
+4,667.0,0.0,3783.0,3783.0,1015.0,3783.0,0.9976006000000791,-20.0,-20.0,0.0,,,,0.01136319609350886,0.012043113812065086,0.049625951796770096,9.354137000627816e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.060902383,0.032815605,0.17838788,0.015925674,0.0978057,0.014090337,0.123560354,0.07580207
+5,947.0,0.0,4906.0,4906.0,1123.0,4906.0,0.9965899000001124,-18.0,-18.0,0.0,,,,0.010341535720908724,0.011934284708938809,0.06498207896947861,6.708659930154681e-05,0.00010000000000000002,1.3552527156068802e-20,0.0001,0.0001,0.054970358,0.03215441,0.26232755,0.009252935,0.09154041,0.009532932,0.10656521,0.07300271
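The commit message says the traces were updated after the environment reset was reordered, and the trace file above has two versions of episodes 2 to 5. As a rough illustration of how two such traces could be compared cell by cell, here is a minimal sketch. It is not Coach's actual trace test: the helper names `load_trace`, `as_number` and `diff_traces` and the tolerance are assumptions made for this example. The sample data is a four-column excerpt (episodes 1 and 2) taken from the rows above.

```python
import csv
import io
import math

# Four-column excerpt of the old and new trace rows shown above.
OLD_TRACE = """Episode #,Training Iter,Episode Length,Training Reward
1,0.0,1117.0,
2,210.0,841.0,-20.0
"""

NEW_TRACE = """Episode #,Training Iter,Episode Length,Training Reward
1,0.0,1117.0,
2,205.0,820.0,-21.0
"""


def load_trace(text):
    """Parse CSV trace text into a list of {column: cell} dicts (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


def as_number(cell):
    """Return the cell as a float, or None when the cell is empty."""
    return float(cell) if cell.strip() else None


def diff_traces(old_rows, new_rows, rel_tol=1e-6):
    """Yield (row_index, column, old_value, new_value) for every differing cell.

    Assumes both traces share the same header; rows beyond the shorter
    trace are ignored because zip() stops at the shorter input.
    """
    for index, (old, new) in enumerate(zip(old_rows, new_rows)):
        for column in old:
            a, b = as_number(old[column]), as_number(new[column])
            if a is None or b is None:
                # An empty cell only matches another empty cell.
                differs = (a is None) != (b is None)
            else:
                differs = not math.isclose(a, b, rel_tol=rel_tol)
            if differs:
                yield index, column, a, b


diffs = list(diff_traces(load_trace(OLD_TRACE), load_trace(NEW_TRACE)))
for index, column, old_value, new_value in diffs:
    print(f"row {index}: {column}: {old_value} -> {new_value}")
```

On this excerpt the heatup episode (row 0) is identical in both traces, while episode 2 differs in all three numeric columns, which matches the commit's note that the traces had to be regenerated.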