mirror of https://github.com/gryf/coach.git synced 2026-03-04 07:45:53 +01:00
coach/rl_coach/traces/Atari_DDQN_with_PER_pong/trace.csv
Itai Caspi 72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00

1.7 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,197.0,0.0,1905.0,1905.0,788.0,1905.0,0.9992198800000168,-21.0,-21.0,0.0,,,,0.0065035605150511894,0.004365216942868011,0.04185768589377403,1.6300582501571625e-05,6.250000000000001e-05,1.3552527156068802e-20,6.25e-05,6.25e-05,0.04899958,0.035690054,0.465425,0.0031771401,,,,
3,436.0,0.0,2862.0,2862.0,957.0,2862.0,0.9982724500000376,-20.0,-20.0,0.0,,,,0.006882304690776307,0.0032755384482328074,0.018768906593322757,0.00028316525276750326,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.037334877999999995,0.016123397,0.11000781,0.007852386,-0.25035575,0.057181817,-0.1695276,-0.34914327
4,627.0,0.0,3623.0,3623.0,761.0,3623.0,0.997519060000054,-21.0,-21.0,0.0,,,,0.004881470595769075,0.0024654802506201947,0.01351994462311268,3.340750481584109e-05,6.250000000000001e-05,1.3552527156068802e-20,6.25e-05,6.25e-05,0.028977735,0.016445445,0.09510474,0.0037849140000000003,,,,
5,855.0,0.0,4535.0,4535.0,912.0,4535.0,0.9966161800000736,-20.0,-20.0,0.0,,,,0.004249975731765612,0.0017149519969122415,0.01000758446753025,5.5568867537658655e-05,6.250000000000003e-05,2.7105054312137605e-20,6.25e-05,6.25e-05,0.020409843,0.013720203,0.084716946,0.005521884,-0.11609744,0.011784006000000001,-0.10053374,-0.13682899