mirror of https://github.com/gryf/coach.git synced 2026-02-22 01:45:56 +01:00
coach/rl_coach/traces/Atari_Bootstrapped_DQN_pong/trace.csv
Itai Caspi 72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00

1.4 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,986.0,986.0,986.0,986.0,7.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,0.0,1.0,1806.0,1806.0,820.0,1806.0,4.0,,,0.0,,,,,,,,,,,,,,,,,,,
3,206.0,0.0,2629.0,2629.0,823.0,2629.0,5.0,-21.0,-21.0,0.0,,,,0.01375627432677452,0.013505330839893808,0.06677445024251938,0.0005553220980800688,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.013602738,0.0048916726,0.034245104,0.0056978124,,,,
4,398.0,0.0,3397.0,3397.0,768.0,3397.0,3.0,-21.0,-21.0,0.0,,,,0.014156610367839068,0.013173363350960334,0.059119727462530136,0.0007080046343617141,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.012839798999999999,0.0038416919,0.024480136,0.005681609000000001,,,,
5,617.0,0.0,4274.0,4274.0,877.0,4274.0,6.0,-21.0,-21.0,0.0,,,,0.015369139484674181,0.01463229484329247,0.08113615959882736,0.0005487628513947129,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.014249632,0.005901839599999999,0.04092761,0.004881437,0.004008428,0.016476048,0.028364737,-0.026583625