mirror of https://github.com/gryf/coach.git synced 2026-03-04 15:55:47 +01:00
coach/rl_coach/traces/Atari_DQN_with_PER_pong/trace.csv
Itai Caspi 72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00

1.7 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,197.0,0.0,1905.0,1905.0,788.0,1905.0,0.9992908000000232,-21.0,-21.0,0.0,,,,0.0051924175274927565,0.003918679938872439,0.04185768589377403,2.9565440854639746e-05,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.01784605,0.03255357,0.465425,0.0038900522,,,,
3,436.0,0.0,2862.0,2862.0,957.0,2862.0,0.9984295000000516,-20.0,-20.0,0.0,,,,0.004909432677758631,0.0024521858486776424,0.012306905351579191,0.00032079339143820107,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.0113589475,0.0037933819,0.025680352000000004,0.0035025426,0.030741736000000002,0.025549445,0.07848698,-0.02225282
4,627.0,0.0,3623.0,3623.0,761.0,3623.0,0.9977446000000744,-21.0,-21.0,0.0,,,,0.0052940571797080345,0.002501595309474277,0.012016894295811651,0.0003992373822256922,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.010990373999999999,0.0038335419,0.027035048,0.005245461,,,,
5,855.0,0.0,4535.0,4535.0,912.0,4535.0,0.9969238000001012,-20.0,-20.0,0.0,,,,0.004946799854224082,0.0024341152117377785,0.013126095756888391,0.0003701391979120672,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.010130615,0.0032620803,0.022317264,0.0045093056,0.026840469,0.01787639,0.051877695999999994,-0.005629579