mirror of https://github.com/gryf/coach.git synced 2026-01-30 04:05:51 +01:00
coach/rl_coach/traces/Atari_NStepQ_pong/trace.csv
Itai Caspi 72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00

1.5 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Entropy/Mean,Entropy/Stdev,Entropy/Max,Entropy/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min,Q Values/Mean,Q Values/Stdev,Q Values/Max,Q Values/Min,Value Loss/Mean,Value Loss/Stdev,Value Loss/Max,Value Loss/Min
1,0.0,1.0,1117.0,1.0,1117.0,1117.0,0.5,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,166.0,0.0,834.0,1.0,834.0,1951.0,0.4918267999999965,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.049309142,0.05955426,0.11067552,-0.31385273,0.10965226,0.25779134,0.96019816,1.650419e-05
3,343.0,0.0,883.0,1.0,883.0,2834.0,0.4831733999999927,-20.0,-20.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,0.00039612237,0.022137828,0.047817677,-0.057933766,0.05449706,0.15011412,0.8670572,0.0013089271000000001
4,495.0,0.0,759.0,1.0,759.0,3593.0,0.4757351999999895,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.013107545,0.014792551000000001,0.02346693,-0.051909205,0.09606385,0.22936918,0.84131515,0.00357637
5,646.0,0.0,755.0,1.0,755.0,4348.0,0.4683361999999863,-21.0,-21.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,-0.056291025,0.024121637999999997,0.011681341000000001,-0.11741245,0.111964785,0.24955077,0.79120165,0.0038064622999999997
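
A trace like this can be sanity-checked with a few lines of Python. The sketch below is only an illustration, not part of the Coach test suite. It embeds a trimmed, hand-copied subset of the columns above as an inline string, and the helper names `load_trace` and `total_steps_consistent` are hypothetical. It assumes that blank cells mean "not recorded" and that `Training Reward` is blank for the heatup episode. It parses the rows with the standard `csv` module and checks that `Total steps` is the running sum of `Episode Length`.

```python
import csv
import io

# Trimmed, hand-copied subset of the trace columns (assumption: the real
# file has many more columns; blank cells are treated as "not recorded").
TRACE = """\
Episode #,Training Iter,In Heatup,Episode Length,Total steps,Epsilon,Training Reward
1,0.0,1.0,1117.0,1117.0,0.5,
2,166.0,0.0,834.0,1951.0,0.4918267999999965,-20.0
3,343.0,0.0,883.0,2834.0,0.4831733999999927,-20.0
4,495.0,0.0,759.0,3593.0,0.4757351999999895,-21.0
5,646.0,0.0,755.0,4348.0,0.4683361999999863,-21.0
"""


def load_trace(text):
    """Parse trace CSV text into a list of dicts; empty cells become None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({key: (float(value) if value != "" else None)
                     for key, value in raw.items()})
    return rows


def total_steps_consistent(rows):
    """Return True if 'Total steps' equals the running sum of 'Episode Length'."""
    running = 0.0
    for row in rows:
        running += row["Episode Length"]
        if row["Total steps"] != running:
            return False
    return True


rows = load_trace(TRACE)
print(total_steps_consistent(rows))
```

The same `load_trace` helper should work on the full file contents, since `csv.DictReader` keys each row by the header line and returns blank cells as empty strings.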