mirror of https://github.com/gryf/coach.git synced 2026-02-18 23:45:48 +01:00
coach/rl_coach/traces/Atari_C51_pong/trace.csv
Itai Caspi 72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00

1.8 KiB

Episode #,Training Iter,In Heatup,ER #Transitions,ER #Episodes,Episode Length,Total steps,Epsilon,Shaped Training Reward,Training Reward,Update Target Network,Evaluation Reward,Shaped Evaluation Reward,Success Rate,Loss/Mean,Loss/Stdev,Loss/Max,Loss/Min,Learning Rate/Mean,Learning Rate/Stdev,Learning Rate/Max,Learning Rate/Min,Grads (unclipped)/Mean,Grads (unclipped)/Stdev,Grads (unclipped)/Max,Grads (unclipped)/Min,Q/Mean,Q/Stdev,Q/Max,Q/Min
1,0.0,1.0,1117.0,1117.0,1117.0,1117.0,1.0,,,0.0,,,,,,,,,,,,,,,,,,,
2,205.0,0.0,1937.0,1937.0,820.0,1937.0,0.9991882000000176,-21.0,-21.0,0.0,,,,3.9302156448364256,0.0010496846440389027,3.931553840637207,3.9267423152923584,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.0021735325,0.0023975547,0.015546012,0.0008601941499999999,,,,
3,413.0,0.0,2768.0,2768.0,831.0,2768.0,0.9983655100000356,-21.0,-21.0,0.0,,,,3.9287387797465687,0.0010725536875668584,3.930054426193237,3.92205548286438,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.0014352581,0.0022775119,0.016661283,0.0005455515,0.06143524280438877,0.010833295539136235,0.0730189699679619,0.04586568772792873
4,667.0,0.0,3783.0,3783.0,1015.0,3783.0,0.9973606600000572,-20.0,-20.0,0.0,,,,3.9281875890071,0.0009267313904696912,3.9292049407958975,3.9252440929412837,0.0002500000000000001,5.421010862427521e-20,0.00025,0.00025,0.0012879773,0.0025753588,0.018626466,0.00030493445,0.06362535804510176,0.0053005873567461975,0.06891775093972746,0.053885202482343325
5,892.0,0.0,4684.0,4684.0,901.0,4684.0,0.9964686700000768,-20.0,-20.0,0.0,,,,3.9280550532870815,0.0009707394231859632,3.9289817810058594,3.9241018295288086,0.0002500000000000001,1.0842021724855042e-19,0.00025,0.00025,0.00088581746,0.0017567717,0.016409054,0.00022121534999999999,0.06359539761518496,0.005375292811606972,0.07293013073504026,0.05364551693201117
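
A minimal sketch of how a trace like the one above could be read and sanity-checked with Python's standard `csv` module. The `SAMPLE` text is a hand-copied subset of the columns shown above; the empty `Training Reward` cell in the heatup row is an assumption about where the blank cells fall. The helper names `load_trace` and `steps_are_cumulative` are hypothetical and are not part of rl_coach.

```python
import csv
import io

# Hypothetical excerpt: a few columns copied by hand from the trace rows above.
# The blank Training Reward in the heatup row is an assumption.
SAMPLE = """Episode #,Training Iter,In Heatup,Episode Length,Total steps,Training Reward
1,0.0,1.0,1117.0,1117.0,
2,205.0,0.0,820.0,1937.0,-21.0
3,413.0,0.0,831.0,2768.0,-21.0
4,667.0,0.0,1015.0,3783.0,-20.0
5,892.0,0.0,901.0,4684.0,-20.0
"""


def load_trace(text):
    """Parse trace CSV text into a list of dicts, mapping empty cells to None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({k: (float(v) if v != "" else None) for k, v in raw.items()})
    return rows


def steps_are_cumulative(rows):
    """Check that 'Total steps' equals the running sum of 'Episode Length'."""
    running = 0.0
    for row in rows:
        running += row["Episode Length"]
        if row["Total steps"] != running:
            return False
    return True


rows = load_trace(SAMPLE)
print(steps_are_cumulative(rows))
```

In the rows shown, each `Total steps` value is the previous total plus the current `Episode Length`, which is the relationship this check looks for.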