Mirror of https://github.com/gryf/coach.git, synced 2026-02-18 23:45:48 +01:00
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
1.8 KiB
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 1117.0 | 1117.0 | 1117.0 | 1117.0 | 1.0 | | | 0.0 | | | | | | | | | | | | | | | | | | | |
| 2 | 205.0 | 0.0 | 1937.0 | 1937.0 | 820.0 | 1937.0 | 0.9991882000000176 | -21.0 | -21.0 | 0.0 | | | | 3.9302156448364256 | 0.0010496846440389027 | 3.931553840637207 | 3.9267423152923584 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.0021735325 | 0.0023975547 | 0.015546012 | 0.0008601941499999999 | | | | |
| 3 | 413.0 | 0.0 | 2768.0 | 2768.0 | 831.0 | 2768.0 | 0.9983655100000356 | -21.0 | -21.0 | 0.0 | | | | 3.9287387797465687 | 0.0010725536875668584 | 3.930054426193237 | 3.92205548286438 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.0014352581 | 0.0022775119 | 0.016661283 | 0.0005455515 | 0.06143524280438877 | 0.010833295539136235 | 0.0730189699679619 | 0.04586568772792873 |
| 4 | 667.0 | 0.0 | 3783.0 | 3783.0 | 1015.0 | 3783.0 | 0.9973606600000572 | -20.0 | -20.0 | 0.0 | | | | 3.9281875890071 | 0.0009267313904696912 | 3.9292049407958975 | 3.9252440929412837 | 0.0002500000000000001 | 5.421010862427521e-20 | 0.00025 | 0.00025 | 0.0012879773 | 0.0025753588 | 0.018626466 | 0.00030493445 | 0.06362535804510176 | 0.0053005873567461975 | 0.06891775093972746 | 0.053885202482343325 |
| 5 | 892.0 | 0.0 | 4684.0 | 4684.0 | 901.0 | 4684.0 | 0.9964686700000768 | -20.0 | -20.0 | 0.0 | | | | 3.9280550532870815 | 0.0009707394231859632 | 3.9289817810058594 | 3.9241018295288086 | 0.0002500000000000001 | 1.0842021724855042e-19 | 0.00025 | 0.00025 | 0.00088581746 | 0.0017567717 | 0.016409054 | 0.00022121534999999999 | 0.06359539761518496 | 0.005375292811606972 | 0.07293013073504026 | 0.05364551693201117 |