Mirror of https://github.com/gryf/coach.git
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
| Episode # | Training Iter | In Heatup | ER #Transitions | ER #Episodes | Episode Length | Total steps | Epsilon | Shaped Training Reward | Training Reward | Update Target Network | Evaluation Reward | Shaped Evaluation Reward | Success Rate | Loss/Mean | Loss/Stdev | Loss/Max | Loss/Min | Learning Rate/Mean | Learning Rate/Stdev | Learning Rate/Max | Learning Rate/Min | Grads (unclipped)/Mean | Grads (unclipped)/Stdev | Grads (unclipped)/Max | Grads (unclipped)/Min | Entropy/Mean | Entropy/Stdev | Entropy/Max | Entropy/Min | Advantages/Mean | Advantages/Stdev | Advantages/Max | Advantages/Min | Values/Mean | Values/Stdev | Values/Max | Values/Min | Value Loss/Mean | Value Loss/Stdev | Value Loss/Max | Value Loss/Min | Policy Loss/Mean | Policy Loss/Stdev | Policy Loss/Max | Policy Loss/Min | Q/Mean | Q/Stdev | Q/Max | Q/Min | TD targets/Mean | TD targets/Stdev | TD targets/Max | TD targets/Min | actions/Mean | actions/Stdev | actions/Max | actions/Min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 1.0 | 97.0 | 1.0 | 25.0 | 25.0 | 0.0 | 0.0 |||||||||||||||||||||||||||||||||||||||||||||||||
| 2 | 0.0 | 1.0 | 194.0 | 2.0 | 25.0 | 50.0 | 0.0 | 0.0 |||||||||||||||||||||||||||||||||||||||||||||||||
| 3 | 0.0 | 0.0 | 291.0 | 3.0 | 25.0 | 75.0 | -0.013705192291281485 | -1000.0 | -1000.0 | 0.0 | -0.51026434 | 0.22476047 | -0.15544460000000002 | -0.9295912000000001 | 2.0812359514743166 | 3.3284790187301674 | 12.234674698678914 | -0.08146359109321984 |||||||||||||||||||||||||||||||||||||||
| 4 | 0.0 | 0.0 | 388.0 | 4.0 | 25.0 | 100.0 | -0.02430443169727376 | -1000.0 | -1000.0 | 0.0 | -0.42551166 | 0.15804265 | -0.14439134 | -0.71600544 | 1.7233822661852551 | 2.691847085563749 | 10.017017240560527 | -0.08547367510074966 |||||||||||||||||||||||||||||||||||||||
| 5 | 0.0 | 0.0 | 485.0 | 5.0 | 25.0 | 125.0 | 0.0 | -1000.0 | -1000.0 | 0.0 | -0.4319562 | 0.17422763 | -0.1460396 | -0.7337566999999999 | 1.742798057982355 | 2.725836758125469 | 10.305663257960603 | -0.09830476343631744 |||||||||||||||||||||||||||||||||||||||
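The commit log above refers to trace tests that compare freshly generated traces like this one against stored references. The repository's actual trace-test code is not shown here; the following is only a minimal sketch of how such a CSV could be loaded and compared, assuming hypothetical file names, a hand-picked subset of columns, and an arbitrary tolerance.

```python
# Illustrative sketch only -- file names, column subset, and tolerance are
# assumptions, not the repository's actual trace-test implementation.
import numpy as np
import pandas as pd


def traces_match(new_csv, reference_csv,
                 columns=("Total steps", "Shaped Training Reward"),
                 atol=1e-6):
    """Return True if the selected columns of two trace CSVs agree within atol."""
    new = pd.read_csv(new_csv)
    ref = pd.read_csv(reference_csv)
    if len(new) != len(ref):
        return False
    for col in columns:
        # Empty cells load as NaN; treat NaN in both traces as a match.
        a = new[col].to_numpy(dtype=float)
        b = ref[col].to_numpy(dtype=float)
        if not np.allclose(a, b, atol=atol, equal_nan=True):
            return False
    return True


if __name__ == "__main__":
    # Placeholder file names for illustration.
    print(traces_match("new_trace.csv", "reference_trace.csv"))
```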