1
0
mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00
Commit Graph

29 Commits

Author SHA1 Message Date
Gal Leibovich
310d31c227 integration test changes to reach the train part (#254)
* integration test changes to override heatup to 1000 steps +  run each preset for 30 sec (to make sure we reach the train part)

* fixes to failing presets uncovered with this change + changes in the golden testing to properly test BatchRL

* fix for rainbow dqn

* fix to gym_environment (due to a change in Gym 0.12.1) + fix for rainbow DQN + some bug-fix in utils.squeeze_list

* fix for NEC agent
2019-03-27 21:14:19 +02:00
Gal Leibovich
e3c7e526c7 Batch RL (#238) 2019-03-19 18:07:09 +02:00
Gal Novik
10220be9be Adding support for evaluation only mode with predefined number of steps (#225) 2019-03-03 10:03:45 +02:00
shadiendrawis
2b5d1dabe6 ACER algorithm (#184)
* initial ACER commit

* Code cleanup + several fixes

* Q-retrace bug fix + small clean-ups

* added documentation for acer

* ACER benchmarks

* update benchmarks table

* Add nightly running of golden and trace tests. (#202)

Resolves #200

* comment out nightly trace tests until values reset.

* remove redundant observe ignore (#168)

* ensure nightly test env containers exist. (#205)

Also bump integration test timeout

* wxPython removal (#207)

Replacing wxPython with Python's Tkinter.
Also removing the option to choose multiple files as it is unused and causes errors, and fixing the load file/directory spinner.

* Create CONTRIBUTING.md (#210)

* Create CONTRIBUTING.md.  Resolves #188

* run nightly golden tests sequentially. (#217)

Should reduce resource requirements and potential CPU contention but increases
overall execution time.

* tests: added new setup configuration + test args (#211)

- added utils for future tests and conftest
- added test args

* new docs build

* golden test update
2019-02-20 23:52:34 +02:00
Cody Hsieh
bf0a65eefd remove redundant observe ignore (#168) 2019-01-17 14:08:05 -08:00
Zach Dwiel
fedb4cbd7c Cleanup and refactoring (#171) 2019-01-15 10:04:53 +02:00
Gal Leibovich
4c914c057c fix for finding the right filter checkpoint to restore + do not update internal filter state when evaluating + fix SharedRunningStats checkpoint filenames (#147) 2018-12-17 21:36:27 +02:00
Gal Leibovich
f12857a8c7 Docs changes - fixing blogpost links, removing importing all exploration policies (#139)
* updated docs

* removing imports for all exploration policies in __init__ + setting the right blog-post link

* small cleanups
2018-12-05 16:16:16 -05:00
Gal Leibovich
a1c56edd98 Fixes for having NumpySharedRunningStats syncing on multi-node (#139)
1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3.
2. Removing the reference to Redis so that it won't try to pickle that in.
3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.
2018-11-23 16:11:47 +02:00
Sina Afrooze
87a7848b0a Moved tf.variable_scope and tf.device calls to framework-specific architecture (#136) 2018-11-22 22:52:21 +02:00
Gal Leibovich
a112ee69f6 Save filters' internal state (#127)
* save filters internal state

* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze
67eb9e4c28 Adding checkpointing framework (#74)
* Adding checkpointing framework as well as mxnet checkpointing implementation.

- MXNet checkpoint for each network is saved in a separate file.

* Adding checkpoint restore for mxnet to graph-manager

* Add unit-test for get_checkpoint_state()

* Added match.group() to fix unit-test failing on CI

* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
Gal Leibovich
6caf721d1c Numpy shared running stats (#97) 2018-11-18 14:46:40 +02:00
Thom Lane
a0f25034c3 Added average total reward to logging after evaluation phase completes. (#93) 2018-11-16 08:22:00 -08:00
Ajay Deshpande
fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi
6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Balaji Subramaniam
a849c17e46 Enable distributed SharedRunningStats (#81)
- Use Redis pub/sub for updating SharedRunningStats.
2018-11-13 19:17:38 +02:00
Ajay Deshpande
875d6ef017 Adding target reward and target sucess (#58)
* Adding target reward

* Adding target successs

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Gal Leibovich
49dea39d34 N-step returns for rainbow (#67)
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Ajay Deshpande
9a30c26469 Adding improvements 2018-10-23 19:59:02 -04:00
Ajay Deshpande
b285a02023 Adding parameteres, checking transitions before training 2018-10-23 16:55:37 -04:00
Ajay Deshpande
7f00235ed5 waiting for a new checkpoint if it's available 2018-10-23 16:54:43 -04:00
Ajay Deshpande
a7f5442015 Adding should_train helper and should_train in graph_manager 2018-10-23 16:54:43 -04:00
Ajay Deshpande
6b2de6ba6d Adding initial interface for backend and redis pubsub (#19)
* Adding initial interface for backend and redis pubsub

* Addressing comments, adding super in all memories

* Removing distributed experience replay
2018-10-23 16:51:48 -04:00
Shadi Endrawis
51726a5b80 network_imporvements branch merge 2018-10-02 13:43:36 +03:00
itaicaspi-intel
a9bd1047c4 load and save function for non-episodic replay buffers + carla improvements + network bug fixes 2018-09-12 15:26:16 +03:00
Itai Caspi
72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00
itaicaspi-intel
658b437079 removing datasets + imports optimization 2018-08-27 10:54:11 +03:00
Gal Novik
19ca5c24b1 pre-release 0.10.0 2018-08-13 17:11:34 +03:00