mirror of https://github.com/gryf/coach.git synced 2026-04-25 10:01:28 +02:00
75 Commits

Author SHA1 Message Date
Zach Dwiel 7b0fccb041 Add RedisDataStore (#295)
* GraphManager.set_session also sets self.sess

* make sure that GraphManager.fetch_from_worker uses training phase

* remove unnecessary phase setting in training worker

* reorganize rollout worker

* provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway

* allow dividing TrainingSteps and EnvironmentSteps

* add timestamps to the log

* added redis data store

* conflict merge fix
2019-08-28 21:15:58 +03:00
Gal Leibovich 19ad2d60a7 Batch RL Tutorial (#372) 2019-07-14 18:43:48 +03:00
shadiendrawis 8e812ef82f Coach as a library (#348)
* CoachInterface + tutorial

* Some improvements and typo fixes

* merge tutorial 0 and 4

* typo fix + additional tutorial changes

* tutorial changes

* added reading signals and experiment path argument
2019-06-19 18:05:03 +03:00
Gal Leibovich 9e9c4fd332 Create a dataset using an agent (#306)
Generate a dataset using an agent (with the option to select between this and a random dataset)
2019-05-28 09:34:49 +03:00
Gal Leibovich acceb03ac0 bug fixes for OPE (#311) 2019-05-21 16:39:11 +03:00
Gal Leibovich deb0251367 bug fix following PR #191 (#313) 2019-05-12 13:42:45 -07:00
Gal Leibovich 582921ffe3 OPE: Weighted Importance Sampling (#299) 2019-05-02 19:25:42 +03:00
Ajay Deshpande 33dc29ee99 Uploading checkpoint if crd provided (#191)
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types

Fixes #195
2019-04-26 12:27:33 -07:00
Gal Leibovich 4741b0b916 BCQ variant on top of DDQN (#276)
* kNN based model for predicting which actions to drop
* fix for seeds with batch rl
2019-04-16 17:06:23 +03:00
Gal Leibovich 6e08c55ad5 Enabling-more-agents-for-Batch-RL-and-cleanup (#258)
* allowing for the last training batch drawn to be smaller than batch_size
* adding support for more agents in BatchRL by adding softmax with temperature to the corresponding heads
* adding a CartPole_QR_DQN preset with a golden test
* cleanups
2019-03-21 16:10:29 +02:00
Gal Leibovich e3c7e526c7 Batch RL (#238) 2019-03-19 18:07:09 +02:00
Gal Leibovich d6158a5cfc restoring from a checkpoint file (#247) 2019-03-17 16:28:09 +02:00
Ajay Deshpande 2c1a9dbf20 Adding framework for multinode tests (#149)
* Currently runs CartPole_ClippedPPO and Mujoco_ClippedPPO with inverted_pendulum level.
2019-02-26 13:53:12 -08:00
Zach Dwiel fedb4cbd7c Cleanup and refactoring (#171) 2019-01-15 10:04:53 +02:00
Gal Leibovich 5674749ed5 workaround for resolving the issue of restoring a multi-node training checkpoint to single worker (#156) 2018-11-26 00:08:43 +02:00
Gal Leibovich ab10852ad9 hacky way to resolve the checkpointing issue (#154) 2018-11-25 16:14:15 +02:00
Sina Afrooze 5332013bd1 Implement framework-agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.

* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Gal Leibovich a1c56edd98 Fixes for having NumpySharedRunningStats syncing on multi-node (#139)
1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3.
2. Removing the reference to Redis so that it won't try to pickle that in.
3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.
2018-11-23 16:11:47 +02:00
Thom Lane 949d91321a Added explicit environment closing (#129) 2018-11-22 14:25:03 +02:00
Sina Afrooze 16cdd9a9c1 Tf checkpointing using saver mechanism (#134) 2018-11-22 14:08:10 +02:00
Gal Leibovich a112ee69f6 Save filters' internal state (#127)
* save filters internal state

* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze 67eb9e4c28 Adding checkpointing framework (#74)
* Adding checkpointing framework as well as mxnet checkpointing implementation.

- MXNet checkpoint for each network is saved in a separate file.

* Adding checkpoint restore for mxnet to graph-manager

* Add unit-test for get_checkpoint_state()

* Added match.group() to fix unit-test failing on CI

* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
x77a1 4da56b1ff2 Enable setting the data store factory in Graph manager (#110)
* Enable setting the data store factory in Graph manager

This change enables us to use custom data store for storing and retrieving models.
We currently need this to use a data store that loads temporary AWS credentials
from disk before calling store or load operations.

* Removed data store factory and introduced data store as an attribute
2018-11-19 08:35:03 -08:00
Gal Leibovich d4d06aaea6 remove kubernetes dependency (#117) 2018-11-18 18:10:22 +02:00
Gal Leibovich 6caf721d1c Numpy shared running stats (#97) 2018-11-18 14:46:40 +02:00
Gal Leibovich 9fd4d55623 Making stop condition optional by using a flag (#113)
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Balaji Subramaniam 101c55d37d Handle both Environment Steps and Episodes on the subscriber side. (#99) 2018-11-15 14:42:21 -08:00
Ajay Deshpande fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi 6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Ajay Deshpande 875d6ef017 Adding target reward and target success (#58)
* Adding target reward

* Adding target success

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi 389c65cbbe fix for a bug in distributed training that was introduced lately (#75) 2018-11-08 16:52:48 +02:00
Sina Afrooze 5fadb9c18e Adding mxnet components to rl_coach/architectures (#60)
Adding mxnet components to rl_coach architectures.

- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi e7a91b4dc3 Fix cmd line arguments handling (#68)
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Itai Caspi 811152126c Export graph to ONNX (#61)
Implements the ONNX graph exporting feature. 
Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.
2018-11-06 10:55:21 +02:00
Balaji Subramaniam 7e7006305a Integrate coach.py params with distributed Coach. (#42)
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Ajay Deshpande 16b3e99f37 Setup basic CI flow (#38)
Adds automated running of unit, integration tests (and optionally longer running tests)
2018-10-24 18:27:58 -07:00
zach dwiel 3ba0df7d07 update GraphManager.act specified return type 2018-10-23 19:58:17 -04:00
Zach Dwiel 700a175902 rename save_checkpoint_secs -> checkpoint_save_secs 2018-10-23 17:10:58 -04:00
Zach Dwiel 9804b033a2 rename save_checkpoint_dir -> checkpoint_save_dir 2018-10-23 17:10:58 -04:00
Zach Dwiel 201a2237a1 restructure looping mechanism in GraphManager 2018-10-23 17:10:58 -04:00
Zach Dwiel 52560a2aae introduce property GraphManager.current_step_counter 2018-10-23 17:10:04 -04:00
Zach Dwiel 776c94d551 reorder methods in GraphManager 2018-10-23 17:10:04 -04:00
Zach Dwiel 496a516de1 rename GraphManager.sync_graph -> sync 2018-10-23 17:08:29 -04:00
Zach Dwiel 5fee48dcfd remove argument keep_networks_in_sync from GraphManager.act, and move this feature into the only place that activated it: GraphManager.train_and_act 2018-10-23 17:08:29 -04:00
Zach Dwiel b2d864a5bd remove out of date documentation 2018-10-23 17:08:29 -04:00
Zach Dwiel d32d909238 move only invocation of GraphManager.handle_episode_ended inline 2018-10-23 17:08:29 -04:00
Zach Dwiel 18d84c5037 remove unnecessary timers from GraphManager 2018-10-23 16:58:17 -04:00
Zach Dwiel cd30efe52e remove unnecessary test result is None in GraphManager.act 2018-10-23 16:57:43 -04:00
Zach Dwiel 35d67cbd9b use phase context in GraphManager.evaluate 2018-10-23 16:57:43 -04:00
Zach Dwiel d3c341147e simplify GraphManager.act by removing arguments: continue_until_game_over and return_on_game_over 2018-10-23 16:57:43 -04:00