mirror of https://github.com/gryf/coach.git synced 2025-12-18 11:40:18 +01:00

312 Commits

Author SHA1 Message Date
Gal Leibovich
a112ee69f6 Save filters' internal state (#127)
* save filters internal state

* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze
67eb9e4c28 Adding checkpointing framework (#74)
* Adding checkpointing framework as well as mxnet checkpointing implementation.

- MXNet checkpoint for each network is saved in a separate file.

* Adding checkpoint restore for mxnet to graph-manager

* Add unit-test for get_checkpoint_state()

* Added match.group() to fix unit-test failing on CI

* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
x77a1
4da56b1ff2 Enable setting the data store factory in Graph manager (#110)
* Enable setting the data store factory in Graph manager

This change enables us to use custom data store for storing and retrieving models.
We currently need this to use a data store that loads temporary AWS credentials
from disk before calling store or load operations.

* Removed data store factory and introduced data store as an attribute
2018-11-19 08:35:03 -08:00
Sina Afrooze
67a90ee87e Add tensor input type for arbitrary dimensional observation (#125)
* Allow arbitrary dimensional observation (non vector or image)
* Added creating PlanarMapsObservationSpace to GymEnvironment when number of channels is not 1 or 3
2018-11-19 16:41:12 +02:00
Thom Lane
7ba1a4393f Channel order transpose, for image embedder. Updated unit test. (#87) 2018-11-19 15:39:03 +02:00
shadiendrawis
ff816b347d aws pip package (#118)
Added support for a rl-coach-slim package.
2018-11-19 14:00:16 +02:00
Gal Novik
3817cefb12 removing box2d and atari requirements (#124) 2018-11-19 13:42:08 +02:00
Thom Lane
9210909050 Added MXNet to arg docs. (#121) 2018-11-19 11:31:28 +02:00
Gal Leibovich
d4d06aaea6 remove kubernetes dependency (#117) 2018-11-18 18:10:22 +02:00
Gal Leibovich
430e286c56 muting pygame's hello message (#116) 2018-11-18 18:02:55 +02:00
Gal Leibovich
ce85c8e8c3 Removing Egreedy from CartPole_ClippedPPO. ClippedPPO's default exploration policy is to be used instead. (#115) 2018-11-18 16:36:34 +02:00
Gal Leibovich
6caf721d1c Numpy shared running stats (#97) 2018-11-18 14:46:40 +02:00
Gal Novik
e1fa6e9681 roboschool: updating envs to v1, fixing rendering (#112) 2018-11-18 13:38:10 +02:00
Gal Leibovich
9fd4d55623 Making stop condition optional by using a flag (#113)
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Gal Leibovich
449bcfb4e1 summing head losses instead of taking the mean (#98) 2018-11-18 12:20:00 +02:00
Zach Dwiel
5b11fa5656 check for local mujoco key in build process (#105)
approved by scott.
2018-11-18 10:57:30 +02:00
Balaji Subramaniam
dea1826658 Re-enable NFS data store. (#101) 2018-11-16 13:55:33 -08:00
Thom Lane
a0f25034c3 Added average total reward to logging after evaluation phase completes. (#93) 2018-11-16 08:22:00 -08:00
Thom Lane
81bac050d7 Added Custom Initialisation for MXNet Heads (#86)
* Added NormalizedRSSInitializer, using the same method as the TensorFlow backend, but changed the name since ‘columns’ have a different meaning in the dense layer weight matrix in MXNet.

* Added unit test for NormalizedRSSInitializer.
2018-11-16 08:15:43 -08:00
Balaji Subramaniam
101c55d37d Handle both Environment Steps and Episodes on the subscriber side. (#99) 2018-11-15 14:42:21 -08:00
Thom Lane
3358e04a6a Corrected MXNet's PPO Head for Continuous Action Spaces (#84)
* Changes required for Continuous PPO Head with MXNet. Used in MountainCarContinuous_ClippedPPO.

* Simplified changes for continuous ppo.

* Cleaned up to avoid duplicate code, and simplified covariance creation.
2018-11-15 13:27:54 -08:00
Ajay Deshpande
fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Scott Leishman
fe6857eabd broaden supported package versions (#50)
* broaden supported package versions.
* fix mxnet variants.
Also back-out tuple deprecation change introduced in prior commit.
* correct CI image deployment on master branch merge.
2018-11-15 15:29:49 +02:00
Itai Caspi
6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Scott Leishman
524f8436a2 create per environment Dockerfiles. (#70)
* create per environment Dockerfiles.

Adjust CI setup to better parallelize runs.
Fix a couple of issues in golden and trace tests.
Update a few of the docs.

* bugfix in mmc agent.

Also install kubectl for CI, update badge branch.

* remove integration test parallelism.
2018-11-14 07:40:22 -08:00
Balaji Subramaniam
a849c17e46 Enable distributed SharedRunningStats (#81)
- Use Redis pub/sub for updating SharedRunningStats.
2018-11-13 19:17:38 +02:00
Ajay Deshpande
875d6ef017 Adding target reward and target success (#58)
* Adding target reward

* Adding target success

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
0fe583186e fixing the coach entrypoint after adding the CoachLauncher abstraction (#92) 2018-11-12 10:26:49 -08:00
Leo Dirac
2804a7c24f Refactor launcher to be object-oriented (#63)
* Import of annoy library uses failed_import mechanism.
2018-11-10 22:10:19 +02:00
Itai Caspi
3fd433ffab fix ddpg head (#78) 2018-11-09 08:17:04 -08:00
Itai Caspi
3a0a1159e9 fixing the dropout rate code (#72)
addresses issue #53
2018-11-08 16:53:47 +02:00
Itai Caspi
389c65cbbe fix for a bug in distributed training that was introduced lately (#75) 2018-11-08 16:52:48 +02:00
Itai Caspi
83e0b09a6a adding the missing export_onnx_graph parameter to task parameters (#73) 2018-11-08 12:52:42 +02:00
Leo Dirac
8f0415b4cc Tweak additional_simulator_parameters for easier configuration and better error logging. (#69) 2018-11-07 11:01:12 -08:00
Gal Leibovich
49dea39d34 N-step returns for rainbow (#67)
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Itai Caspi
35c477c922 allowing grayscale observations in gym (#66)
* allowing grayscale observations in gym
2018-11-07 17:08:10 +02:00
Sina Afrooze
5fadb9c18e Adding mxnet components to rl_coach/architectures (#60)
Adding mxnet components to rl_coach architectures.

- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi
e7a91b4dc3 Fix cmd line arguments handling (#68)
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Sina Afrooze
93571306c3 Removed tensorflow specific code in presets (#59)
* Add generic layer specification for using in presets

* Modify presets to use the generic scheme
2018-11-06 17:39:29 +02:00
Itai Caspi
811152126c Export graph to ONNX (#61)
Implements the ONNX graph exporting feature. 
Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.
2018-11-06 10:55:21 +02:00
Leo Dirac
d75df17d97 Modifying ScreenLogger to optionally not output color codes (#56)
* Modifying ScreenLogger to not output color when configured by new CLI parameter
2018-11-05 15:25:49 -08:00
Balaji Subramaniam
7e7006305a Integrate coach.py params with distributed Coach. (#42)
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Sina Afrooze
95b4fc6888 Added ability to switch between tensorflow and mxnet using -f commandline argument. (#48)
NOTE: tensorflow framework works fine if mxnet is not installed in env, but mxnet will not work if tensorflow is not installed because of the code in network_wrapper.
2018-10-30 15:29:34 -07:00
Sina Afrooze
2046358ab0 Add docstring for architecture (#47)
- Removed get_model() from architecture because it is only implementation detail of architecture.
2018-10-30 11:02:37 +02:00
Thom Lane
324c67d614 Bug fix: Removed reference to args which is out of scope. Conditioning now performed one level above. (#54) 2018-10-29 22:29:22 -07:00
Sina Afrooze
a888226641 Move embedder, middleware, and head parameters to framework agnostic modules. (#45)
Part of #28
2018-10-29 14:46:40 -07:00
Ajay Deshpande
16b3e99f37 Setup basic CI flow (#38)
Adds automated running of unit, integration tests (and optionally longer running tests)
2018-10-24 18:27:58 -07:00
Zach Dwiel
2cc6abc3c4 update CartPole_PPO not addressed during rebase (#41) 2018-10-24 16:58:25 -07:00
zach dwiel
f835ac902c fix renaming: save_checkpoint_sec -> checkpoint_save_secs 2018-10-24 10:52:18 -04:00
Ajay Deshpande
78cf25c09a Removing mjkey, should be injected from env var 2018-10-23 19:59:02 -04:00