mirror of https://github.com/gryf/coach.git synced 2025-12-18 03:30:19 +01:00

323 Commits

Author SHA1 Message Date
Balaji Subramaniam
13d2679af4 Sync experiment dir, videos, gifs to S3. (#147) 2018-11-23 20:52:12 -08:00
Sina Afrooze
5332013bd1 Implement framework-agnostic rollout and training workers (#137)
* Added checkpoint state file to coach checkpointing.

* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Ajay Deshpande
4a6c404070 Adding worker logs and plumbed task_parameters to distributed coach (#130) 2018-11-23 15:35:11 -08:00
Gal Leibovich
2b4c9c6774 Removing graph_manager param (#141) 2018-11-23 11:42:54 -08:00
Gal Leibovich
a1c56edd98 Fixes for having NumpySharedRunningStats syncing on multi-node (#139)
1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3.
2. Removing the reference to Redis so that it won't try to pickle that in.
3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.
2018-11-23 16:11:47 +02:00
Sina Afrooze
87a7848b0a Moved tf.variable_scope and tf.device calls to framework-specific architecture (#136) 2018-11-22 22:52:21 +02:00
shadiendrawis
559969d3dd disabled loading for target weights (#138)
* Update savers.py

* disabled loading for target weights
2018-11-22 18:15:52 +02:00
Thom Lane
949d91321a Added explicit environment closing (#129) 2018-11-22 14:25:03 +02:00
Sina Afrooze
16cdd9a9c1 Tf checkpointing using saver mechanism (#134) 2018-11-22 14:08:10 +02:00
Cody Hsieh
dd18959e53 Don't download when checkpoint files are already present (#109)
* add check if checkpoint file present
2018-11-21 15:32:53 -08:00
shadiendrawis
b94239234a Removed TF warning when training in a distributed setting (#133)
* removed TF warning when training in a distributed setting and changed package version

* revert version back to 0.11.0
2018-11-21 16:09:04 +02:00
Gal Leibovich
a112ee69f6 Save filters' internal state (#127)
* save filters internal state

* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze
67eb9e4c28 Adding checkpointing framework (#74)
* Adding checkpointing framework as well as mxnet checkpointing implementation.

- MXNet checkpoint for each network is saved in a separate file.

* Adding checkpoint restore for mxnet to graph-manager

* Add unit-test for get_checkpoint_state()

* Added match.group() to fix unit-test failing on CI

* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
x77a1
4da56b1ff2 Enable setting the data store factory in Graph manager (#110)
* Enable setting the data store factory in Graph manager

This change enables us to use custom data store for storing and retrieving models.
We currently need this to use a data store that loads temporary AWS credentials
from disk before calling store or load operations.

* Removed data store factory and introduced data store as an attribute
2018-11-19 08:35:03 -08:00
Sina Afrooze
67a90ee87e Add tensor input type for arbitrary dimensional observation (#125)
* Allow arbitrary dimensional observation (non vector or image)
* Added creating PlanarMapsObservationSpace to GymEnvironment when number of channels is not 1 or 3
2018-11-19 16:41:12 +02:00
Thom Lane
7ba1a4393f Channel order transpose, for image embedder. Updated unit test. (#87) 2018-11-19 15:39:03 +02:00
shadiendrawis
ff816b347d aws pip package (#118)
Added support for a rl-coach-slim package.
2018-11-19 14:00:16 +02:00
Gal Novik
3817cefb12 removing box2d and atari requirements (#124) 2018-11-19 13:42:08 +02:00
Thom Lane
9210909050 Added MXNet to arg docs. (#121) 2018-11-19 11:31:28 +02:00
Gal Leibovich
d4d06aaea6 remove kubernetes dependency (#117) 2018-11-18 18:10:22 +02:00
Gal Leibovich
430e286c56 muting pygame's hello message (#116) 2018-11-18 18:02:55 +02:00
Gal Leibovich
ce85c8e8c3 Removing Egreedy from CartPole_ClippedPPO. ClippedPPO's default exploration policy is to be used instead. (#115) 2018-11-18 16:36:34 +02:00
Gal Leibovich
6caf721d1c Numpy shared running stats (#97) 2018-11-18 14:46:40 +02:00
Gal Novik
e1fa6e9681 roboschool: updating envs to v1, fixing rendering (#112) 2018-11-18 13:38:10 +02:00
Gal Leibovich
9fd4d55623 Making stop condition optional by using a flag (#113)
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Gal Leibovich
449bcfb4e1 summing head losses instead of taking the mean (#98) 2018-11-18 12:20:00 +02:00
Zach Dwiel
5b11fa5656 check for local mujoco key in build process (#105)
approved by scott.
2018-11-18 10:57:30 +02:00
Balaji Subramaniam
dea1826658 Re-enable NFS data store. (#101) 2018-11-16 13:55:33 -08:00
Thom Lane
a0f25034c3 Added average total reward to logging after evaluation phase completes. (#93) 2018-11-16 08:22:00 -08:00
Thom Lane
81bac050d7 Added Custom Initialisation for MXNet Heads (#86)
* Added NormalizedRSSInitializer, using same method as TensorFlow backend, but changed name since ‘columns’ have different meaning in dense layer weight matrix in MXNet.

* Added unit test for NormalizedRSSInitializer.
2018-11-16 08:15:43 -08:00
Balaji Subramaniam
101c55d37d Handle both Environment Steps and Episodes on the subscriber side. (#99) 2018-11-15 14:42:21 -08:00
Thom Lane
3358e04a6a Corrected MXNet's PPO Head for Continuous Action Spaces (#84)
* Changes required for Continuous PPO Head with MXNet. Used in MountainCarContinuous_ClippedPPO.

* Simplified changes for continuous ppo.

* Cleaned up to avoid duplicate code, and simplified covariance creation.
2018-11-15 13:27:54 -08:00
Ajay Deshpande
fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Scott Leishman
fe6857eabd broaden supported package versions (#50)
* broaden supported package versions.
* fix mxnet variants.
Also back-out tuple deprecation change introduced in prior commit.
* correct CI image deployment on master branch merge.
2018-11-15 15:29:49 +02:00
Itai Caspi
6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Scott Leishman
524f8436a2 create per environment Dockerfiles. (#70)
* create per environment Dockerfiles.

Adjust CI setup to better parallelize runs.
Fix a couple of issues in golden and trace tests.
Update a few of the docs.

* bugfix in mmc agent.

Also install kubectl for CI, update badge branch.

* remove integration test parallelism.
2018-11-14 07:40:22 -08:00
Balaji Subramaniam
a849c17e46 Enable distributed SharedRunningStats (#81)
- Use Redis pub/sub for updating SharedRunningStats.
2018-11-13 19:17:38 +02:00
Ajay Deshpande
875d6ef017 Adding target reward and target success (#58)
* Adding target reward

* Adding target success

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
0fe583186e fixing the coach entrypoint after adding the CoachLauncher abstraction (#92) 2018-11-12 10:26:49 -08:00
Leo Dirac
2804a7c24f Refactor launcher to be object-oriented (#63)
* Import of annoy library uses failed_import mechanism.
2018-11-10 22:10:19 +02:00
Itai Caspi
3fd433ffab fix ddpg head (#78) 2018-11-09 08:17:04 -08:00
Itai Caspi
3a0a1159e9 fixing the dropout rate code (#72)
addresses issue #53
2018-11-08 16:53:47 +02:00
Itai Caspi
389c65cbbe fix for a recently introduced bug in distributed training (#75) 2018-11-08 16:52:48 +02:00
Itai Caspi
83e0b09a6a adding the missing export_onnx_graph parameter to task parameters (#73) 2018-11-08 12:52:42 +02:00
Leo Dirac
8f0415b4cc Tweak additional_simulator_parameters for easier configuration and better error logging. (#69) 2018-11-07 11:01:12 -08:00
Gal Leibovich
49dea39d34 N-step returns for rainbow (#67)
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Itai Caspi
35c477c922 allowing grayscale observations in gym (#66)
* allowing grayscale observations in gym
2018-11-07 17:08:10 +02:00
Sina Afrooze
5fadb9c18e Adding mxnet components to rl_coach/architectures (#60)
Adding mxnet components to rl_coach architectures.

- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi
e7a91b4dc3 Fix cmd line arguments handling (#68)
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Sina Afrooze
93571306c3 Removed tensorflow specific code in presets (#59)
* Add generic layer specification for using in presets

* Modify presets to use the generic scheme
2018-11-06 17:39:29 +02:00