Thom Lane
81bac050d7
Added Custom Initialisation for MXNet Heads ( #86 )
...
* Added NormalizedRSSInitializer, using same method as TensorFlow backend, but changed name since ‘columns’ have different meaning in dense layer weight matrix in MXNet.
* Added unit test for NormalizedRSSInitializer.
2018-11-16 08:15:43 -08:00
Balaji Subramaniam
101c55d37d
Handle both Environment Steps and Episodes on the subscriber side. ( #99 )
2018-11-15 14:42:21 -08:00
Thom Lane
3358e04a6a
Corrected MXNet's PPO Head for Continuous Action Spaces ( #84 )
...
* Changes required for Continuous PPO Head with MXNet. Used in MountainCarContinuous_ClippedPPO.
* Simplified changes for continuous ppo.
* Cleaned up to avoid duplicate code, and simplified covariance creation.
2018-11-15 13:27:54 -08:00
Ajay Deshpande
fde73ced13
Simulating the act on the trainer. ( #65 )
...
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi
6d40ad1650
update of api docstrings across coach and tutorials [WIP] ( #91 )
...
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Scott Leishman
524f8436a2
create per environment Dockerfiles. ( #70 )
...
* create per environment Dockerfiles.
Adjust CI setup to better parallelize runs.
Fix a couple of issues in golden and trace tests.
Update a few of the docs.
* bugfix in mmc agent.
Also install kubectl for CI, update badge branch.
* remove integration test parallelism.
2018-11-14 07:40:22 -08:00
Balaji Subramaniam
a849c17e46
Enable distributed SharedRunningStats ( #81 )
...
- Use Redis pub/sub for updating SharedRunningStats.
2018-11-13 19:17:38 +02:00
Ajay Deshpande
875d6ef017
Adding target reward and target success ( #58 )
...
* Adding target reward
* Adding target success
* Addressing comments
* Using custom_reward_threshold and target_success_rate
* Adding exit message
* Moving success rate to environment
* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
0fe583186e
fixing the coach entrypoint after adding the CoachLauncher abstraction ( #92 )
2018-11-12 10:26:49 -08:00
Leo Dirac
2804a7c24f
Refactor launcher to be object-oriented ( #63 )
...
* Import of annoy library uses failed_import mechanism.
2018-11-10 22:10:19 +02:00
Itai Caspi
3fd433ffab
fix ddpg head ( #78 )
2018-11-09 08:17:04 -08:00
Itai Caspi
3a0a1159e9
fixing the dropout rate code ( #72 )
...
addresses issue #53
2018-11-08 16:53:47 +02:00
Itai Caspi
389c65cbbe
fix for a bug in distributed training that was introduced lately ( #75 )
2018-11-08 16:52:48 +02:00
Itai Caspi
83e0b09a6a
adding the missing export_onnx_graph parameter to task parameters ( #73 )
2018-11-08 12:52:42 +02:00
Leo Dirac
8f0415b4cc
Tweak additional_simulator_parameters for easier configuration and better error logging. ( #69 )
2018-11-07 11:01:12 -08:00
Gal Leibovich
49dea39d34
N-step returns for rainbow ( #67 )
...
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Itai Caspi
35c477c922
allowing grayscale observations in gym ( #66 )
...
* allowing grayscale observations in gym
2018-11-07 17:08:10 +02:00
Sina Afrooze
5fadb9c18e
Adding mxnet components to rl_coach/architectures ( #60 )
...
Adding mxnet components to rl_coach architectures.
- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49 ) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi
e7a91b4dc3
Fix cmd line arguments handling ( #68 )
...
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Sina Afrooze
93571306c3
Removed tensorflow specific code in presets ( #59 )
...
* Add generic layer specification for using in presets
* Modify presets to use the generic scheme
2018-11-06 17:39:29 +02:00
Itai Caspi
811152126c
Export graph to ONNX ( #61 )
...
Implements the ONNX graph exporting feature.
Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.
2018-11-06 10:55:21 +02:00
Leo Dirac
d75df17d97
Modifying ScreenLogger to optionally not output color codes ( #56 )
...
* Modifying ScreenLogger to not output color when configured by new CLI parameter
2018-11-05 15:25:49 -08:00
Balaji Subramaniam
7e7006305a
Integrate coach.py params with distributed Coach. ( #42 )
...
* Integrate coach.py params with distributed Coach.
* Minor improvements
  - Use enums instead of constants.
  - Reduce code duplication.
  - Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Sina Afrooze
95b4fc6888
Added ability to switch between tensorflow and mxnet using -f commandline argument. ( #48 )
...
NOTE: tensorflow framework works fine if mxnet is not installed in env, but mxnet will not work if tensorflow is not installed because of the code in network_wrapper.
2018-10-30 15:29:34 -07:00
Sina Afrooze
2046358ab0
Add docstring for architecture ( #47 )
...
- Removed get_model() from architecture because it is only implementation detail of architecture.
2018-10-30 11:02:37 +02:00
Thom Lane
324c67d614
Bug fix: Removed reference to args which is out of scope. Conditioning now performed one level above. ( #54 )
2018-10-29 22:29:22 -07:00
Sina Afrooze
a888226641
Move embedder, middleware, and head parameters to framework agnostic modules. ( #45 )
...
Part of #28
2018-10-29 14:46:40 -07:00
Ajay Deshpande
16b3e99f37
Setup basic CI flow ( #38 )
...
Adds automated running of unit, integration tests (and optionally longer running tests)
2018-10-24 18:27:58 -07:00
Zach Dwiel
2cc6abc3c4
update CartPole_PPO not addressed during rebase ( #41 )
2018-10-24 16:58:25 -07:00
zach dwiel
f835ac902c
fix renaming: save_checkpoint_sec -> checkpoint_save_secs
2018-10-24 10:52:18 -04:00
Ajay Deshpande
fb2721fffa
Removing comments
2018-10-23 19:59:02 -04:00
Ajay Deshpande
9a30c26469
Adding improvements
2018-10-23 19:59:02 -04:00
zach dwiel
3ba0df7d07
update GraphManager.act specified return type
2018-10-23 19:58:17 -04:00
zach dwiel
def76b4cc6
update CartPole_PPO
2018-10-23 19:58:17 -04:00
zach dwiel
3e5e5475de
update training worker
2018-10-23 19:58:17 -04:00
zach dwiel
430ca198e5
convert golden tests into pytest format
2018-10-23 19:58:17 -04:00
zach dwiel
787ab42578
remove extra call to super().store_episode
2018-10-23 19:58:17 -04:00
Zach Dwiel
7220283653
add len(Episode)
2018-10-23 19:58:17 -04:00
Zach Dwiel
700a175902
rename save_checkpoint_secs -> checkpoint_save_secs
2018-10-23 17:10:58 -04:00
Zach Dwiel
9804b033a2
rename save_checkpoint_dir -> checkpoint_save_dir
2018-10-23 17:10:58 -04:00
Zach Dwiel
201a2237a1
restructure looping mechanism in GraphManager
2018-10-23 17:10:58 -04:00
Zach Dwiel
52560a2aae
introduce property GraphManager.current_step_counter
2018-10-23 17:10:04 -04:00
Zach Dwiel
776c94d551
reorder methods in GraphManager
2018-10-23 17:10:04 -04:00
Zach Dwiel
496a516de1
rename GraphManager.sync_graph -> sync
2018-10-23 17:08:29 -04:00
Zach Dwiel
5fee48dcfd
remove argument keep_networks_in_sync from GraphManager.act, and move this feature into the only place that activated it: GraphManager.train_and_act
2018-10-23 17:08:29 -04:00
Zach Dwiel
b2d864a5bd
remove out of date documentation
2018-10-23 17:08:29 -04:00
Zach Dwiel
d32d909238
move only invocation of GraphManager.handle_episode_ended inline
2018-10-23 17:08:29 -04:00
Zach Dwiel
18d84c5037
remove unnecessary timers from GraphManager
2018-10-23 16:58:17 -04:00
Zach Dwiel
cd30efe52e
remove unnecessary test result is None in GraphManager.act
2018-10-23 16:57:43 -04:00
Zach Dwiel
35d67cbd9b
use phase context in GraphManager.evaluate
2018-10-23 16:57:43 -04:00