Ajay Deshpande
33dc29ee99
Uploading checkpoint if crd provided ( #191 )
...
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types
Fixes #195
2019-04-26 12:27:33 -07:00
Gal Leibovich
e3c7e526c7
Batch RL ( #238 )
2019-03-19 18:07:09 +02:00
Gal Leibovich
d6158a5cfc
restoring from a checkpoint file ( #247 )
2019-03-17 16:28:09 +02:00
Ajay Deshpande
2c1a9dbf20
Adding framework for multinode tests ( #149 )
...
* Currently runs CartPole_ClippedPPO and Mujoco_ClippedPPO with inverted_pendulum level.
2019-02-26 13:53:12 -08:00
Zach Dwiel
fedb4cbd7c
Cleanup and refactoring ( #171 )
2019-01-15 10:04:53 +02:00
Gal Leibovich
5674749ed5
workaround for resolving the issue of restoring a multi-node training checkpoint to single worker ( #156 )
2018-11-26 00:08:43 +02:00
Gal Leibovich
ab10852ad9
hacky way to resolve the checkpointing issue ( #154 )
2018-11-25 16:14:15 +02:00
Sina Afrooze
5332013bd1
Implement framework-agnostic rollout and training workers ( #137 )
...
* Added checkpoint state file to coach checkpointing.
* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Gal Leibovich
a1c56edd98
Fixes for having NumpySharedRunningStats syncing on multi-node ( #139 )
...
1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3.
2. Removing the reference to Redis so that it won't try to pickle that in.
3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.
2018-11-23 16:11:47 +02:00
Thom Lane
949d91321a
Added explicit environment closing ( #129 )
2018-11-22 14:25:03 +02:00
Sina Afrooze
16cdd9a9c1
TF checkpointing using saver mechanism ( #134 )
2018-11-22 14:08:10 +02:00
Gal Leibovich
a112ee69f6
Save filters' internal state ( #127 )
...
* save filters internal state
* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze
67eb9e4c28
Adding checkpointing framework ( #74 )
...
* Adding checkpointing framework as well as mxnet checkpointing implementation.
- MXNet checkpoint for each network is saved in a separate file.
* Adding checkpoint restore for mxnet to graph-manager
* Add unit-test for get_checkpoint_state()
* Added match.group() to fix unit-test failing on CI
* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
x77a1
4da56b1ff2
Enable setting the data store factory in Graph manager ( #110 )
...
* Enable setting the data store factory in Graph manager
This change enables us to use custom data store for storing and retrieving models.
We currently need this to use a data store that loads temporary AWS credentials
from disk before calling store or load operations.
* Removed data store factory and introduced data store as an attribute
2018-11-19 08:35:03 -08:00
Gal Leibovich
d4d06aaea6
remove kubernetes dependency ( #117 )
2018-11-18 18:10:22 +02:00
Gal Leibovich
6caf721d1c
Numpy shared running stats ( #97 )
2018-11-18 14:46:40 +02:00
Gal Leibovich
9fd4d55623
Making stop condition optional by using a flag ( #113 )
...
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Balaji Subramaniam
101c55d37d
Handle both Environment Steps and Episodes on the subscriber side. ( #99 )
2018-11-15 14:42:21 -08:00
Ajay Deshpande
fde73ced13
Simulating the act on the trainer. ( #65 )
...
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi
6d40ad1650
update of api docstrings across coach and tutorials [WIP] ( #91 )
...
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Ajay Deshpande
875d6ef017
Adding target reward and target success ( #58 )
...
* Adding target reward
* Adding target success
* Addressing comments
* Using custom_reward_threshold and target_success_rate
* Adding exit message
* Moving success rate to environment
* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
389c65cbbe
fix for a bug in distributed training that was introduced lately ( #75 )
2018-11-08 16:52:48 +02:00
Sina Afrooze
5fadb9c18e
Adding mxnet components to rl_coach/architectures ( #60 )
...
Adding mxnet components to rl_coach architectures.
- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49 ) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi
e7a91b4dc3
Fix cmd line arguments handling ( #68 )
...
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Itai Caspi
811152126c
Export graph to ONNX ( #61 )
...
Implements the ONNX graph exporting feature.
Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.
2018-11-06 10:55:21 +02:00
Balaji Subramaniam
7e7006305a
Integrate coach.py params with distributed Coach. ( #42 )
...
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Ajay Deshpande
16b3e99f37
Setup basic CI flow ( #38 )
...
Adds automated running of unit, integration tests (and optionally longer running tests)
2018-10-24 18:27:58 -07:00
zach dwiel
3ba0df7d07
update GraphManager.act specified return type
2018-10-23 19:58:17 -04:00
Zach Dwiel
700a175902
rename save_checkpoint_secs -> checkpoint_save_secs
2018-10-23 17:10:58 -04:00
Zach Dwiel
9804b033a2
rename save_checkpoint_dir -> checkpoint_save_dir
2018-10-23 17:10:58 -04:00
Zach Dwiel
201a2237a1
restructure looping mechanism in GraphManager
2018-10-23 17:10:58 -04:00
Zach Dwiel
52560a2aae
introduce property GraphManager.current_step_counter
2018-10-23 17:10:04 -04:00
Zach Dwiel
776c94d551
reorder methods in GraphManager
2018-10-23 17:10:04 -04:00
Zach Dwiel
496a516de1
rename GraphManager.sync_graph -> sync
2018-10-23 17:08:29 -04:00
Zach Dwiel
5fee48dcfd
remove argument keep_networks_in_sync from GraphManager.act, and move this feature into the only place that activated it: GraphManager.train_and_act
2018-10-23 17:08:29 -04:00
Zach Dwiel
b2d864a5bd
remove out of date documentation
2018-10-23 17:08:29 -04:00
Zach Dwiel
d32d909238
move only invocation of GraphManager.handle_episode_ended inline
2018-10-23 17:08:29 -04:00
Zach Dwiel
18d84c5037
remove unnecessary timers from GraphManager
2018-10-23 16:58:17 -04:00
Zach Dwiel
cd30efe52e
remove unnecessary test result is None in GraphManager.act
2018-10-23 16:57:43 -04:00
Zach Dwiel
35d67cbd9b
use phase context in GraphManager.evaluate
2018-10-23 16:57:43 -04:00
Zach Dwiel
d3c341147e
simplify GraphManager.act by removing arguments: continue_until_game_over and return_on_game_over
2018-10-23 16:57:43 -04:00
Zach Dwiel
8be980912c
fixed typo from earlier commit
2018-10-23 16:57:43 -04:00
Zach Dwiel
517aac163a
introduce graph_manager.phase_context; make sure that calls to graph_manager.train automatically set training phase
2018-10-23 16:57:43 -04:00
Zach Dwiel
7382a142bb
remove unused steps parameter from GraphManager.train
2018-10-23 16:57:06 -04:00
Zach Dwiel
ad68fa263d
remove property GraphManager.training_start_time
2018-10-23 16:57:05 -04:00
Zach Dwiel
01f3a0594b
remove return values from GraphManager.act
2018-10-23 16:57:05 -04:00
Zach Dwiel
b02f269464
graph_manager:heatup uses total_steps_counters looping mechanism like other loops. graph_manager:act no longer needs to return any values
2018-10-23 16:57:05 -04:00
Ajay Deshpande
0e121c5762
Ignoring redis sub if testing
2018-10-23 16:55:37 -04:00
Ajay Deshpande
a7f5442015
Adding should_train helper and should_train in graph_manager
2018-10-23 16:54:43 -04:00
Balaji Subramaniam
844a5af831
Make distributed coach work end-to-end.
...
- With data store, memory backend and orchestrator interfaces.
2018-10-23 16:54:43 -04:00