Thom Lane
949d91321a
Added explicit environment closing ( #129 )
2018-11-22 14:25:03 +02:00
Sina Afrooze
16cdd9a9c1
Tf checkpointing using saver mechanism ( #134 )
2018-11-22 14:08:10 +02:00
Gal Leibovich
a112ee69f6
Save filters' internal state ( #127 )
* save filters' internal state
* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Sina Afrooze
67eb9e4c28
Adding checkpointing framework ( #74 )
* Adding checkpointing framework as well as mxnet checkpointing implementation.
- MXNet checkpoint for each network is saved in a separate file.
* Adding checkpoint restore for mxnet to graph-manager
* Add unit-test for get_checkpoint_state()
* Added match.group() to fix unit-test failing on CI
* Added ONNX export support for MXNet
2018-11-19 19:45:49 +02:00
x77a1
4da56b1ff2
Enable setting the data store factory in Graph manager ( #110 )
* Enable setting the data store factory in Graph manager
This change enables us to use custom data store for storing and retrieving models.
We currently need this to use a data store that loads temporary AWS credentials
from disk before calling store or load operations.
* Removed data store factory and introduced data store as an attribute
2018-11-19 08:35:03 -08:00
Gal Leibovich
d4d06aaea6
remove kubernetes dependency ( #117 )
2018-11-18 18:10:22 +02:00
Gal Leibovich
6caf721d1c
Numpy shared running stats ( #97 )
2018-11-18 14:46:40 +02:00
Gal Leibovich
9fd4d55623
Making stop condition optional by using a flag ( #113 )
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Balaji Subramaniam
101c55d37d
Handle both Environment Steps and Episodes on the subscriber side. ( #99 )
2018-11-15 14:42:21 -08:00
Ajay Deshpande
fde73ced13
Simulating the act on the trainer. ( #65 )
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi
6d40ad1650
update of api docstrings across coach and tutorials [WIP] ( #91 )
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Ajay Deshpande
875d6ef017
Adding target reward and target success ( #58 )
* Adding target reward
* Adding target success
* Addressing comments
* Using custom_reward_threshold and target_success_rate
* Adding exit message
* Moving success rate to environment
* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
389c65cbbe
fix for a bug in distributed training that was introduced lately ( #75 )
2018-11-08 16:52:48 +02:00
Sina Afrooze
5fadb9c18e
Adding mxnet components to rl_coach/architectures ( #60 )
Adding mxnet components to rl_coach architectures.
- Supports PPO and DQN
- Tested with CartPole_PPO and CartPole_DQN
- Normalizing filters don't work right now (see #49 ) and are disabled in CartPole_PPO preset
- Checkpointing is disabled for MXNet
2018-11-07 17:07:15 +02:00
Itai Caspi
e7a91b4dc3
Fix cmd line arguments handling ( #68 )
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Itai Caspi
811152126c
Export graph to ONNX ( #61 )
Implements the ONNX graph exporting feature.
Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.
2018-11-06 10:55:21 +02:00
Balaji Subramaniam
7e7006305a
Integrate coach.py params with distributed Coach. ( #42 )
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Ajay Deshpande
16b3e99f37
Setup basic CI flow ( #38 )
Adds automated running of unit and integration tests (and optionally longer-running tests)
2018-10-24 18:27:58 -07:00
zach dwiel
3ba0df7d07
update GraphManager.act specified return type
2018-10-23 19:58:17 -04:00
Zach Dwiel
700a175902
rename save_checkpoint_secs -> checkpoint_save_secs
2018-10-23 17:10:58 -04:00
Zach Dwiel
9804b033a2
rename save_checkpoint_dir -> checkpoint_save_dir
2018-10-23 17:10:58 -04:00
Zach Dwiel
201a2237a1
restructure looping mechanism in GraphManager
2018-10-23 17:10:58 -04:00
Zach Dwiel
52560a2aae
introduce property GraphManager.current_step_counter
2018-10-23 17:10:04 -04:00
Zach Dwiel
776c94d551
reorder methods in GraphManager
2018-10-23 17:10:04 -04:00
Zach Dwiel
496a516de1
rename GraphManager.sync_graph -> sync
2018-10-23 17:08:29 -04:00
Zach Dwiel
5fee48dcfd
remove argument keep_networks_in_sync from GraphManager.act, and move this feature into the only place that activated it: GraphManager.train_and_act
2018-10-23 17:08:29 -04:00
Zach Dwiel
b2d864a5bd
remove out of date documentation
2018-10-23 17:08:29 -04:00
Zach Dwiel
d32d909238
move only invocation of GraphManager.handle_episode_ended inline
2018-10-23 17:08:29 -04:00
Zach Dwiel
18d84c5037
remove unnecessary timers from GraphManager
2018-10-23 16:58:17 -04:00
Zach Dwiel
cd30efe52e
remove unnecessary test result is None in GraphManager.act
2018-10-23 16:57:43 -04:00
Zach Dwiel
35d67cbd9b
use phase context in GraphManager.evaluate
2018-10-23 16:57:43 -04:00
Zach Dwiel
d3c341147e
simplify GraphManager.act by removing arguments: continue_until_game_over and return_on_game_over
2018-10-23 16:57:43 -04:00
Zach Dwiel
8be980912c
fixed typo from earlier commit
2018-10-23 16:57:43 -04:00
Zach Dwiel
517aac163a
introduce graph_manager.phase_context; make sure that calls to graph_manager.train automatically set training phase
2018-10-23 16:57:43 -04:00
Zach Dwiel
7382a142bb
remove unused steps parameter from GraphManager.train
2018-10-23 16:57:06 -04:00
Zach Dwiel
ad68fa263d
remove property GraphManager.training_start_time
2018-10-23 16:57:05 -04:00
Zach Dwiel
01f3a0594b
remove return values from GraphManager.act
2018-10-23 16:57:05 -04:00
Zach Dwiel
b02f269464
graph_manager:heatup uses total_steps_counters looping mechanism like other loops. graph_manager:act no longer needs to return any values
2018-10-23 16:57:05 -04:00
Ajay Deshpande
0e121c5762
Ignoring redis sub if testing
2018-10-23 16:55:37 -04:00
Ajay Deshpande
a7f5442015
Adding should_train helper and should_train in graph_manager
2018-10-23 16:54:43 -04:00
Balaji Subramaniam
844a5af831
Make distributed coach work end-to-end.
- With data store, memory backend and orchestrator interfaces.
2018-10-23 16:54:43 -04:00
Zach Dwiel
9f92064e67
cleanup graph_manager:act
2018-10-23 16:53:32 -04:00
Zach Dwiel
ed3a3b39be
add comments
2018-10-23 16:52:16 -04:00
Zach Dwiel
13d81f65b9
add redis options to training worker
2018-10-23 16:47:46 -04:00
Zach Dwiel
6541bc76b9
working checkpoints
2018-10-23 16:41:57 -04:00
Zach Dwiel
433bc3e27b
standardizing variable access
2018-10-23 16:40:33 -04:00
Gal Leibovich
5a8da90d32
bug-fix for dumping movies (+ small refactoring and rename 'VideoDumpMethod' -> 'VideoDumpFilter')
2018-10-21 17:29:10 +03:00
Shadi Endrawis
364168490f
checkpointing fix
2018-10-07 20:06:08 +03:00
Shadi Endrawis
51726a5b80
network_imporvements branch merge
2018-10-02 13:43:36 +03:00
Zach Dwiel
673911ff7f
very minor cleanup
2018-09-12 10:51:56 -04:00