coach

gryf/coach

mirror of https://github.com/gryf/coach.git synced 2025-12-18 03:30:19 +01:00

Author	SHA1	Message	Date
Thom Lane	949d91321a	Added explicit environment closing (#129)	2018-11-22 14:25:03 +02:00
Sina Afrooze	16cdd9a9c1	Tf checkpointing using saver mechanism (#134)	2018-11-22 14:08:10 +02:00
Gal Leibovich	a112ee69f6	Save filters' internal state (#127) * save filters internal state * moving the restore to be made from within NumpyRunningStats	2018-11-20 17:21:48 +02:00
Sina Afrooze	67eb9e4c28	Adding checkpointing framework (#74) * Adding checkpointing framework as well as mxnet checkpointing implementation. - MXNet checkpoint for each network is saved in a separate file. * Adding checkpoint restore for mxnet to graph-manager * Add unit-test for get_checkpoint_state() * Added match.group() to fix unit-test failing on CI * Added ONNX export support for MXNet	2018-11-19 19:45:49 +02:00
x77a1	4da56b1ff2	Enable setting the data store factory in Graph manager (#110) * Enable setting the data store factory in Graph manager This change enables us to use custom data store for storing and retrieving models. We currently need this to use a data store that loads temporary AWS credentials from disk before calling store or load operations. * Removed data store factory and introduced data store as an attribute	2018-11-19 08:35:03 -08:00
Gal Leibovich	d4d06aaea6	remove kubernetes dependency (#117)	2018-11-18 18:10:22 +02:00
Gal Leibovich	6caf721d1c	Numpy shared running stats (#97)	2018-11-18 14:46:40 +02:00
Gal Leibovich	9fd4d55623	Making stop condition optional by using a flag (#113) * apply stop condition flag (default: ignore the stop condition)	2018-11-18 13:37:39 +02:00
Balaji Subramaniam	101c55d37d	Handle both Environment Steps and Episodes on the subscriber side. (#99)	2018-11-15 14:42:21 -08:00
Ajay Deshpande	fde73ced13	Simulating the act on the trainer. (#65) * Remove the use of daemon threads for Redis subscribe. * Emulate act and observe on trainer side to update internal vars.	2018-11-15 08:38:58 -08:00
Itai Caspi	6d40ad1650	update of api docstrings across coach and tutorials [WIP] (#91) * updating the documentation website * adding the built docs * update of api docstrings across coach and tutorials 0-2 * added some missing api documentation * New Sphinx based documentation	2018-11-15 15:00:13 +02:00
Ajay Deshpande	875d6ef017	Adding target reward and target success (#58) * Adding target reward * Adding target success * Addressing comments * Using custom_reward_threshold and target_success_rate * Adding exit message * Moving success rate to environment * Making target_success_rate optional	2018-11-12 15:03:43 -08:00
Itai Caspi	389c65cbbe	fix for a bug in distributed training that was introduced lately (#75)	2018-11-08 16:52:48 +02:00
Sina Afrooze	5fadb9c18e	Adding mxnet components to rl_coach/architectures (#60) Adding mxnet components to rl_coach architectures. - Supports PPO and DQN - Tested with CartPole_PPO and CartPole_DQN - Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset - Checkpointing is disabled for MXNet	2018-11-07 17:07:15 +02:00
Itai Caspi	e7a91b4dc3	Fix cmd line arguments handling (#68) * refactoring the merging of the task parameters and the command line parameters * removing some unused command line arguments * fix for saving checkpoints when not passing through coach.py	2018-11-07 15:47:02 +02:00
Itai Caspi	811152126c	Export graph to ONNX (#61) Implements the ONNX graph exporting feature. Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.	2018-11-06 10:55:21 +02:00
Balaji Subramaniam	7e7006305a	Integrate coach.py params with distributed Coach. (#42) * Integrate coach.py params with distributed Coach. * Minor improvements - Use enums instead of constants. - Reduce code duplication. - Ask experiment name with timeout.	2018-11-05 09:33:30 -08:00
Ajay Deshpande	16b3e99f37	Setup basic CI flow (#38) Adds automated running of unit, integration tests (and optionally longer running tests)	2018-10-24 18:27:58 -07:00
zach dwiel	3ba0df7d07	update GraphManager.act specified return type	2018-10-23 19:58:17 -04:00
Zach Dwiel	700a175902	rename save_checkpoint_secs -> checkpoint_save_secs	2018-10-23 17:10:58 -04:00
Zach Dwiel	9804b033a2	rename save_checkpoint_dir -> checkpoint_save_dir	2018-10-23 17:10:58 -04:00
Zach Dwiel	201a2237a1	restructure looping mechanism in GraphManager	2018-10-23 17:10:58 -04:00
Zach Dwiel	52560a2aae	introduce property GraphManager.current_step_counter	2018-10-23 17:10:04 -04:00
Zach Dwiel	776c94d551	reorder methods in GraphManager	2018-10-23 17:10:04 -04:00
Zach Dwiel	496a516de1	rename GraphManager.sync_graph -> sync	2018-10-23 17:08:29 -04:00
Zach Dwiel	5fee48dcfd	remove argument keep_networks_in_sync from GraphManager.act, and move this feature into the only place that activated it: GraphManager.train_and_act	2018-10-23 17:08:29 -04:00
Zach Dwiel	b2d864a5bd	remove out of date documentation	2018-10-23 17:08:29 -04:00
Zach Dwiel	d32d909238	move only invocation of GraphManager.handle_episode_ended inline	2018-10-23 17:08:29 -04:00
Zach Dwiel	18d84c5037	remove unnecessary timers from GraphManager	2018-10-23 16:58:17 -04:00
Zach Dwiel	cd30efe52e	remove unnecessary test result is None in GraphManager.act	2018-10-23 16:57:43 -04:00
Zach Dwiel	35d67cbd9b	use phase context in GraphManager.evaluate	2018-10-23 16:57:43 -04:00
Zach Dwiel	d3c341147e	simplify GraphManager.act by removing arguments: continue_until_game_over and return_on_game_over	2018-10-23 16:57:43 -04:00
Zach Dwiel	8be980912c	fixed typo from earlier commit	2018-10-23 16:57:43 -04:00
Zach Dwiel	517aac163a	introduce graph_manager.phase_context; make sure that calls to graph_manager.train automatically set training phase	2018-10-23 16:57:43 -04:00
Zach Dwiel	7382a142bb	remove unused steps parameter from GraphManager.train	2018-10-23 16:57:06 -04:00
Zach Dwiel	ad68fa263d	remove property GraphManager.training_start_time	2018-10-23 16:57:05 -04:00
Zach Dwiel	01f3a0594b	remove return values from GraphManager.act	2018-10-23 16:57:05 -04:00
Zach Dwiel	b02f269464	graph_manager:heatup uses total_steps_counters looping mechanism like other loops. graph_manager:act no longer needs to return any values	2018-10-23 16:57:05 -04:00
Ajay Deshpande	0e121c5762	Ignoring redis sub if testing	2018-10-23 16:55:37 -04:00
Ajay Deshpande	a7f5442015	Adding should_train helper and should_train in graph_manager	2018-10-23 16:54:43 -04:00
Balaji Subramaniam	844a5af831	Make distributed coach work end-to-end. - With data store, memory backend and orchestrator interfaces.	2018-10-23 16:54:43 -04:00
Zach Dwiel	9f92064e67	cleanup graph_manager:act	2018-10-23 16:53:32 -04:00
Zach Dwiel	ed3a3b39be	add comments	2018-10-23 16:52:16 -04:00
Zach Dwiel	13d81f65b9	add redis options to training worker	2018-10-23 16:47:46 -04:00
Zach Dwiel	6541bc76b9	working checkpoints	2018-10-23 16:41:57 -04:00
Zach Dwiel	433bc3e27b	standardizing variable access	2018-10-23 16:40:33 -04:00
Gal Leibovich	5a8da90d32	bug-fix for dumping movies (+ small refactoring and rename 'VideoDumpMethod -> 'VideoDumpFilter')	2018-10-21 17:29:10 +03:00
Shadi Endrawis	364168490f	checkpointing fix	2018-10-07 20:06:08 +03:00
Shadi Endrawis	51726a5b80	network_imporvements branch merge	2018-10-02 13:43:36 +03:00
Zach Dwiel	673911ff7f	very minor cleanup	2018-09-12 10:51:56 -04:00

56 Commits