Gourav Roy
779d3694b4
Revert "comment out the part of test in 'test_basic_rl_graph_manager_with_cartpole_dqn_and_repeated_checkpoint_restore' that run in infinite loop"
This reverts commit b8d21c73bf.
2019-01-02 23:09:09 -08:00
Gourav Roy
6dd7ae2343
Revert "Avoid Memory Leak in Rollout worker"
This reverts commit c694766fad.
2019-01-02 23:09:09 -08:00
Gourav Roy
2461892c9e
Revert "Updated comments"
This reverts commit 740f7937cd.
2019-01-02 23:09:09 -08:00
Gourav Roy
740f7937cd
Updated comments
2018-12-25 21:52:07 -08:00
x77a1
73c4c850a5
Merge branch 'master' into master
2018-12-25 21:05:41 -08:00
Gourav Roy
c694766fad
Avoid Memory Leak in Rollout worker
ISSUE: When we restore checkpoints, we create new nodes in the
TensorFlow graph. This happens when we assign a new value (an op node) to
a RefVariable in GlobalVariableSaver. With every restore, the size of the TF
graph increases as new nodes are created while old, unused nodes are not
removed from the graph. This causes the memory leak in the
restore_checkpoint codepath.
FIX: We reset the TensorFlow graph and recreate the Global, Online and
Target networks on every restore. This ensures that the old, unused nodes
in the TF graph are dropped.
2018-12-25 21:04:21 -08:00
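The mechanism behind this fix can be sketched with a toy, plain-Python graph (not Coach's or TensorFlow's actual API): a graph that only ever accumulates assign ops grows without bound across restores, while rebuilding the graph on each restore keeps it at a fixed size.

```python
class Graph:
    """Toy stand-in for a computation graph that only grows."""
    def __init__(self):
        self.nodes = []

    def add_assign_op(self, name):
        # mimics assigning a new value (op node) to a RefVariable
        self.nodes.append(name)

def restore_without_reset(graph, n_vars):
    # each restore appends fresh assign ops; old ones are never removed
    for i in range(n_vars):
        graph.add_assign_op("assign_%d" % i)

def restore_with_reset(n_vars):
    # the fix: rebuild the graph from scratch on every restore,
    # so stale nodes from previous restores are dropped
    graph = Graph()
    for i in range(n_vars):
        graph.add_assign_op("assign_%d" % i)
    return graph

leaky = Graph()
for _ in range(10):
    restore_without_reset(leaky, 3)
print(len(leaky.nodes))  # 30: node count grows with every restore

for _ in range(10):
    fresh = restore_with_reset(3)
print(len(fresh.nodes))  # 3: bounded, regardless of restore count
```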
Gal Novik
56735624ca
Merge pull request #160 from NervanaSystems/tf_version_bump
Bump intel optimized tensorflow to 1.12.0
2018-12-25 10:51:58 +02:00
Gal Novik
85fae0f626
Merge branch 'master' into tf_version_bump
2018-12-24 15:50:55 +02:00
Gal Novik
d7c138342b
Merge pull request #170 from NervanaSystems/ci_badge
add CI status badge.
2018-12-24 14:39:38 +02:00
Scott Leishman
0823d30839
Merge branch 'master' into tf_version_bump
2018-12-21 10:58:41 -05:00
Scott Leishman
7cda5179c6
add CI status badge.
2018-12-21 10:50:28 -05:00
Zach Dwiel
8e3ee818f8
update circle ci config to match new golden test presets ( #167 )
2018-12-21 10:10:31 -05:00
x77a1
02f2db1264
Merge branch 'master' into master
2018-12-17 12:44:27 -08:00
Gal Leibovich
4c914c057c
fix for finding the right filter checkpoint to restore + do not update internal filter state when evaluating + fix SharedRunningStats checkpoint filenames ( #147 )
2018-12-17 21:36:27 +02:00
Neta Zmora
b4bc8a476c
Bug fix: when enabling 'heatup_using_network_decisions', we should add the configured noise ( #162 )
During heatup we may want to add agent-generated noise (i.e. not "simple" random noise).
This is enabled by setting 'heatup_using_network_decisions' to True. For example:
agent_params = DDPGAgentParameters()
agent_params.algorithm.heatup_using_network_decisions = True
The fix ensures that the correct noise is added not just while in the TRAINING phase, but
also during the HEATUP phase.
No one has enabled 'heatup_using_network_decisions' yet, which explains why this problem
arose only now (in my configuration I do enable 'heatup_using_network_decisions').
2018-12-17 10:08:54 +02:00
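The phase check this fix describes can be condensed into a small sketch (the function name and enum values here are illustrative, not Coach's actual code): agent noise is applied in TRAIN, and in HEATUP only when the network is making the decisions.

```python
from enum import Enum

class RunPhase(Enum):
    HEATUP = 1
    TRAIN = 2
    TEST = 3

def should_add_agent_noise(phase, heatup_using_network_decisions):
    # Noise belongs wherever the network picks actions: always in TRAIN,
    # and in HEATUP only when the network (not random sampling) decides.
    if phase == RunPhase.TRAIN:
        return True
    return phase == RunPhase.HEATUP and heatup_using_network_decisions

print(should_add_agent_noise(RunPhase.HEATUP, True))   # True (the fix)
print(should_add_agent_noise(RunPhase.HEATUP, False))  # False
print(should_add_agent_noise(RunPhase.TEST, True))     # False
```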
gouravr
b8d21c73bf
comment out the part of test in 'test_basic_rl_graph_manager_with_cartpole_dqn_and_repeated_checkpoint_restore' that run in infinite loop
2018-12-16 10:56:40 -08:00
x77a1
1f0980c448
Merge branch 'master' into master
2018-12-16 09:37:00 -08:00
Gal Leibovich
f9ee526536
Fix for issue #128 - circular DQN import ( #130 )
2018-12-16 16:06:44 +02:00
gouravr
801aed5e10
Changes to avoid memory leak in rollout worker
Currently in the rollout worker, we call restore_checkpoint repeatedly to load the latest model into memory. The restore_checkpoint function calls the checkpoint saver. The checkpoint saver uses GlobalVariablesSaver, which does not release the references to the previous model's variables. This leads to a situation where memory keeps growing until the rollout worker crashes.
This change avoids using the checkpoint saver in the rollout worker, as I believe it is not needed in this code path.
Also added a test to easily reproduce the issue using the CartPole example. We were also seeing this issue with the AWS DeepRacer implementation, and the current change avoids the memory leak there as well.
2018-12-15 12:26:31 -08:00
Scott Leishman
aa1dfd7599
Bump intel optimized tensorflow to 1.12.0
2018-12-14 10:15:19 -05:00
zach dwiel
e08accdc22
allow case insensitive selected level name matching
2018-12-11 12:35:30 -05:00
Zach Dwiel
d0248e03c6
add meaningful error message in the event that the action space is not one that can be used ( #151 )
2018-12-11 09:09:24 +02:00
Gal Leibovich
f12857a8c7
Docs changes - fixing blogpost links, removing importing all exploration policies ( #139 )
* updated docs
* removing imports for all exploration policies in __init__ + setting the right blog-post link
* small cleanups
2018-12-05 16:16:16 -05:00
Sina Afrooze
155b78b995
Fix warning on import TF or MxNet, when only one of the frameworks is installed ( #140 )
2018-12-05 11:52:24 +02:00
Ryan Peach
9e66bb653e
Enable creating custom tensorflow heads, embedders, and middleware. ( #135 )
Allowing components to have a path property.
2018-12-05 11:40:06 +02:00
Ryan Peach
3c58ed740b
'CompositeAgent' object has no attribute 'handle_episode_ended' ( #136 )
2018-12-05 11:28:16 +02:00
Ryan Peach
436b16016e
Added num_transitions to Memory interface ( #137 )
2018-12-05 10:33:25 +02:00
Gal Leibovich
3e281b467b
Update docs_raw README.md ( #138 )
* Update README.md
2018-12-03 05:39:17 -08:00
Ryan Peach
28e5b8b612
Minor bugfix on RewardFilter in Readme ( #133 )
2018-11-30 16:02:08 -08:00
Scott Leishman
3e67eac9e6
Merge pull request #131 from ryanpeach/patch-2
NoOutputFilter isn't set in tutorial.
2018-11-30 15:55:34 -08:00
Ryan Peach
f678ae7cb8
NoOutputFilter isn't set in tutorial.
2018-11-29 17:50:50 -05:00
Ajay Deshpande
0dd39b20ca
Removing badge
2018-11-28 09:59:08 -08:00
Ajay Deshpande
15fabf6ec3
Removing badge
2018-11-28 09:19:32 -08:00
Gal Novik
533bb43720
Merge pull request #125 from NervanaSystems/0.11.0-release
0.11.0 release
2018-11-28 01:16:01 +02:00
Ajay Deshpande
e877920dd5
Merge pull request #126 from NervanaSystems/ci_updates
CI related updates
2018-11-27 14:58:26 -08:00
Scott Leishman
3601d9bc45
CI related updates
2018-11-27 21:53:46 +00:00
Gal Novik
4e0d018d5f
updated algorithms image in README
2018-11-27 23:12:13 +02:00
Gal Novik
fc6604c09c
added missing license headers
2018-11-27 22:43:40 +02:00
Gal Novik
1e618647ab
adding .nojekyll file for github pages to function properly
2018-11-27 22:35:16 +02:00
Gal Novik
7e3aca22eb
Documentation fix
2018-11-27 22:32:46 +02:00
Gal Novik
05c1005e94
Updated README and added .nojekyll file for github pages to work properly
2018-11-27 22:11:28 +02:00
Balaji Subramaniam
d06197f663
Add documentation on distributed Coach. ( #158 )
* Added documentation on distributed Coach.
2018-11-27 12:26:15 +02:00
Scott Leishman
e3ecf445e2
ensure we pull from main coach container layers as cache. ( #106 )
2018-11-26 17:09:02 -08:00
Gal Leibovich
5674749ed5
workaround for resolving the issue of restoring a multi-node training checkpoint to single worker ( #156 )
2018-11-26 00:08:43 +02:00
Gal Leibovich
ab10852ad9
hacky way to resolve the checkpointing issue ( #154 )
2018-11-25 16:14:15 +02:00
Gal Leibovich
11170d5ba3
fix dist. tf ( #153 )
2018-11-25 14:02:24 +02:00
Sina Afrooze
19a68812f6
Added ONNX compatible broadcast_like function ( #152 )
- Also simplified the hybrid_clip implementation.
2018-11-25 11:23:18 +02:00
Balaji Subramaniam
8df425b6e1
Update how save checkpoint secs arg is handled in distributed Coach. ( #151 )
2018-11-25 00:05:24 -08:00
Thom Lane
de9b707fe1
Changed run_multiple_seeds to support mxnet. And fix other bugs. ( #122 )
2018-11-25 08:33:09 +02:00
Sina Afrooze
77fb561668
Added code to fall back to CPU if GPU not available. ( #150 )
- Code will also prune the GPU list if more GPUs are requested than are available.
2018-11-25 08:32:26 +02:00
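The device-selection behavior this commit describes can be sketched as follows (the helper name and return format are hypothetical, not Coach's actual API): fall back to CPU when no GPU exists, and prune the requested GPU list to the ones actually present.

```python
def choose_devices(requested_gpu_ids, available_gpu_count):
    """Hypothetical helper: pick devices given what the machine has."""
    if available_gpu_count == 0:
        return ["cpu"]  # no GPU available: fall back to CPU
    # prune the requested GPU list down to GPUs that actually exist
    valid = [g for g in requested_gpu_ids if g < available_gpu_count]
    return ["gpu:%d" % g for g in valid] if valid else ["cpu"]

print(choose_devices([0, 1, 2, 3], 2))  # ['gpu:0', 'gpu:1']
print(choose_devices([0], 0))           # ['cpu']
```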