Scott Leishman
0823d30839
Merge branch 'master' into tf_version_bump
2018-12-21 10:58:41 -05:00
Scott Leishman
7cda5179c6
add CI status badge.
2018-12-21 10:50:28 -05:00
Zach Dwiel
8e3ee818f8
update circle ci config to match new golden test presets ( #167 )
2018-12-21 10:10:31 -05:00
x77a1
02f2db1264
Merge branch 'master' into master
2018-12-17 12:44:27 -08:00
Gal Leibovich
4c914c057c
fix for finding the right filter checkpoint to restore + do not update internal filter state when evaluating + fix SharedRunningStats checkpoint filenames ( #147 )
2018-12-17 21:36:27 +02:00
Neta Zmora
b4bc8a476c
Bug fix: when enabling 'heatup_using_network_decisions', we should add the configured noise ( #162 )
...
During heatup we may want to add agent-generated-noise (i.e. not "simple" random noise).
This is enabled by setting 'heatup_using_network_decisions' to True. For example:
    agent_params = DDPGAgentParameters()
    agent_params.algorithm.heatup_using_network_decisions = True
The fix ensures that the correct noise is added not just while in the TRAINING phase, but
also during the HEATUP phase.
No one has enabled 'heatup_using_network_decisions' yet, which explains why this problem
arose only now (in my configuration I do enable 'heatup_using_network_decisions').
2018-12-17 10:08:54 +02:00
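A minimal sketch of the behaviour described in commit b4bc8a476c, assuming a simplified agent. The `Phase` enum, `select_action`, `noise_fn`, and `random_action_fn` are hypothetical names introduced for illustration and are not Coach's actual API; only the `heatup_using_network_decisions` flag comes from the commit message.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names, used only for this sketch.
    HEATUP = "heatup"
    TRAINING = "training"
    TEST = "test"


def select_action(policy_action, phase, heatup_using_network_decisions,
                  noise_fn, random_action_fn):
    """Return the action to take for the given phase.

    policy_action    -- action proposed by the network (a float here)
    noise_fn         -- callable returning the configured exploration noise
    random_action_fn -- callable returning a plain random action
    """
    if phase is Phase.TRAINING:
        # Training always adds the configured noise.
        return policy_action + noise_fn()
    if phase is Phase.HEATUP:
        if heatup_using_network_decisions:
            # The behaviour the fix describes: the configured noise is
            # also added when heatup uses the network's decisions.
            return policy_action + noise_fn()
        # Otherwise heatup falls back to "simple" random actions.
        return random_action_fn()
    # Evaluation: no exploration noise.
    return policy_action
```

With a fixed noise of 0.5, `select_action(1.0, Phase.HEATUP, True, lambda: 0.5, lambda: -9.0)` returns the network's action plus the noise, whereas passing `False` returns the random action instead.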
gouravr
b8d21c73bf
Comment out the part of the test 'test_basic_rl_graph_manager_with_cartpole_dqn_and_repeated_checkpoint_restore' that runs in an infinite loop
2018-12-16 10:56:40 -08:00
x77a1
1f0980c448
Merge branch 'master' into master
2018-12-16 09:37:00 -08:00
Gal Leibovich
f9ee526536
Fix for issue #128 - circular DQN import ( #130 )
2018-12-16 16:06:44 +02:00
gouravr
801aed5e10
Changes to avoid memory leak in rollout worker
...
Currently, the rollout worker calls restore_checkpoint repeatedly to load the latest model into memory. The restore_checkpoint function calls the checkpoint saver, which uses GlobalVariablesSaver, and GlobalVariablesSaver does not release its references to the previous model's variables. As a result, memory keeps growing until the rollout worker crashes.
This change avoids using the checkpoint saver in the rollout worker, as I believe it is not needed in this code path.
Also added a test that easily reproduces the issue using the CartPole example. We were also seeing this issue with the AWS DeepRacer implementation, and this change avoids the memory leak there as well.
2018-12-15 12:26:31 -08:00
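The leak pattern described in commit 801aed5e10 can be illustrated with a small, framework-free sketch. Everything here is an assumption made for illustration: `ModelVariables`, `AccumulatingSaver`, `restore_via_saver`, and `restore_directly` are hypothetical stand-ins and do not reproduce GlobalVariablesSaver or the rollout worker's real code.

```python
import gc
import weakref


class ModelVariables:
    """Hypothetical stand-in for one restored set of model variables."""

    def __init__(self, version):
        self.version = version
        self.weights = [0.0] * 1000


class AccumulatingSaver:
    """Hypothetical saver that never releases what it has restored."""

    def __init__(self):
        self._seen = []

    def restore(self, variables):
        # Keeping a reference here is what makes memory grow per restore.
        self._seen.append(variables)
        return variables

    def retained_count(self):
        return len(self._seen)


def restore_via_saver(saver, versions):
    """Repeated restores through the saver: every old model stays alive."""
    model = None
    for version in versions:
        model = saver.restore(ModelVariables(version))
    return model


def restore_directly(versions):
    """Repeated restores without the saver: old models can be collected."""
    model = None
    first_ref = None
    for version in versions:
        new_model = ModelVariables(version)
        if first_ref is None:
            # Weak reference lets us observe whether the first model is freed.
            first_ref = weakref.ref(new_model)
        model = new_model
    gc.collect()
    return model, first_ref
```

After three restores through `AccumulatingSaver`, all three variable sets are still referenced. With `restore_directly`, the weak reference to the first set is dead once a later restore has replaced it.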
Scott Leishman
aa1dfd7599
Bump intel optimized tensorflow to 1.12.0
2018-12-14 10:15:19 -05:00
zach dwiel
e08accdc22
allow case insensitive selected level name matching
2018-12-11 12:35:30 -05:00
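One way case-insensitive level matching (commit e08accdc22) might look, as a sketch only. `match_level` and its arguments are hypothetical and are not taken from the Coach source.

```python
def match_level(selected, levels):
    """Return the canonical level name matching `selected`, ignoring case.

    Raises ValueError if no level matches.
    """
    # Map lower-cased names back to their canonical spelling.
    lookup = {name.lower(): name for name in levels}
    key = selected.lower()
    if key not in lookup:
        raise ValueError(
            "Unknown level {!r}; expected one of {}".format(selected, sorted(levels))
        )
    return lookup[key]
```

For example, `match_level("cartpole-v0", ["CartPole-v0", "Pendulum-v0"])` resolves to the canonical spelling `"CartPole-v0"`.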
Zach Dwiel
d0248e03c6
add meaningful error message in the event that the action space is not one that can be used ( #151 )
2018-12-11 09:09:24 +02:00
Gal Leibovich
f12857a8c7
Docs changes - fixing blogpost links, removing importing all exploration policies ( #139 )
...
* updated docs
* removing imports for all exploration policies in __init__ + setting the right blog-post link
* small cleanups
2018-12-05 16:16:16 -05:00
Sina Afrooze
155b78b995
Fix warning on import TF or MxNet, when only one of the frameworks is installed ( #140 )
2018-12-05 11:52:24 +02:00
Ryan Peach
9e66bb653e
Enable creating custom tensorflow heads, embedders, and middleware. ( #135 )
...
Allowing components to have a path property.
2018-12-05 11:40:06 +02:00
Ryan Peach
3c58ed740b
'CompositeAgent' object has no attribute 'handle_episode_ended' ( #136 )
2018-12-05 11:28:16 +02:00
Ryan Peach
436b16016e
Added num_transitions to Memory interface ( #137 )
2018-12-05 10:33:25 +02:00
Gal Leibovich
3e281b467b
Update docs_raw README.md ( #138 )
...
* Update README.md
2018-12-03 05:39:17 -08:00
Ryan Peach
28e5b8b612
Minor bugfix on RewardFilter in Readme ( #133 )
2018-11-30 16:02:08 -08:00
Scott Leishman
3e67eac9e6
Merge pull request #131 from ryanpeach/patch-2
...
NoOutputFilter isn't set in tutorial.
2018-11-30 15:55:34 -08:00
Ryan Peach
f678ae7cb8
NoOutputFilter isn't set in tutorial.
2018-11-29 17:50:50 -05:00
Ajay Deshpande
0dd39b20ca
Removing badge
2018-11-28 09:59:08 -08:00
Ajay Deshpande
15fabf6ec3
Removing badge
2018-11-28 09:19:32 -08:00
Gal Novik
533bb43720
Merge pull request #125 from NervanaSystems/0.11.0-release
...
0.11.0 release
2018-11-28 01:16:01 +02:00
Ajay Deshpande
e877920dd5
Merge pull request #126 from NervanaSystems/ci_updates
...
CI related updates
2018-11-27 14:58:26 -08:00
Scott Leishman
3601d9bc45
CI related updates
2018-11-27 21:53:46 +00:00
Gal Novik
4e0d018d5f
updated algorithms image in README
2018-11-27 23:12:13 +02:00
Gal Novik
fc6604c09c
added missing license headers
2018-11-27 22:43:40 +02:00
Gal Novik
1e618647ab
adding .nojekyll file for github pages to function properly
2018-11-27 22:35:16 +02:00
Gal Novik
7e3aca22eb
Documentation fix
2018-11-27 22:32:46 +02:00
Gal Novik
05c1005e94
Updated README and added .nojekyll file for github pages to work properly
2018-11-27 22:11:28 +02:00
Balaji Subramaniam
d06197f663
Add documentation on distributed Coach. ( #158 )
...
* Added documentation on distributed Coach.
2018-11-27 12:26:15 +02:00
Scott Leishman
e3ecf445e2
ensure we pull from main coach container layers as cache. ( #106 )
2018-11-26 17:09:02 -08:00
Gal Leibovich
5674749ed5
workaround for resolving the issue of restoring a multi-node training checkpoint to single worker ( #156 )
2018-11-26 00:08:43 +02:00
Gal Leibovich
ab10852ad9
hacky way to resolve the checkpointing issue ( #154 )
2018-11-25 16:14:15 +02:00
Gal Leibovich
11170d5ba3
fix dist. tf ( #153 )
2018-11-25 14:02:24 +02:00
Sina Afrooze
19a68812f6
Added ONNX compatible broadcast_like function ( #152 )
...
- Also simplified the hybrid_clip implementation.
2018-11-25 11:23:18 +02:00
Balaji Subramaniam
8df425b6e1
Update how save checkpoint secs arg is handled in distributed Coach. ( #151 )
2018-11-25 00:05:24 -08:00
Thom Lane
de9b707fe1
Changed run_multiple_seeds to support mxnet. And fix other bugs. ( #122 )
2018-11-25 08:33:09 +02:00
Sina Afrooze
77fb561668
Added code to fall back to CPU if GPU not available. ( #150 )
...
- Code will also prune the GPU list if more GPUs are requested than are available.
2018-11-25 08:32:26 +02:00
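The fallback and pruning behaviour described in commit 77fb561668 could be sketched as follows. This is an illustration under stated assumptions: `resolve_devices` and the `"cpu"` / `"gpu:N"` device strings are hypothetical, and the real code obtains the available GPU count from the framework rather than from an argument.

```python
def resolve_devices(requested_gpus, available_gpu_count):
    """Pick devices given how many GPUs were requested and how many exist.

    Falls back to CPU when no GPU is available (or none is requested) and
    prunes the request when it exceeds the number of available GPUs.
    """
    if available_gpu_count <= 0 or requested_gpus <= 0:
        return ["cpu"]
    usable = min(requested_gpus, available_gpu_count)
    return ["gpu:{}".format(index) for index in range(usable)]
```

Requesting four GPUs on a two-GPU machine yields `["gpu:0", "gpu:1"]`, and any request on a machine without GPUs yields `["cpu"]`.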
Sina Afrooze
7d25477942
Add observation_space_type to GymEnvironmentParameters so that presets can state it explicitly. ( #145 )
2018-11-25 07:11:48 +02:00
Balaji Subramaniam
bf2036b284
S3 optimization - save only the latest checkpoint. ( #148 )
2018-11-23 22:17:36 -08:00
Balaji Subramaniam
13d2679af4
Sync experiment dir, videos, gifs to S3. ( #147 )
2018-11-23 20:52:12 -08:00
Sina Afrooze
5332013bd1
Implement framework-agnostic rollout and training workers ( #137 )
...
* Added checkpoint state file to coach checkpointing.
* Removed TF specific code from rollout_worker, training_worker, and s3_data_store
2018-11-23 18:05:44 -08:00
Ajay Deshpande
4a6c404070
Adding worker logs and plumbed task_parameters to distributed coach ( #130 )
2018-11-23 15:35:11 -08:00
Gal Leibovich
2b4c9c6774
Removing graph_manager param ( #141 )
2018-11-23 11:42:54 -08:00
Gal Leibovich
a1c56edd98
Fixes for having NumpySharedRunningStats syncing on multi-node ( #139 )
...
1. Using the standard checkpoint prefix so that the data store can grab it and sync it to S3.
2. Removing the reference to Redis so that it is not pickled.
3. Enabling restoring a checkpoint that was saved by a single-node-multiple-worker run into a single-worker run.
2018-11-23 16:11:47 +02:00
Sina Afrooze
87a7848b0a
Moved tf.variable_scope and tf.device calls to framework-specific architecture ( #136 )
2018-11-22 22:52:21 +02:00
shadiendrawis
559969d3dd
disabled loading for target weights ( #138 )
...
* Update savers.py
* disabled loading for target weights
2018-11-22 18:15:52 +02:00