gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-03-19 00:13:46 +01:00

Author	SHA1	Message	Date
shadiendrawis	ff816b347d	aws pip package (#118) Added support for a rl-coach-slim package.	2018-11-19 14:00:16 +02:00
Gal Novik	3817cefb12	removing box2d and atari requirements (#124)	2018-11-19 13:42:08 +02:00
Thom Lane	9210909050	Added MXNet to arg docs. (#121)	2018-11-19 11:31:28 +02:00
Gal Leibovich	d4d06aaea6	remove kubernetes dependency (#117)	2018-11-18 18:10:22 +02:00
Gal Leibovich	430e286c56	muting pygame's hello message (#116)	2018-11-18 18:02:55 +02:00
Gal Leibovich	ce85c8e8c3	Removing Egreedy from CartPole_ClippedPPO. ClippedPPO's default exploration policy is to be used instead. (#115)	2018-11-18 16:36:34 +02:00
Gal Leibovich	6caf721d1c	Numpy shared running stats (#97)	2018-11-18 14:46:40 +02:00
Gal Novik	e1fa6e9681	roboschool: updating envs to v1, fixing rendering (#112)	2018-11-18 13:38:10 +02:00
Gal Leibovich	9fd4d55623	Making stop condition optional by using a flag (#113) * apply stop condition flag (default: ignore the stop condition)	2018-11-18 13:37:39 +02:00
Gal Leibovich	449bcfb4e1	summing head losses instead of taking the mean (#98)	2018-11-18 12:20:00 +02:00
Zach Dwiel	5b11fa5656	check for local mujoco key in build process (#105) approved by scott.	2018-11-18 10:57:30 +02:00
Balaji Subramaniam	dea1826658	Re-enable NFS data store. (#101)	2018-11-16 13:55:33 -08:00
Thom Lane	a0f25034c3	Added average total reward to logging after evaluation phase completes. (#93)	2018-11-16 08:22:00 -08:00
Thom Lane	81bac050d7	Added Custom Initialisation for MXNet Heads (#86) * Added NormalizedRSSInitializer, using same method as TensorFlow backend, but changed name since ‘columns’ have different meaning in dense layer weight matrix in MXNet. * Added unit test for NormalizedRSSInitializer.	2018-11-16 08:15:43 -08:00
Balaji Subramaniam	101c55d37d	Handle both Environment Steps and Episodes on the subscriber side. (#99)	2018-11-15 14:42:21 -08:00
Thom Lane	3358e04a6a	Corrected MXNet's PPO Head for Continuous Action Spaces (#84) * Changes required for Continuous PPO Head with MXNet. Used in MountainCarContinuous_ClippedPPO. * Simplified changes for continuous ppo. * Cleaned up to avoid duplicate code, and simplified covariance creation.	2018-11-15 13:27:54 -08:00
Ajay Deshpande	fde73ced13	Simulating the act on the trainer. (#65) * Remove the use of daemon threads for Redis subscribe. * Emulate act and observe on trainer side to update internal vars.	2018-11-15 08:38:58 -08:00
Scott Leishman	fe6857eabd	broaden supported package versions (#50) * broaden supported package versions. * fix mxnet variants. Also back-out tuple deprecation change introduced in prior commit. * correct CI image deployment on master branch merge.	2018-11-15 15:29:49 +02:00
Itai Caspi	6d40ad1650	update of api docstrings across coach and tutorials [WIP] (#91) * updating the documentation website * adding the built docs * update of api docstrings across coach and tutorials 0-2 * added some missing api documentation * New Sphinx based documentation	2018-11-15 15:00:13 +02:00
Scott Leishman	524f8436a2	create per environment Dockerfiles. (#70) * create per environment Dockerfiles. Adjust CI setup to better parallelize runs. Fix a couple of issues in golden and trace tests. Update a few of the docs. * bugfix in mmc agent. Also install kubectl for CI, update badge branch. * remove integration test parallelism.	2018-11-14 07:40:22 -08:00
Balaji Subramaniam	a849c17e46	Enable distributed SharedRunningStats (#81) - Use Redis pub/sub for updating SharedRunningStats.	2018-11-13 19:17:38 +02:00
Ajay Deshpande	875d6ef017	Adding target reward and target sucess (#58) * Adding target reward * Adding target successs * Addressing comments * Using custom_reward_threshold and target_success_rate * Adding exit message * Moving success rate to environment * Making target_success_rate optional	2018-11-12 15:03:43 -08:00
Itai Caspi	0fe583186e	fixing the coach entrypoint after adding the CoachLauncher abstraction (#92)	2018-11-12 10:26:49 -08:00
Leo Dirac	2804a7c24f	Refactor launcher to be object-oriented (#63) * Import of annoy library uses failed_import mechanism.	2018-11-10 22:10:19 +02:00
Itai Caspi	3fd433ffab	fix ddpg head (#78)	2018-11-09 08:17:04 -08:00
Itai Caspi	3a0a1159e9	fixing the dropout rate code (#72) addresses issue #53	2018-11-08 16:53:47 +02:00
Itai Caspi	389c65cbbe	fix for a bug in distributed training that was introduced lately (#75)	2018-11-08 16:52:48 +02:00
Itai Caspi	83e0b09a6a	adding the missing export_onnx_graph parameter to task parameters (#73)	2018-11-08 12:52:42 +02:00
Leo Dirac	8f0415b4cc	Tweak additional_simulator_parameters for easier configuration and better error logging. (#69)	2018-11-07 11:01:12 -08:00
Gal Leibovich	49dea39d34	N-step returns for rainbow (#67) * n_step returns for rainbow * Rename CartPole_PPO -> CartPole_ClippedPPO	2018-11-07 18:33:08 +02:00
Itai Caspi	35c477c922	allowing grayscale observations in gym (#66) * allowing grayscale observations in gym	2018-11-07 17:08:10 +02:00
Sina Afrooze	5fadb9c18e	Adding mxnet components to rl_coach/architectures (#60) Adding mxnet components to rl_coach architectures. - Supports PPO and DQN - Tested with CartPole_PPO and CarPole_DQN - Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset - Checkpointing is disabled for MXNet	2018-11-07 17:07:15 +02:00
Itai Caspi	e7a91b4dc3	Fix cmd line arguments handling (#68) * refactoring the merging of the task parameters and the command line parameters * removing some unused command line arguments * fix for saving checkpoints when not passing through coach.py	2018-11-07 15:47:02 +02:00
Sina Afrooze	93571306c3	Removed tensorflow specific code in presets (#59) * Add generic layer specification for using in presets * Modify presets to use the generic scheme	2018-11-06 17:39:29 +02:00
Itai Caspi	811152126c	Export graph to ONNX (#61) Implements the ONNX graph exporting feature. Currently does not work for NAF, C51 and A3C_LSTM due to unsupported TF layers in the tf2onnx library.	2018-11-06 10:55:21 +02:00
Leo Dirac	d75df17d97	Modifying ScreenLogger to optionally not output color codes (#56) * Modifying ScreenLogger to not output color when configured by new CLI parameter	2018-11-05 15:25:49 -08:00
Balaji Subramaniam	7e7006305a	Integrate coach.py params with distributed Coach. (#42) * Integrate coach.py params with distributed Coach. * Minor improvements - Use enums instead of constants. - Reduce code duplication. - Ask experiment name with timeout.	2018-11-05 09:33:30 -08:00
Sina Afrooze	95b4fc6888	Added ability to switch between tensorflow and mxnet using -f commandline argument. (#48) NOTE: tensorflow framework works fine if mxnet is not installed in env, but mxnet will not work if tensorflow is not installed because of the code in network_wrapper.	2018-10-30 15:29:34 -07:00
Sina Afrooze	2046358ab0	Add docstring for architecture (#47) - Removed get_model() from architecture because it is only implementation detail of architecture.	2018-10-30 11:02:37 +02:00
Thom Lane	324c67d614	Bug fix: Removed reference to args which is out of scope. Conditioning now performed one level above. (#54)	2018-10-29 22:29:22 -07:00
Sina Afrooze	a888226641	Move embedder, middleware, and head parameters to framework agnostic modules. (#45) Part of #28	2018-10-29 14:46:40 -07:00
Ajay Deshpande	16b3e99f37	Setup basic CI flow (#38) Adds automated running of unit, integration tests (and optionally longer running tests)	2018-10-24 18:27:58 -07:00
Zach Dwiel	2cc6abc3c4	update CartPole_PPO not addressed during rebase (#41)	2018-10-24 16:58:25 -07:00
zach dwiel	f835ac902c	fix renaming: save_checkpoint_sec -> checkpoint_save_secs	2018-10-24 10:52:18 -04:00
Ajay Deshpande	78cf25c09a	Removing mjkey, should be injected from env var	2018-10-23 19:59:02 -04:00
Ajay Deshpande	fb2721fffa	Removing comments	2018-10-23 19:59:02 -04:00
Ajay Deshpande	9a30c26469	Adding improvements	2018-10-23 19:59:02 -04:00
zach dwiel	3ba0df7d07	update GraphManager.act specified return type	2018-10-23 19:58:17 -04:00
zach dwiel	def76b4cc6	update CartPole_PPO	2018-10-23 19:58:17 -04:00
zach dwiel	3e5e5475de	update training worker	2018-10-23 19:58:17 -04:00

307 Commits