coach

gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-03-18 07:43:47 +01:00

Author	SHA1	Message	Date
Sina Afrooze	7d25477942	Add observation_space_type to GymEnvironmentParameters so that it is possible to explicitly state that in presets. (#145 )	2018-11-25 07:11:48 +02:00
Balaji Subramaniam	bf2036b284	S3 optimization - save only the latest checkpoint. (#148 )	2018-11-23 22:17:36 -08:00
Balaji Subramaniam	13d2679af4	Sync experiment dir, videos, gifs to S3. (#147 )	2018-11-23 20:52:12 -08:00
Sina Afrooze	5332013bd1	Implement framework-agnostic rollout and training workers (#137 ) * Added checkpoint state file to coach checkpointing. * Removed TF-specific code from rollout_worker, training_worker, and s3_data_store	2018-11-23 18:05:44 -08:00
Ajay Deshpande	4a6c404070	Adding worker logs and plumbed task_parameters to distributed coach (#130 )	2018-11-23 15:35:11 -08:00
Gal Leibovich	2b4c9c6774	Removing graph_manager param (#141 )	2018-11-23 11:42:54 -08:00
Gal Leibovich	a1c56edd98	Fixes for having NumpySharedRunningStats syncing on multi-node (#139 ) 1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3. 2. Removing the reference to Redis so that it won't try to pickle that in. 3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.	2018-11-23 16:11:47 +02:00
Sina Afrooze	87a7848b0a	Moved tf.variable_scope and tf.device calls to framework-specific architecture (#136 )	2018-11-22 22:52:21 +02:00
shadiendrawis	559969d3dd	disabled loading for target weights (#138 ) * Update savers.py * disabled loading for target weights	2018-11-22 18:15:52 +02:00
Thom Lane	949d91321a	Added explicit environment closing (#129 )	2018-11-22 14:25:03 +02:00
Sina Afrooze	16cdd9a9c1	Tf checkpointing using saver mechanism (#134 )	2018-11-22 14:08:10 +02:00
Cody Hsieh	dd18959e53	Don't download when checkpoint files are already present (#109 ) * add check if checkpoint file present	2018-11-21 15:32:53 -08:00
shadiendrawis	b94239234a	Removed TF warning when training in a distributed setting (#133 ) * removed TF warning when training in a distributed setting and changed package version * revert version back to 0.11.0	2018-11-21 16:09:04 +02:00
Gal Leibovich	a112ee69f6	Save filters' internal state (#127 ) * save filters internal state * moving the restore to be made from within NumpyRunningStats	2018-11-20 17:21:48 +02:00
Sina Afrooze	67eb9e4c28	Adding checkpointing framework (#74 ) * Adding checkpointing framework as well as mxnet checkpointing implementation. - MXNet checkpoint for each network is saved in a separate file. * Adding checkpoint restore for mxnet to graph-manager * Add unit-test for get_checkpoint_state() * Added match.group() to fix unit-test failing on CI * Added ONNX export support for MXNet	2018-11-19 19:45:49 +02:00
x77a1	4da56b1ff2	Enable setting the data store factory in Graph manager (#110 ) * Enable setting the data store factory in Graph manager This change enables us to use a custom data store for storing and retrieving models. We currently need this to use a data store that loads temporary AWS credentials from disk before calling store or load operations. * Removed data store factory and introduced data store as an attribute	2018-11-19 08:35:03 -08:00
Sina Afrooze	67a90ee87e	Add tensor input type for arbitrary dimensional observation (#125 ) * Allow arbitrary dimensional observation (non vector or image) * Added creating PlanarMapsObservationSpace to GymEnvironment when number of channels is not 1 or 3	2018-11-19 16:41:12 +02:00
Thom Lane	7ba1a4393f	Channel order transpose, for image embedder. Updated unit test. (#87 )	2018-11-19 15:39:03 +02:00
shadiendrawis	ff816b347d	aws pip package (#118 ) Added support for a rl-coach-slim package.	2018-11-19 14:00:16 +02:00
Gal Novik	3817cefb12	removing box2d and atari requirements (#124 )	2018-11-19 13:42:08 +02:00
Thom Lane	9210909050	Added MXNet to arg docs. (#121 )	2018-11-19 11:31:28 +02:00
Gal Leibovich	d4d06aaea6	remove kubernetes dependency (#117 )	2018-11-18 18:10:22 +02:00
Gal Leibovich	430e286c56	muting pygame's hello message (#116 )	2018-11-18 18:02:55 +02:00
Gal Leibovich	ce85c8e8c3	Removing Egreedy from CartPole_ClippedPPO. ClippedPPO's default exploration policy is to be used instead. (#115 )	2018-11-18 16:36:34 +02:00
Gal Leibovich	6caf721d1c	Numpy shared running stats (#97 )	2018-11-18 14:46:40 +02:00
Gal Novik	e1fa6e9681	roboschool: updating envs to v1, fixing rendering (#112 )	2018-11-18 13:38:10 +02:00
Gal Leibovich	9fd4d55623	Making stop condition optional by using a flag (#113 ) * apply stop condition flag (default: ignore the stop condition)	2018-11-18 13:37:39 +02:00
Gal Leibovich	449bcfb4e1	summing head losses instead of taking the mean (#98 )	2018-11-18 12:20:00 +02:00
Zach Dwiel	5b11fa5656	check for local mujoco key in build process (#105 ) approved by scott.	2018-11-18 10:57:30 +02:00
Balaji Subramaniam	dea1826658	Re-enable NFS data store. (#101 )	2018-11-16 13:55:33 -08:00
Thom Lane	a0f25034c3	Added average total reward to logging after evaluation phase completes. (#93 )	2018-11-16 08:22:00 -08:00
Thom Lane	81bac050d7	Added Custom Initialisation for MXNet Heads (#86 ) * Added NormalizedRSSInitializer, using same method as TensorFlow backend, but changed name since ‘columns’ have different meaning in dense layer weight matrix in MXNet. * Added unit test for NormalizedRSSInitializer.	2018-11-16 08:15:43 -08:00
Balaji Subramaniam	101c55d37d	Handle both Environment Steps and Episodes on the subscriber side. (#99 )	2018-11-15 14:42:21 -08:00
Thom Lane	3358e04a6a	Corrected MXNet's PPO Head for Continuous Action Spaces (#84 ) * Changes required for Continuous PPO Head with MXNet. Used in MountainCarContinuous_ClippedPPO. * Simplified changes for continuous ppo. * Cleaned up to avoid duplicate code, and simplified covariance creation.	2018-11-15 13:27:54 -08:00
Ajay Deshpande	fde73ced13	Simulating the act on the trainer. (#65 ) * Remove the use of daemon threads for Redis subscribe. * Emulate act and observe on trainer side to update internal vars.	2018-11-15 08:38:58 -08:00
Scott Leishman	fe6857eabd	broaden supported package versions (#50 ) * broaden supported package versions. * fix mxnet variants. Also back-out tuple deprecation change introduced in prior commit. * correct CI image deployment on master branch merge.	2018-11-15 15:29:49 +02:00
Itai Caspi	6d40ad1650	update of api docstrings across coach and tutorials [WIP] (#91 ) * updating the documentation website * adding the built docs * update of api docstrings across coach and tutorials 0-2 * added some missing api documentation * New Sphinx based documentation	2018-11-15 15:00:13 +02:00
Scott Leishman	524f8436a2	create per environment Dockerfiles. (#70 ) * create per environment Dockerfiles. Adjust CI setup to better parallelize runs. Fix a couple of issues in golden and trace tests. Update a few of the docs. * bugfix in mmc agent. Also install kubectl for CI, update badge branch. * remove integration test parallelism.	2018-11-14 07:40:22 -08:00
Balaji Subramaniam	a849c17e46	Enable distributed SharedRunningStats (#81 ) - Use Redis pub/sub for updating SharedRunningStats.	2018-11-13 19:17:38 +02:00
Ajay Deshpande	875d6ef017	Adding target reward and target success (#58 ) * Adding target reward * Adding target success * Addressing comments * Using custom_reward_threshold and target_success_rate * Adding exit message * Moving success rate to environment * Making target_success_rate optional	2018-11-12 15:03:43 -08:00
Itai Caspi	0fe583186e	fixing the coach entrypoint after adding the CoachLauncher abstraction (#92 )	2018-11-12 10:26:49 -08:00
Leo Dirac	2804a7c24f	Refactor launcher to be object-oriented (#63 ) * Import of annoy library uses failed_import mechanism.	2018-11-10 22:10:19 +02:00
Itai Caspi	3fd433ffab	fix ddpg head (#78 )	2018-11-09 08:17:04 -08:00
Itai Caspi	3a0a1159e9	fixing the dropout rate code (#72 ) addresses issue #53	2018-11-08 16:53:47 +02:00
Itai Caspi	389c65cbbe	fix for a bug in distributed training that was introduced lately (#75 )	2018-11-08 16:52:48 +02:00
Itai Caspi	83e0b09a6a	adding the missing export_onnx_graph parameter to task parameters (#73 )	2018-11-08 12:52:42 +02:00
Leo Dirac	8f0415b4cc	Tweak additional_simulator_parameters for easier configuration and better error logging. (#69 )	2018-11-07 11:01:12 -08:00
Gal Leibovich	49dea39d34	N-step returns for rainbow (#67 ) * n_step returns for rainbow * Rename CartPole_PPO -> CartPole_ClippedPPO	2018-11-07 18:33:08 +02:00
Itai Caspi	35c477c922	allowing grayscale observations in gym (#66 ) * allowing grayscale observations in gym	2018-11-07 17:08:10 +02:00
Sina Afrooze	5fadb9c18e	Adding mxnet components to rl_coach/architectures (#60 ) Adding mxnet components to rl_coach architectures. - Supports PPO and DQN - Tested with CartPole_PPO and CartPole_DQN - Normalizing filters don't work right now (see #49) and are disabled in CartPole_PPO preset - Checkpointing is disabled for MXNet	2018-11-07 17:07:15 +02:00


425 Commits