gryf/coach

mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00

Author	SHA1	Message	Date
Gal Leibovich	251dc9ccc0	Preset dependent number of csv read attempts in golden testing (#334)	2019-05-28 12:19:57 +03:00
zach dwiel	7d79433c05	remove unused parameter scale_external_reward_by_intrinsic_reward_value	2019-04-09 12:14:27 -04:00
Gal Leibovich	e3c7e526c7	Batch RL (#238)	2019-03-19 18:07:09 +02:00
Gal Leibovich	d6158a5cfc	restoring from a checkpoint file (#247)	2019-03-17 16:28:09 +02:00
Gal Novik	10220be9be	Adding support for evaluation only mode with predefined number of steps (#225)	2019-03-03 10:03:45 +02:00
Gal Leibovich	f9ee526536	Fix for issue #128 - circular DQN import (#130)	2018-12-16 16:06:44 +02:00
Sina Afrooze	87a7848b0a	Moved tf.variable_scope and tf.device calls to framework-specific architecture (#136)	2018-11-22 22:52:21 +02:00
Gal Leibovich	a112ee69f6	Save filters' internal state (#127) * save filters internal state * moving the restore to be made from within NumpyRunningStats	2018-11-20 17:21:48 +02:00
Gal Leibovich	d4d06aaea6	remove kubernetes dependency (#117)	2018-11-18 18:10:22 +02:00
Gal Leibovich	9fd4d55623	Making stop condition optional by using a flag (#113) * apply stop condition flag (default: ignore the stop condition)	2018-11-18 13:37:39 +02:00
Ajay Deshpande	fde73ced13	Simulating the act on the trainer. (#65) * Remove the use of daemon threads for Redis subscribe. * Emulate act and observe on trainer side to update internal vars.	2018-11-15 08:38:58 -08:00
Itai Caspi	6d40ad1650	update of api docstrings across coach and tutorials [WIP] (#91) * updating the documentation website * adding the built docs * update of api docstrings across coach and tutorials 0-2 * added some missing api documentation * New Sphinx based documentation	2018-11-15 15:00:13 +02:00
Ajay Deshpande	875d6ef017	Adding target reward and target sucess (#58) * Adding target reward * Adding target successs * Addressing comments * Using custom_reward_threshold and target_success_rate * Adding exit message * Moving success rate to environment * Making target_success_rate optional	2018-11-12 15:03:43 -08:00
Itai Caspi	83e0b09a6a	adding the missing export_onnx_graph parameter to task parameters (#73)	2018-11-08 12:52:42 +02:00
Gal Leibovich	49dea39d34	N-step returns for rainbow (#67) * n_step returns for rainbow * Rename CartPole_PPO -> CartPole_ClippedPPO	2018-11-07 18:33:08 +02:00
Itai Caspi	e7a91b4dc3	Fix cmd line arguments handling (#68) * refactoring the merging of the task parameters and the command line parameters * removing some unused command line arguments * fix for saving checkpoints when not passing through coach.py	2018-11-07 15:47:02 +02:00
Balaji Subramaniam	7e7006305a	Integrate coach.py params with distributed Coach. (#42) * Integrate coach.py params with distributed Coach. * Minor improvements - Use enums instead of constants. - Reduce code duplication. - Ask experiment name with timeout.	2018-11-05 09:33:30 -08:00
Sina Afrooze	95b4fc6888	Added ability to switch between tensorflow and mxnet using -f commandline argument. (#48) NOTE: tensorflow framework works fine if mxnet is not installed in env, but mxnet will not work if tensorflow is not installed because of the code in network_wrapper.	2018-10-30 15:29:34 -07:00
zach dwiel	f835ac902c	fix renaming: save_checkpoint_sec -> checkpoint_save_secs	2018-10-24 10:52:18 -04:00
Zach Dwiel	61ed6b8ce4	add better defaults to TaskParameters	2018-10-23 16:40:33 -04:00
Gal Leibovich	5a8da90d32	bug-fix for dumping movies (+ small refactoring and rename 'VideoDumpMethod -> 'VideoDumpFilter')	2018-10-21 17:29:10 +03:00
Shadi Endrawis	51726a5b80	network_imporvements branch merge	2018-10-02 13:43:36 +03:00
itaicaspi-intel	a16d724963	removing some of the presets from the trace tests + more robust replay buffer loading	2018-09-12 15:26:16 +03:00
Itai Caspi	72a1d9d426	Itaicaspi/episode reset refactoring (#105) * reordering of the episode reset operation and allowing to store episodes only when they are terminated * reordering of the episode reset operation and allowing to store episodes only when they are terminated * revert tensorflow-gpu to 1.9.0 + bug fix in should_train() * tests readme file and refactoring of policy optimization agent train function * Update README.md * Update README.md * additional policy optimization train function simplifications * Updated the traces after the reordering of the environment reset * docker and jenkins files * updated the traces to the ones from within the docker container * updated traces and added control suite to the docker * updated jenkins file with the intel proxy + updated doom basic a3c test params * updated line breaks in jenkins file * added a missing line break in jenkins file * refining trace tests ignored presets + adding a configurable beta entropy value * switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue * updated benchmarks for dueling ddqn breakout and pong * allowing dynamic updates to the loss weights + bug fix in episode.update_returns * remove docker and jenkins file	2018-09-04 15:07:54 +03:00
Gal Leibovich	1aa2ab0590	parameter noise exploration - using Noisy Nets	2018-08-27 18:19:01 +03:00
Gal Novik	19ca5c24b1	pre-release 0.10.0	2018-08-13 17:11:34 +03:00

26 Commits