gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-03-19 08:23:33 +01:00

Author	SHA1	Message	Date
Guy Jacob	f52ff1784d	Fix breaking change from minio update (#469) `ResponseError` replaced by `S3Error` in new minio version	2020-12-15 10:02:16 +02:00
Guy Jacob	103d4477eb	Disable NumPy and TF2 related warnings (#463)	2020-09-24 15:11:45 +03:00
Gal Novik	c9738280fd	Require Python 3.6 + Changes to CI configuration (#452) * Change build__env jobs to pull base image of current "tag" instead of "master" image Change nightly flow so build__env jobs now gated by build_base (so change in previous bullet works in nightly) Bugfix in CheckpointDataStore: Call to object.__init__ with parameters * Disabling unstable Doom A3C and ACER golden tests	2020-07-26 16:11:22 +03:00
Gal Novik	79b05a8105	Wolpertinger preset failure fix (#434) Numpy 1.18 fails to cast float to int as part of the wolpertinger preset run	2020-01-14 16:26:38 +02:00
shadiendrawis	188b86369a	fix e-greedy in case action values were equal (#423)	2019-11-10 17:20:44 +02:00
shadiendrawis	6ca91b9090	add reset internal state to rollout worker (#421)	2019-11-03 14:42:51 +02:00
Gal Leibovich	66fada7f78	Remove assertion from BatchRLGraphManager	2019-10-22 11:54:14 +03:00
shadiendrawis	5ad5a58350	fix atari stack overflow (#412)	2019-10-06 18:14:21 +03:00
shadiendrawis	0a712ecc94	Fix numpy shared running stats to support images (#411)	2019-10-06 12:16:38 +03:00
Gal Leibovich	79a4161eca	Workaround for dumping gifs through the Python API (#405)	2019-09-26 12:21:25 +03:00
Gal Leibovich	c7949d7011	Fix Atari Schedule Heatup	2019-09-08 16:57:38 +03:00
Gal Leibovich	138ced23ba	RL in Large Discrete Action Spaces - Wolpertinger Agent (#394) * Currently this is specific to the case of discretizing a continuous action space. Can easily be adapted to other case by feeding the kNN otherwise, and removing the usage of a discretizing output action filter	2019-09-08 12:53:49 +03:00
Zach Dwiel	7b0fccb041	Add RedisDataStore (#295) * GraphManager.set_session also sets self.sess * make sure that GraphManager.fetch_from_worker uses training phase * remove unnecessary phase setting in training worker * reorganize rollout worker * provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway * allow dividing TrainingSteps and EnvironmentSteps * add timestamps to the log * added redis data store * conflict merge fix	2019-08-28 21:15:58 +03:00
Gal Leibovich	c1d1fae342	Distiller's AMC induced changes (#359) * override episode rewards with the last transition reward * EWMA normalization filter * allowing control over when the pre_network filter runs	2019-08-05 10:24:58 +03:00
Gal Novik	2697142d5a	Release 1.0.0 (#382) * Updating README * Shortening test cycles	2019-07-24 16:10:58 +03:00
Gal Leibovich	19ad2d60a7	Batch RL Tutorial (#372)	2019-07-14 18:43:48 +03:00
Gal Novik	b82414138d	Workaround the OSError due to bad address failure on the CI runs (#370) workaround the OSError due to bad address failure on the CI runs	2019-07-07 17:11:19 +03:00
Gal Leibovich	587b74e04a	Remove double call to reset_internal_state() on gym environments (#364)	2019-07-02 13:43:23 +03:00
anabwan	a576ab5659	tests: Removed mxnet from functional tests + minor fix on rewards (#362) * ci: change workflow * changed timeout * fix function reach reward * print logs * removing mxnet * res'	2019-06-27 18:52:29 +03:00
Gal Leibovich	d6795bd524	batchnorm fixes + disabling batchnorm in DDPG (#353) Co-authored-by: James Casbon <casbon+gh@gmail.com>	2019-06-23 11:28:22 +03:00
anabwan	7b5d6a3f03	tests: stabling functional tests (#355) * tests: stabling functional tests * functional removed	2019-06-20 15:30:47 +03:00
shadiendrawis	8e812ef82f	Coach as a library (#348) * CoachInterface + tutorial * Some improvements and typo fixes * merge tutorial 0 and 4 * typo fix + additional tutorial changes * tutorial changes * added reading signals and experiment path argument	2019-06-19 18:05:03 +03:00
Gal Leibovich	7eb884c5b2	TD3 (#338)	2019-06-16 11:11:21 +03:00
Timo Kaufmann	8df3c46756	Do not hardcode path to bash (#332)	2019-06-10 20:10:28 +03:00
Gal Leibovich	a1bb8eef89	DDPG Critic Head Bug Fix (#344) * A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean	2019-06-05 17:47:56 +03:00
anabwan	0aa5359d63	tests: added assert for cp param and changing test args order (#342)	2019-06-05 00:16:50 +03:00
Gal Leibovich	4c996e147e	applying filters for a csv loaded dataset + some bug-fixes in data loading (#319)	2019-05-28 15:44:55 +03:00
anabwan	f5ba14575c	tests: print logs on failure + fix -cp param (#327) * tests: pring logs on failure * fix import * added job to circleci * fix functional * removed debug job	2019-05-28 13:45:43 +03:00
Gal Leibovich	251dc9ccc0	Preset dependent number of csv read attempts in golden testing (#334)	2019-05-28 12:19:57 +03:00
Gal Leibovich	9e9c4fd332	Create a dataset using an agent (#306) Generate a dataset using an agent (allowing to select between this and a random dataset)	2019-05-28 09:34:49 +03:00
anabwan	342b7184bc	Enabling Coach Documentation to be run even when environments are not installed (#326)	2019-05-27 10:46:07 +03:00
James Casbon	2b7d536da4	Add head regularization costs to tf.losses (#292)	2019-05-26 17:15:42 +03:00
anabwan	3b6e413532	tests: fix traces and changing workflow jobs (#316) * tests: fix traces export presets * tests: increase time for traces * tests * remove approval * fix approval * fix ap * change worflow jobs * fix path * fix repo path * change run traces * adding assert * fix assert	2019-05-26 15:27:36 +03:00
anabwan	b567091d2e	removed timestep_limit due to gym version upgrade (#325) * removed timestep_limit due to gym version update * removed _past_limit wrapper	2019-05-26 13:58:16 +03:00
Gal Leibovich	30c2b2fc45	moving to skimage.transform.resize (#321)	2019-05-23 13:38:01 +03:00
Gal Leibovich	acceb03ac0	bug fixes for OPE (#311)	2019-05-21 16:39:11 +03:00
Gal Leibovich	deb0251367	bug fix following PR #191 (#313)	2019-05-12 13:42:45 -07:00
Gal Novik	aa9f3cefaf	Printing input size as part of network summary (#310)	2019-05-12 15:40:02 +03:00
anabwan	ffb55b4142	tests: update traces (#302) * Traces folder removed from repo and moved to S3 * Traces jobs and update will use directly the S3 files	2019-05-07 10:04:05 +03:00
anabwan	740359587d	tests: fixed nightly (#301) * tests: fixed nightly * tests: temp testing functional tests * tests: temp testing functional tests * tests: add seed to -cp * test: last fix	2019-05-05 08:28:57 +03:00
Gal Leibovich	582921ffe3	OPE: Weighted Importance Sampling (#299)	2019-05-02 19:25:42 +03:00
guyk1971	74db141d5e	SAC algorithm (#282) * SAC algorithm * SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix problem in gradient calculation. Now SAC agents is able to train. gym_environment - fixing an error in access to gym.spaces * Soft Actor Critic - code cleanup * code cleanup * V-head initialization fix * SAC benchmarks * SAC Documentation * typo fix * documentation fixes * documentation and version update * README typo	2019-05-01 18:37:49 +03:00
Ajay Deshpande	33dc29ee99	Uploading checkpoint if crd provided (#191) * Uploading checkpoint if crd provided * Changing the calculation of total steps because of a recent change in core_types Fixes #195	2019-04-26 12:27:33 -07:00
anabwan	b3db9ce77d	tests: fixed failed tests - stabling CI (#298) * tests: stabling CI * tests: fix failed tests - stabling CI * fix get csv files. - fixed seed test * fix clres on conftest - now can modify paths during test run. - this fixed the mxnet checkpoint test * tests: fix comments	2019-04-23 15:12:11 +03:00
Gal Leibovich	9f625c197b	fix for fetch rendering (#297) * fix for fetch rendering - removing code which was once required with older gym versions. images are now rendered correctly by default with the latest gym. * fixing mujoco camera id failure	2019-04-21 17:37:14 +03:00
Gal Leibovich	4741b0b916	BCQ variant on top of DDQN (#276) * kNN based model for predicting which actions to drop * fix for seeds with batch rl	2019-04-16 17:06:23 +03:00
Federico Andres Lois	bdb9b224a8	Include missing RegressionHead. (#263)	2019-04-16 15:24:06 +03:00
anabwan	20a8dea0dd	tests: minor fix for functional tests (#289) * tests: minor fix for functional tests * tests: fix value	2019-04-15 12:28:23 +03:00
zach dwiel	88f9c926ab	update comment describing why the output filters don't modify Agent.last_action_info	2019-04-09 12:14:27 -04:00
zach dwiel	fd2c210915	rename AgentInterface.emulate_observe_on_trainer or observe_transition and call from AgentInterface.observe	2019-04-09 12:14:27 -04:00

294 Commits