gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-03-18 15:53:35 +01:00

Author	SHA1	Message	Date
Gal Leibovich	c1d1fae342	Distiller's AMC induced changes (#359) * override episode rewards with the last transition reward * EWMA normalization filter * allowing control over when the pre_network filter runs	2019-08-05 10:24:58 +03:00
Scott Leishman	7df67dafa3	update to point at new CI cluster. (#385)	2019-08-04 13:55:04 +03:00
Gal Novik	2697142d5a	Release 1.0.0 (#382) * Updating README * Shortening test cycles	2019-07-24 16:10:58 +03:00
Gal Leibovich	718597ce9a	Fixes to Batch RL tutorial (#378)	2019-07-16 11:22:42 +03:00
Gal Novik	0a4cc7e081	Additional cmd line examples (#377) Adding command line examples to the Quick Start Guide tutorial	2019-07-15 12:32:59 +03:00
Gal Leibovich	19ad2d60a7	Batch RL Tutorial (#372)	2019-07-14 18:43:48 +03:00
Gal Novik	b82414138d	Workaround the OSError due to bad address failure on the CI runs (#370) workaround the OSError due to bad address failure on the CI runs	2019-07-07 17:11:19 +03:00
Gal Leibovich	587b74e04a	Remove double call to reset_internal_state() on gym environments (#364)	2019-07-02 13:43:23 +03:00
anabwan	a576ab5659	tests: Removed mxnet from functional tests + minor fix on rewards (#362) * ci: change workflow * changed timeout * fix function reach reward * print logs * removing mxnet * res'	2019-06-27 18:52:29 +03:00
anabwan	30c64d0656	using gym=0.12.5 instead of latest (#360) * using gym=0.12.5 instead of latest * changing docker gym version * changing dockingfile gym version	2019-06-24 10:34:28 +03:00
Gal Leibovich	d6795bd524	batchnorm fixes + disabling batchnorm in DDPG (#353) Co-authored-by: James Casbon <casbon+gh@gmail.com>	2019-06-23 11:28:22 +03:00
anabwan	7b5d6a3f03	tests: stabling functional tests (#355) * tests: stabling functional tests * functional removed	2019-06-20 15:30:47 +03:00
shadiendrawis	8e812ef82f	Coach as a library (#348) * CoachInterface + tutorial * Some improvements and typo fixes * merge tutorial 0 and 4 * typo fix + additional tutorial changes * tutorial changes * added reading signals and experiment path argument	2019-06-19 18:05:03 +03:00
anabwan	1c90bc22a1	ci: using serial jobs in nightly (#350)	2019-06-17 10:53:36 +03:00
Gal Leibovich	7eb884c5b2	TD3 (#338)	2019-06-16 11:11:21 +03:00
Timo Kaufmann	8df3c46756	Do not hardcode path to bash (#332)	2019-06-10 20:10:28 +03:00
Gal Leibovich	a1bb8eef89	DDPG Critic Head Bug Fix (#344) * A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean	2019-06-05 17:47:56 +03:00
anabwan	0aa5359d63	tests: added assert for cp param and changing test args order (#342)	2019-06-05 00:16:50 +03:00
Gal Novik	e49aac05aa	Update README.md (#341) Adding some links to the tutorials from the README	2019-06-04 11:35:34 +03:00
anabwan	f6d5e60eff	Added build base for nightly (#340) * Added build base for nightly * fix requires * remove commetted code	2019-06-03 23:04:34 +03:00
Gal Novik	6e7e7f6d3d	Update setup.py to 0.12.1 (#337)	2019-05-30 10:13:36 +03:00
anabwan	23df868d32	Removed unnecessary futures dependency (#336)	2019-05-29 14:34:48 +03:00
Gal Leibovich	4c996e147e	applying filters for a csv loaded dataset + some bug-fixes in data loading (#319)	2019-05-28 15:44:55 +03:00
anabwan	6319387357	increase timeout for golden tests (#335)	2019-05-28 14:19:11 +03:00
anabwan	f5ba14575c	tests: print logs on failure + fix -cp param (#327) * tests: pring logs on failure * fix import * added job to circleci * fix functional * removed debug job	2019-05-28 13:45:43 +03:00
Gal Leibovich	251dc9ccc0	Preset dependent number of csv read attempts in golden testing (#334)	2019-05-28 12:19:57 +03:00
anabwan	ddffac8570	fixed release version (#333) * fixed release version * update docs	2019-05-28 11:11:15 +03:00
Gal Leibovich	9e9c4fd332	Create a dataset using an agent (#306) Generate a dataset using an agent (allowing to select between this and a random dataset)	2019-05-28 09:34:49 +03:00
anabwan	342b7184bc	Enabling Coach Documentation to be run even when environments are not installed (#326)	2019-05-27 10:46:07 +03:00
James Casbon	2b7d536da4	Add head regularization costs to tf.losses (#292)	2019-05-26 17:15:42 +03:00
anabwan	3b6e413532	tests: fix traces and changing workflow jobs (#316) * tests: fix traces export presets * tests: increase time for traces * tests * remove approval * fix approval * fix ap * change worflow jobs * fix path * fix repo path * change run traces * adding assert * fix assert	2019-05-26 15:27:36 +03:00
anabwan	b567091d2e	removed timestep_limit due to gym version upgrade (#325) * removed timestep_limit due to gym version update * removed _past_limit wrapper	2019-05-26 13:58:16 +03:00
Gal Leibovich	30c2b2fc45	moving to skimage.transform.resize (#321)	2019-05-23 13:38:01 +03:00
Gal Leibovich	acceb03ac0	bug fixes for OPE (#311)	2019-05-21 16:39:11 +03:00
anabwan	85d70dd7d5	tests: fix traces export presets (#315)	2019-05-13 15:32:30 +03:00
anabwan	f78bbbdbd1	tests: weekly deployment (#304) * tests: weekly deployment * running golden_tests * running all traces * run time: Friday @ 04:00AM	2019-05-13 14:51:38 +03:00
Gal Leibovich	deb0251367	bug fix following PR #191 (#313)	2019-05-12 13:42:45 -07:00
Gal Novik	aa9f3cefaf	Printing input size as part of network summary (#310)	2019-05-12 15:40:02 +03:00
anabwan	ffb55b4142	tests: update traces (#302) * Traces folder removed from repo and moved to S3 * Traces jobs and update will use directly the S3 files	2019-05-07 10:04:05 +03:00
anabwan	740359587d	tests: fixed nightly (#301) * tests: fixed nightly * tests: temp testing functional tests * tests: temp testing functional tests * tests: add seed to -cp * test: last fix	2019-05-05 08:28:57 +03:00
Gal Leibovich	582921ffe3	OPE: Weighted Importance Sampling (#299)	2019-05-02 19:25:42 +03:00
guyk1971	74db141d5e	SAC algorithm (#282) * SAC algorithm * SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix problem in gradient calculation. Now SAC agents is able to train. gym_environment - fixing an error in access to gym.spaces * Soft Actor Critic - code cleanup * code cleanup * V-head initialization fix * SAC benchmarks * SAC Documentation * typo fix * documentation fixes * documentation and version update * README typo	2019-05-01 18:37:49 +03:00
Ajay Deshpande	33dc29ee99	Uploading checkpoint if crd provided (#191) * Uploading checkpoint if crd provided * Changing the calculation of total steps because of a recent change in core_types Fixes #195	2019-04-26 12:27:33 -07:00
anabwan	b3db9ce77d	tests: fixed failed tests - stabling CI (#298) * tests: stabling CI * tests: fix failed tests - stabling CI * fix get csv files. - fixed seed test * fix clres on conftest - now can modify paths during test run. - this fixed the mxnet checkpoint test * tests: fix comments	2019-04-23 15:12:11 +03:00
Gal Leibovich	9f625c197b	fix for fetch rendering (#297) * fix for fetch rendering - removing code which was once required with older gym versions. images are now rendered correctly by default with the latest gym. * fixing mujoco camera id failure	2019-04-21 17:37:14 +03:00
anabwan	f14915cada	tests: removed Starcraft from CI (#296) * tests: removed Starcraft from CI * tests: fix comment * tests: fix mujoco	2019-04-21 13:51:14 +03:00
Gal Leibovich	4741b0b916	BCQ variant on top of DDQN (#276) * kNN based model for predicting which actions to drop * fix for seeds with batch rl	2019-04-16 17:06:23 +03:00
Federico Andres Lois	bdb9b224a8	Include missing RegressionHead. (#263)	2019-04-16 15:24:06 +03:00
anabwan	20a8dea0dd	tests: minor fix for functional tests (#289) * tests: minor fix for functional tests * tests: fix value	2019-04-15 12:28:23 +03:00
zach dwiel	88f9c926ab	update comment describing why the output filters don't modify Agent.last_action_info	2019-04-09 12:14:27 -04:00

479 Commits