coach

gryf/coach

mirror of https://github.com/gryf/coach.git synced 2026-07-08 02:16:32 +02:00

Author	SHA1	Message	Date
Gal Leibovich	d6795bd524	batchnorm fixes + disabling batchnorm in DDPG (#353 ) Co-authored-by: James Casbon <casbon+gh@gmail.com>	2019-06-23 11:28:22 +03:00
anabwan	7b5d6a3f03	tests: stabilizing functional tests (#355 ) * tests: stabilizing functional tests * functional removed	2019-06-20 15:30:47 +03:00
shadiendrawis	8e812ef82f	Coach as a library (#348 ) * CoachInterface + tutorial * Some improvements and typo fixes * merge tutorial 0 and 4 * typo fix + additional tutorial changes * tutorial changes * added reading signals and experiment path argument	2019-06-19 18:05:03 +03:00
Gal Leibovich	7eb884c5b2	TD3 (#338 )	2019-06-16 11:11:21 +03:00
Timo Kaufmann	8df3c46756	Do not hardcode path to bash (#332 )	2019-06-10 20:10:28 +03:00
Gal Leibovich	a1bb8eef89	DDPG Critic Head Bug Fix (#344 ) * A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean	2019-06-05 17:47:56 +03:00
anabwan	0aa5359d63	tests: added assert for cp param and changing test args order (#342 )	2019-06-05 00:16:50 +03:00
Gal Leibovich	4c996e147e	applying filters for a csv loaded dataset + some bug-fixes in data loading (#319 )	2019-05-28 15:44:55 +03:00
anabwan	f5ba14575c	tests: print logs on failure + fix -cp param (#327 ) * tests: print logs on failure * fix import * added job to circleci * fix functional * removed debug job	2019-05-28 13:45:43 +03:00
Gal Leibovich	251dc9ccc0	Preset dependent number of csv read attempts in golden testing (#334 )	2019-05-28 12:19:57 +03:00
Gal Leibovich	9e9c4fd332	Create a dataset using an agent (#306 ) Generate a dataset using an agent (allowing to select between this and a random dataset)	2019-05-28 09:34:49 +03:00
anabwan	342b7184bc	Enabling Coach Documentation to be run even when environments are not installed (#326 )	2019-05-27 10:46:07 +03:00
James Casbon	2b7d536da4	Add head regularization costs to tf.losses (#292 )	2019-05-26 17:15:42 +03:00
anabwan	3b6e413532	tests: fix traces and changing workflow jobs (#316 ) * tests: fix traces export presets * tests: increase time for traces * tests * remove approval * fix approval * fix ap * change workflow jobs * fix path * fix repo path * change run traces * adding assert * fix assert	2019-05-26 15:27:36 +03:00
anabwan	b567091d2e	removed timestep_limit due to gym version upgrade (#325 ) * removed timestep_limit due to gym version update * removed _past_limit wrapper	2019-05-26 13:58:16 +03:00
Gal Leibovich	30c2b2fc45	moving to skimage.transform.resize (#321 )	2019-05-23 13:38:01 +03:00
Gal Leibovich	acceb03ac0	bug fixes for OPE (#311 )	2019-05-21 16:39:11 +03:00
Gal Leibovich	deb0251367	bug fix following PR #191 (#313 )	2019-05-12 13:42:45 -07:00
Gal Novik	aa9f3cefaf	Printing input size as part of network summary (#310 )	2019-05-12 15:40:02 +03:00
anabwan	ffb55b4142	tests: update traces (#302 ) * Traces folder removed from repo and moved to S3 * Traces jobs and update will use directly the S3 files	2019-05-07 10:04:05 +03:00
anabwan	740359587d	tests: fixed nightly (#301 ) * tests: fixed nightly * tests: temp testing functional tests * tests: temp testing functional tests * tests: add seed to -cp * test: last fix	2019-05-05 08:28:57 +03:00
Gal Leibovich	582921ffe3	OPE: Weighted Importance Sampling (#299 )	2019-05-02 19:25:42 +03:00
guyk1971	74db141d5e	SAC algorithm (#282 ) * SAC algorithm * SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix problem in gradient calculation. Now SAC agents is able to train. gym_environment - fixing an error in access to gym.spaces * Soft Actor Critic - code cleanup * code cleanup * V-head initialization fix * SAC benchmarks * SAC Documentation * typo fix * documentation fixes * documentation and version update * README typo	2019-05-01 18:37:49 +03:00
Ajay Deshpande	33dc29ee99	Uploading checkpoint if crd provided (#191 ) * Uploading checkpoint if crd provided * Changing the calculation of total steps because of a recent change in core_types Fixes #195	2019-04-26 12:27:33 -07:00
anabwan	b3db9ce77d	tests: fixed failed tests - stabilizing CI (#298 ) * tests: stabilizing CI * tests: fix failed tests - stabilizing CI * fix get csv files. - fixed seed test * fix clres on conftest - now can modify paths during test run. - this fixed the mxnet checkpoint test * tests: fix comments	2019-04-23 15:12:11 +03:00
Gal Leibovich	9f625c197b	fix for fetch rendering (#297 ) * fix for fetch rendering - removing code which was once required with older gym versions. images are now rendered correctly by default with the latest gym. * fixing mujoco camera id failure	2019-04-21 17:37:14 +03:00
Gal Leibovich	4741b0b916	BCQ variant on top of DDQN (#276 ) * kNN based model for predicting which actions to drop * fix for seeds with batch rl	2019-04-16 17:06:23 +03:00
Federico Andres Lois	bdb9b224a8	Include missing RegressionHead. (#263 )	2019-04-16 15:24:06 +03:00
anabwan	20a8dea0dd	tests: minor fix for functional tests (#289 ) * tests: minor fix for functional tests * tests: fix value	2019-04-15 12:28:23 +03:00
zach dwiel	88f9c926ab	update comment describing why the output filters don't modify Agent.last_action_info	2019-04-09 12:14:27 -04:00
zach dwiel	fd2c210915	rename AgentInterface.emulate_observe_on_trainer to observe_transition and call from AgentInterface.observe	2019-04-09 12:14:27 -04:00
zach dwiel	f8741522e4	merge AgentInterface.emulate_act_on_trainer and AgentInterface.act	2019-04-09 12:14:27 -04:00
zach dwiel	f2fead57e5	change method interface: AgentInterface.emulate_act_on_trainer(transition: Transition) -> emulate_act_on_trainer(action: ActionType)	2019-04-09 12:14:27 -04:00
zach dwiel	b20e795ce0	create method LevelManager.acting_agent()	2019-04-09 12:14:27 -04:00
zach dwiel	54fdfe2da8	simplify rollout worker steps with new magic methods on StepMethod	2019-04-09 12:14:27 -04:00
zach dwiel	2cb078b4c2	add __truediv__, __rtruediv__ and __eq__ to StepMethod	2019-04-09 12:14:27 -04:00
zach dwiel	83da5cde2f	remove unnecessary parentheses	2019-04-09 12:14:27 -04:00
zach dwiel	dddaefb210	fixed bug in rollout worker where total number of improved steps are not taken	2019-04-09 12:14:27 -04:00
zach dwiel	06de3b0f07	update LevelManager type signature	2019-04-09 12:14:27 -04:00
zach dwiel	f16cd3cb1e	remove unused ActionInfo.action_intrinsic_reward	2019-04-09 12:14:27 -04:00
zach dwiel	7d79433c05	remove unused parameter scale_external_reward_by_intrinsic_reward_value	2019-04-09 12:14:27 -04:00
anabwan	881f78f45a	tests: new checkpoint mxnet test + fix utils (#273 ) * tests: new mxnet test + fix utils new test added: - test_restore_checkpoint[tensorflow, mxnet] fix failed tests in CI improve utils * tests: fix comments for mxnet checkpoint test and utils	2019-04-07 07:36:44 +03:00
Zach Dwiel	2291cee2c6	allow serializing from/to arrays/str from GlobalVariableSaver (#285 )	2019-04-04 11:09:19 -04:00
anabwan	cdb8d9e518	tests: fix multi environment variables in configci (#284 ) * tests: fix multi environment variables in configci - fix multi environment variables in configci - removing bitflip from mujoco tests - add bitflip to gym * tests: disable mujoco_a3c_lstm + fix timeout and fix docker	2019-04-04 16:11:41 +03:00
Scott Leishman	f173e69187	introduce dockerfiles. (#169 ) * introduce dockerfiles. * ensure golden tests are run not just collected. * Skip CI download of dockerfiles. * add StarCraft environment and tests. * add minimaps starcraft validation parameters. * Add functional test running (from Ayoob) * pin mujoco_py version to a 1.5 compatible release. * fix config syntax issue. * pin remaining mujoco_py install calls. * Relax pin of gym version in gym Dockerfile. * update makefile based on functional test filtering.	2019-04-03 19:33:17 +03:00
shadiendrawis	0b808f0794	remove -ept flag (#283 )	2019-04-03 16:32:24 +03:00
anabwan	869bd421a3	tests: added new checkpoint and functional tests (#265 ) * added new tests - test_preset_n_and_ew - test_preset_n_and_ew_and_onnx * code utils improvements (all utils) * improve checkpoint_test * new functionality for functional_test markers and presets lists * removed special environment container * add xfail to certain tests	2019-03-28 13:57:31 -07:00
Gal Leibovich	310d31c227	integration test changes to reach the train part (#254 ) * integration test changes to override heatup to 1000 steps + run each preset for 30 sec (to make sure we reach the train part) * fixes to failing presets uncovered with this change + changes in the golden testing to properly test BatchRL * fix for rainbow dqn * fix to gym_environment (due to a change in Gym 0.12.1) + fix for rainbow DQN + some bug-fix in utils.squeeze_list * fix for NEC agent	2019-03-27 21:14:19 +02:00
Gal Leibovich	6e08c55ad5	Enabling-more-agents-for-Batch-RL-and-cleanup (#258 ) allowing for the last training batch drawn to be smaller than batch_size + adding support for more agents in BatchRL by adding softmax with temperature to the corresponding heads + adding a CartPole_QR_DQN preset with a golden test + cleanups	2019-03-21 16:10:29 +02:00
Gal Leibovich	abec59f367	fixes to rainbow dqn + a cartpole based golden test (#253 )	2019-03-21 12:57:56 +02:00

275 Commits