mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00
505 Commits

Author SHA1 Message Date
Gal Novik
c9738280fd Require Python 3.6 + Changes to CI configuration (#452)
* Change build_*_env jobs to pull base image of current "tag"
  instead of "master" image
* Change nightly flow so build_*_env jobs now gated by build_base (so
  change in previous bullet works in nightly)
* Bugfix in CheckpointDataStore: Call to object.__init__ with
  parameters
* Disabling unstable Doom A3C and ACER golden tests
2020-07-26 16:11:22 +03:00
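The CheckpointDataStore bullet in the commit above names a general Python pitfall: when a class overrides `__init__` and forwards its parameters up to `object.__init__`, Python 3 raises a `TypeError`. The sketch below reproduces that behavior with hypothetical class names (`BrokenStore`, `FixedStore`); it is not Coach's actual code, only an illustration of the kind of call the bullet describes.

```python
class BrokenStore:
    """Hypothetical class whose only base is object."""

    def __init__(self, params):
        # object.__init__ accepts no arguments beyond the instance,
        # so forwarding params here raises TypeError.
        super().__init__(params)
        self.params = params


class FixedStore:
    """Same hypothetical class with the call corrected."""

    def __init__(self, params):
        super().__init__()  # no parameters forwarded to object.__init__
        self.params = params


def try_construct(cls, params):
    """Return 'ok' if construction succeeds, 'TypeError' otherwise."""
    try:
        cls(params)
        return "ok"
    except TypeError:
        return "TypeError"
```

Calling `try_construct(BrokenStore, {})` reports the failure, while `try_construct(FixedStore, {})` succeeds.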
Guy Jacob
a6689b6036 Update cluster name in .circleci/config.yml (now all locations) 2020-06-24 16:18:49 +03:00
Guy Jacob
6658bfa429 Update cluster name in .circleci/config.yml 2020-06-24 15:24:41 +03:00
Gal Novik
f3ce685cb1 Upgrading Pillow version due to security vulnerability (#444) 2020-04-22 20:52:24 +03:00
Gal Novik
79b05a8105 Wolpertinger preset failure fix (#434)
Numpy 1.18 fails to cast float to int as part of the wolpertinger preset run
2020-01-14 16:26:38 +02:00
Dan Elbaz
525a22cb5b Roll-back bokeh to version 1.0.4 (#431)
Roll back bokeh to version 1.0.4
2019-12-23 09:33:53 +02:00
Brian Broll
0867d8d0fb Fixed typo: Nerual -> Neural (#425) 2019-11-16 21:13:24 +02:00
shadiendrawis
188b86369a fix e-greedy in case action values were equal (#423) 2019-11-10 17:20:44 +02:00
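The log does not say how the e-greedy fix above was implemented. One common way to handle equal action values is to break ties at random instead of always returning the first maximal index, which a plain argmax would do. The function below is a minimal sketch of that idea using only the standard library; `epsilon_greedy` and its parameters are illustrative names, not Coach's API.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Pick an action index; ties among the best values are broken at random."""
    # Explore: with probability epsilon choose any action uniformly.
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    # Exploit: collect every action that attains the maximum value,
    # then choose uniformly among them instead of always taking the first.
    best = max(action_values)
    best_actions = [i for i, v in enumerate(action_values) if v == best]
    return rng.choice(best_actions)
```

With `epsilon=0.0` and values `[1.0, 1.0, 0.0]`, repeated calls return both index 0 and index 1 rather than index 0 every time.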
shadiendrawis
6ca91b9090 add reset internal state to rollout worker (#421) 2019-11-03 14:42:51 +02:00
Gal Leibovich
e288a552dd Update requirements.txt (#422) 2019-10-28 18:30:48 +02:00
Gal Leibovich
66fada7f78 Remove assertion from BatchRLGraphManager 2019-10-22 11:54:14 +03:00
shadiendrawis
6db695ad8a freeze tensorflow version to <= 1.14.0 (#416) 2019-10-10 17:47:25 +03:00
shadiendrawis
5ad5a58350 fix atari stack overflow (#412) 2019-10-06 18:14:21 +03:00
shadiendrawis
0a712ecc94 Fix numpy shared running stats to support images (#411) 2019-10-06 12:16:38 +03:00
Gal Leibovich
79a4161eca Workaround for dumping gifs through the Python API (#405) 2019-09-26 12:21:25 +03:00
Pi Esposito
9e82c06be3 importing heads parameters from the correct file on tutorial #1 (#403) 2019-09-24 20:44:49 +03:00
Gal Novik
34bc292e60 Limiting intel-tensorflow version to 1.13.1 to re-enable CI; Updating nightly schedule to run on Saturdays as well 2019-09-23 12:52:00 +03:00
Gal Novik
0704260b5d Updating EKS cluster name 2019-09-20 16:12:35 +03:00
Gal Novik
b5d66c0942 Removing CARLA docker file from README (#402) 2019-09-16 07:17:58 +03:00
Gal Leibovich
c7949d7011 Fix Atari Schedule Heatup 2019-09-08 16:57:38 +03:00
Gal Novik
13a4a09f72 removing weekly tests (#398) 2019-09-08 14:04:24 +03:00
Gal Leibovich
138ced23ba RL in Large Discrete Action Spaces - Wolpertinger Agent (#394)
* Currently this is specific to the case of discretizing a continuous action space. Can easily be adapted to other case by feeding the kNN otherwise, and removing the usage of a discretizing output action filter
2019-09-08 12:53:49 +03:00
shadiendrawis
fc50398544 typo fix (#396) 2019-09-04 12:40:23 +03:00
Zach Dwiel
7b0fccb041 Add RedisDataStore (#295)
* GraphManager.set_session also sets self.sess

* make sure that GraphManager.fetch_from_worker uses training phase

* remove unnecessary phase setting in training worker

* reorganize rollout worker

* provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway

* allow dividing TrainingSteps and EnvironmentSteps

* add timestamps to the log

* added redis data store

* conflict merge fix
2019-08-28 21:15:58 +03:00
Scott Leishman
34e1c04f29 further CI cluster name updates. (#387) 2019-08-06 10:18:07 +03:00
Gal Novik
92460736bc Updated tutorial and docs (#386)
Improved getting started tutorial, and updated docs to point to version 1.0.0
2019-08-05 16:46:15 +03:00
Gal Leibovich
c1d1fae342 Distiller's AMC induced changes (#359)
* override episode rewards with the last transition reward

* EWMA normalization filter

* allowing control over when the pre_network filter runs
2019-08-05 10:24:58 +03:00
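The "EWMA normalization filter" bullet in the commit above gives no details. As a rough illustration of the general technique, the sketch below keeps an exponentially weighted moving average and variance of a scalar signal and uses them to normalize new values. The class name, the `alpha` and `eps` parameters, and the update rule are assumptions chosen for the example, not the filter added in that commit.

```python
class EwmaNormalizer:
    """Hypothetical scalar normalizer based on exponentially weighted statistics."""

    def __init__(self, alpha=0.1, eps=1e-8):
        self.alpha = alpha  # weight given to the newest observation
        self.eps = eps      # avoids division by zero when variance is 0
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Fold one observation into the running mean and variance."""
        if self.mean is None:
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        # Standard incremental update for an exponentially weighted variance.
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

    def normalize(self, x):
        """Center and scale x with the current running statistics."""
        return (x - self.mean) / (self.var ** 0.5 + self.eps)
```

A larger `alpha` makes the statistics track recent values more closely; a smaller one smooths over a longer history.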
Scott Leishman
7df67dafa3 update to point at new CI cluster. (#385) 2019-08-04 13:55:04 +03:00
Gal Novik
2697142d5a Release 1.0.0 (#382)
* Updating README
* Shortening test cycles
2019-07-24 16:10:58 +03:00
Gal Leibovich
718597ce9a Fixes to Batch RL tutorial (#378) 2019-07-16 11:22:42 +03:00
Gal Novik
0a4cc7e081 Additional cmd line examples (#377)
Adding command line examples to the Quick Start Guide tutorial
2019-07-15 12:32:59 +03:00
Gal Leibovich
19ad2d60a7 Batch RL Tutorial (#372) 2019-07-14 18:43:48 +03:00
Gal Novik
b82414138d Workaround the OSError due to bad address failure on the CI runs (#370)
workaround the OSError due to bad address failure on the CI runs
2019-07-07 17:11:19 +03:00
Gal Leibovich
587b74e04a Remove double call to reset_internal_state() on gym environments (#364) 2019-07-02 13:43:23 +03:00
anabwan
a576ab5659 tests: Removed mxnet from functional tests + minor fix on rewards (#362)
* ci: change workflow

* changed timeout

* fix function reach reward

* print logs

* removing mxnet

* res'
2019-06-27 18:52:29 +03:00
anabwan
30c64d0656 using gym=0.12.5 instead of latest (#360)
* using gym=0.12.5 instead of latest

* changing docker gym version

* changing dockingfile gym version
2019-06-24 10:34:28 +03:00
Gal Leibovich
d6795bd524 batchnorm fixes + disabling batchnorm in DDPG (#353)
Co-authored-by: James Casbon <casbon+gh@gmail.com>
2019-06-23 11:28:22 +03:00
anabwan
7b5d6a3f03 tests: stabling functional tests (#355)
* tests: stabling functional tests

* functional removed
2019-06-20 15:30:47 +03:00
shadiendrawis
8e812ef82f Coach as a library (#348)
* CoachInterface + tutorial

* Some improvements and typo fixes

* merge tutorial 0 and 4

* typo fix + additional tutorial changes

* tutorial changes

* added reading signals and experiment path argument
2019-06-19 18:05:03 +03:00
anabwan
1c90bc22a1 ci: using serial jobs in nightly (#350) 2019-06-17 10:53:36 +03:00
Gal Leibovich
7eb884c5b2 TD3 (#338) 2019-06-16 11:11:21 +03:00
Timo Kaufmann
8df3c46756 Do not hardcode path to bash (#332) 2019-06-10 20:10:28 +03:00
Gal Leibovich
a1bb8eef89 DDPG Critic Head Bug Fix (#344)
* A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean
2019-06-05 17:47:56 +03:00
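The DDPG fix above replaces a sum over the batch with a mean. The practical difference is that a sum makes the policy gradient grow linearly with the batch size, so the effective learning rate changes whenever the batch size does. The toy sketch below shows this with a hypothetical linear policy `a = theta * s` and linear critic `Q = critic_weight * a`; the function and its parameters are illustrative only and are not taken from Coach.

```python
def policy_gradient(states, critic_weight, reduce):
    """Gradient of the reduced critic output with respect to theta.

    Toy model (an assumption for this example):
      action  a_i = theta * s_i
      critic  Q_i = critic_weight * a_i
    so dQ_i/dtheta = critic_weight * s_i for every sample.
    """
    per_sample = [critic_weight * s for s in states]
    total = sum(per_sample)
    if reduce == "sum":
        return total                  # scales with the batch size
    return total / len(per_sample)    # "mean": independent of the batch size


states = [1.0, 2.0, 3.0, 4.0]
g_sum = policy_gradient(states, critic_weight=2.0, reduce="sum")
g_mean = policy_gradient(states, critic_weight=2.0, reduce="mean")
```

Here `g_sum` is exactly `len(states)` times `g_mean`, which is the scaling the mean removes.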
anabwan
0aa5359d63 tests: added assert for cp param and changing test args order (#342) 2019-06-05 00:16:50 +03:00
Gal Novik
e49aac05aa Update README.md (#341)
Adding some links to the tutorials from the README
2019-06-04 11:35:34 +03:00
anabwan
f6d5e60eff Added build base for nightly (#340)
* Added build base for nightly

* fix requires

* remove commetted code
2019-06-03 23:04:34 +03:00
Gal Novik
6e7e7f6d3d Update setup.py to 0.12.1 (#337) 2019-05-30 10:13:36 +03:00
anabwan
23df868d32 Removed unnecessary futures dependency (#336) 2019-05-29 14:34:48 +03:00
Gal Leibovich
4c996e147e applying filters for a csv loaded dataset + some bug-fixes in data loading (#319) 2019-05-28 15:44:55 +03:00
anabwan
6319387357 increase timeout for golden tests (#335) 2019-05-28 14:19:11 +03:00