Guy Jacob
f52ff1784d
Fix breaking change from minio update ( #469 )
...
`ResponseError` replaced by `S3Error` in new minio version
2020-12-15 10:02:16 +02:00
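A rename like the one in #469 can be absorbed with a guarded lookup. This is a minimal sketch that assumes only what the commit message states: the exception class is named `S3Error` in newer minio releases and `ResponseError` in older ones. The helper `first_available` is hypothetical (not part of Coach or minio), and it falls back to `Exception` when minio is not installed:

```python
import importlib


def first_available(module_name, candidates, default=Exception):
    """Return the first attribute in `candidates` that `module_name` exposes.

    Hypothetical helper for illustration; not part of Coach or minio.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or its parent package) is not installed.
        return default
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    return default


# Prefer the newer name, fall back to the older one.
MinioError = first_available("minio.error", ["S3Error", "ResponseError"])
```

Code that catches `MinioError` then works against either version without branching on the installed release.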
Guy Jacob
103d4477eb
Disable NumPy and TF2 related warnings ( #463 )
2020-09-24 15:11:45 +03:00
Gal Novik
c9738280fd
Require Python 3.6 + Changes to CI configuration ( #452 )
...
* Change build_*_env jobs to pull base image of current "tag"
instead of "master" image
* Change nightly flow so build_*_env jobs now gated by build_base (so
change in previous bullet works in nightly)
* Bugfix in CheckpointDataStore: Call to object.__init__ with
parameters
* Disabling unstable Doom A3C and ACER golden tests
2020-07-26 16:11:22 +03:00
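The CheckpointDataStore bugfix in #452 concerns passing parameters to `object.__init__`, which Python 3 rejects with a `TypeError`. Below is a minimal sketch of that failure mode and one way to avoid it. The class names `BrokenStore` and `FixedStore` are hypothetical stand-ins, not Coach's actual classes:

```python
class BrokenStore(object):
    def __init__(self, params):
        # object.__init__ accepts no extra arguments, so this raises TypeError.
        super().__init__(params)
        self.params = params


class FixedStore(object):
    def __init__(self, params):
        # Call the base initializer without arguments and keep params locally.
        super().__init__()
        self.params = params


try:
    BrokenStore({"checkpoint_dir": "/tmp/ckpt"})
    broken_raised = False
except TypeError:
    broken_raised = True

store = FixedStore({"checkpoint_dir": "/tmp/ckpt"})
```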
Gal Novik
79b05a8105
Wolpertinger preset failure fix ( #434 )
...
Numpy 1.18 fails to cast float to int as part of the wolpertinger preset run
2020-01-14 16:26:38 +02:00
shadiendrawis
188b86369a
fix e-greedy in case action values were equal ( #423 )
2019-11-10 17:20:44 +02:00
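One common way equal action values cause trouble is that a plain argmax always returns the first maximal index, which biases the greedy choice. The sketch below shows epsilon-greedy selection with random tie-breaking. It illustrates that assumption only and is not the change made in #423; the function name and signature are hypothetical:

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Pick an action index; ties among maximal values are broken uniformly."""
    if rng.random() < epsilon:
        # Explore: uniform over all actions.
        return rng.randrange(len(action_values))
    best = max(action_values)
    # Exploit: collect every index that attains the maximum, then sample one.
    best_actions = [i for i, v in enumerate(action_values) if v == best]
    return rng.choice(best_actions)


rng = random.Random(0)
# With two tied best actions, both indices should show up over many draws.
picks = {epsilon_greedy([1.0, 1.0, 0.5], epsilon=0.0, rng=rng) for _ in range(200)}
```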
shadiendrawis
6ca91b9090
add reset internal state to rollout worker ( #421 )
2019-11-03 14:42:51 +02:00
Gal Leibovich
66fada7f78
Remove assertion from BatchRLGraphManager
2019-10-22 11:54:14 +03:00
shadiendrawis
5ad5a58350
fix atari stack overflow ( #412 )
2019-10-06 18:14:21 +03:00
shadiendrawis
0a712ecc94
Fix numpy shared running stats to support images ( #411 )
2019-10-06 12:16:38 +03:00
Gal Leibovich
79a4161eca
Workaround for dumping gifs through the Python API ( #405 )
2019-09-26 12:21:25 +03:00
Gal Leibovich
c7949d7011
Fix Atari Schedule Heatup
2019-09-08 16:57:38 +03:00
Gal Leibovich
138ced23ba
RL in Large Discrete Action Spaces - Wolpertinger Agent ( #394 )
...
* Currently this is specific to the case of discretizing a continuous action space. It can easily be adapted to other cases by feeding the kNN differently and removing the use of the discretizing output action filter
2019-09-08 12:53:49 +03:00
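The discretize-then-kNN idea mentioned in #394 can be sketched as follows: a continuous one-dimensional action range is discretized into a grid, and a continuous proto-action is mapped to its k nearest grid points. This is a simplified illustration, not Coach's implementation. The function names and the 1-D setting are assumptions, and the step where a critic picks among the k candidates is omitted:

```python
def discretize(low, high, num_bins):
    """Evenly spaced grid of candidate actions over [low, high]."""
    step = (high - low) / (num_bins - 1)
    return [low + i * step for i in range(num_bins)]


def k_nearest_actions(proto_action, candidates, k):
    """Return the k candidates closest to the proto-action, nearest first."""
    return sorted(candidates, key=lambda a: abs(a - proto_action))[:k]


grid = discretize(-1.0, 1.0, num_bins=5)
neighbors = k_nearest_actions(0.3, grid, k=2)
```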
Zach Dwiel
7b0fccb041
Add RedisDataStore ( #295 )
...
* GraphManager.set_session also sets self.sess
* make sure that GraphManager.fetch_from_worker uses training phase
* remove unnecessary phase setting in training worker
* reorganize rollout worker
* provide default name to GlobalVariableSaver.__init__ since it isn't really used anyway
* allow dividing TrainingSteps and EnvironmentSteps
* add timestamps to the log
* added redis data store
* conflict merge fix
2019-08-28 21:15:58 +03:00
Gal Leibovich
c1d1fae342
Distiller's AMC induced changes ( #359 )
...
* override episode rewards with the last transition reward
* EWMA normalization filter
* allowing control over when the pre_network filter runs
2019-08-05 10:24:58 +03:00
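An exponentially weighted moving average (EWMA) normalizer keeps running estimates of mean and variance and rescales each input by them. This is a minimal sketch, assuming a scalar input and a decay parameter `alpha`. The class name and parameters are hypothetical and do not reflect the filter's actual interface in Coach:

```python
import math


class EWMANormalizer:
    def __init__(self, alpha=0.1, eps=1e-8):
        self.alpha = alpha      # weight given to the newest sample
        self.eps = eps          # avoids division by zero
        self.mean = None
        self.var = 0.0

    def update(self, x):
        if self.mean is None:
            # First sample initializes the running mean.
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        # Standard EWMA variance recursion.
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)

    def normalize(self, x):
        return (x - self.mean) / (math.sqrt(self.var) + self.eps)


norm = EWMANormalizer(alpha=0.5)
for reward in [1.0, 3.0]:
    norm.update(reward)
```

A larger `alpha` tracks recent samples more closely, while a smaller one smooths over a longer history.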
Gal Novik
2697142d5a
Release 1.0.0 ( #382 )
...
* Updating README
* Shortening test cycles
2019-07-24 16:10:58 +03:00
Gal Leibovich
19ad2d60a7
Batch RL Tutorial ( #372 )
2019-07-14 18:43:48 +03:00
Gal Novik
b82414138d
Workaround the OSError due to bad address failure on the CI runs ( #370 )
...
workaround the OSError due to bad address failure on the CI runs
2019-07-07 17:11:19 +03:00
Gal Leibovich
587b74e04a
Remove double call to reset_internal_state() on gym environments ( #364 )
2019-07-02 13:43:23 +03:00
anabwan
a576ab5659
tests: Removed mxnet from functional tests + minor fix on rewards ( #362 )
...
* ci: change workflow
* changed timeout
* fix function reach reward
* print logs
* removing mxnet
* res'
2019-06-27 18:52:29 +03:00
Gal Leibovich
d6795bd524
batchnorm fixes + disabling batchnorm in DDPG ( #353 )
...
Co-authored-by: James Casbon <casbon+gh@gmail.com>
2019-06-23 11:28:22 +03:00
anabwan
7b5d6a3f03
tests: stabilizing functional tests ( #355 )
...
* tests: stabilizing functional tests
* functional removed
2019-06-20 15:30:47 +03:00
shadiendrawis
8e812ef82f
Coach as a library ( #348 )
...
* CoachInterface + tutorial
* Some improvements and typo fixes
* merge tutorial 0 and 4
* typo fix + additional tutorial changes
* tutorial changes
* added reading signals and experiment path argument
2019-06-19 18:05:03 +03:00
Gal Leibovich
7eb884c5b2
TD3 ( #338 )
2019-06-16 11:11:21 +03:00
Timo Kaufmann
8df3c46756
Do not hardcode path to bash ( #332 )
2019-06-10 20:10:28 +03:00
Gal Leibovich
a1bb8eef89
DDPG Critic Head Bug Fix ( #344 )
...
* A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean
2019-06-05 17:47:56 +03:00
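The effect described in #344 is one of scale: the gradient of a sum over the batch is the batch size times the gradient of the mean, so the policy update grows with batch size. The sketch below checks this numerically with a toy linear critic Q(a) = 2a and finite differences. The critic and all names are assumptions made purely for illustration, not Coach's network:

```python
def q_value(action, weight=2.0):
    """Toy linear critic used only for this illustration."""
    return weight * action


def reduced_q(actions, reduce):
    total = sum(q_value(a) for a in actions)
    return total if reduce == "sum" else total / len(actions)


def grad_first_action(actions, reduce, h=1e-6):
    """Finite-difference gradient of the reduced Q w.r.t. the first action."""
    bumped = [actions[0] + h] + list(actions[1:])
    return (reduced_q(bumped, reduce) - reduced_q(actions, reduce)) / h


batch = [0.1, 0.2, 0.3, 0.4]
# The ratio comes out close to the batch size (4 here).
ratio = grad_first_action(batch, "sum") / grad_first_action(batch, "mean")
```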
anabwan
0aa5359d63
tests: added assert for cp param and changing test args order ( #342 )
2019-06-05 00:16:50 +03:00
Gal Leibovich
4c996e147e
applying filters for a csv loaded dataset + some bug-fixes in data loading ( #319 )
2019-05-28 15:44:55 +03:00
anabwan
f5ba14575c
tests: print logs on failure + fix -cp param ( #327 )
...
* tests: print logs on failure
* fix import
* added job to circleci
* fix functional
* removed debug job
2019-05-28 13:45:43 +03:00
Gal Leibovich
251dc9ccc0
Preset dependent number of csv read attempts in golden testing ( #334 )
2019-05-28 12:19:57 +03:00
Gal Leibovich
9e9c4fd332
Create a dataset using an agent ( #306 )
...
Generate a dataset using an agent (allowing to select between this and a random dataset)
2019-05-28 09:34:49 +03:00
anabwan
342b7184bc
Enabling Coach Documentation to be run even when environments are not installed ( #326 )
2019-05-27 10:46:07 +03:00
James Casbon
2b7d536da4
Add head regularization costs to tf.losses ( #292 )
2019-05-26 17:15:42 +03:00
anabwan
3b6e413532
tests: fix traces and changing workflow jobs ( #316 )
...
* tests: fix traces export presets
* tests: increase time for traces
* tests
* remove approval
* fix approval
* fix ap
* change workflow jobs
* fix path
* fix repo path
* change run traces
* adding assert
* fix assert
2019-05-26 15:27:36 +03:00
anabwan
b567091d2e
removed timestep_limit due to gym version upgrade ( #325 )
...
* removed timestep_limit due to gym version update
* removed _past_limit wrapper
2019-05-26 13:58:16 +03:00
Gal Leibovich
30c2b2fc45
moving to skimage.transform.resize ( #321 )
2019-05-23 13:38:01 +03:00
Gal Leibovich
acceb03ac0
bug fixes for OPE ( #311 )
2019-05-21 16:39:11 +03:00
Gal Leibovich
deb0251367
bug fix following PR #191 ( #313 )
2019-05-12 13:42:45 -07:00
Gal Novik
aa9f3cefaf
Printing input size as part of network summary ( #310 )
2019-05-12 15:40:02 +03:00
anabwan
ffb55b4142
tests: update traces ( #302 )
...
* Traces folder removed from repo and moved to S3
* Traces jobs and update will use directly the S3 files
2019-05-07 10:04:05 +03:00
anabwan
740359587d
tests: fixed nightly ( #301 )
...
* tests: fixed nightly
* tests: temp testing functional tests
* tests: temp testing functional tests
* tests: add seed to -cp
* test: last fix
2019-05-05 08:28:57 +03:00
Gal Leibovich
582921ffe3
OPE: Weighted Importance Sampling ( #299 )
2019-05-02 19:25:42 +03:00
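Weighted importance sampling estimates a target policy's value from logged episodes by weighting each episode's return with its importance ratio and normalizing by the sum of those ratios. Below is a minimal sketch of the textbook per-episode estimator. It is not Coach's OPE code, and the function name and input layout are assumptions:

```python
def weighted_importance_sampling(episodes):
    """episodes: list of (ratios, episode_return) pairs, where ratios holds the
    per-step pi_target(a|s) / pi_behavior(a|s) values for that episode."""
    weights, returns = [], []
    for ratios, episode_return in episodes:
        w = 1.0
        for r in ratios:
            w *= r  # cumulative importance ratio for the episode
        weights.append(w)
        returns.append(episode_return)
    total = sum(weights)
    if total == 0.0:
        # No episode has support under the target policy.
        return 0.0
    return sum(w * g for w, g in zip(weights, returns)) / total


estimate = weighted_importance_sampling([
    ([2.0, 1.0], 10.0),   # episode weight 2.0
    ([0.5, 1.0], 4.0),    # episode weight 0.5
])
```

Normalizing by the sum of weights rather than the episode count trades a small bias for lower variance compared with ordinary importance sampling.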
guyk1971
74db141d5e
SAC algorithm ( #282 )
...
* SAC algorithm
* SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix a problem in gradient calculation. The SAC agent is now able to train.
gym_environment - fixing an error in access to gym.spaces
* Soft Actor Critic - code cleanup
* code cleanup
* V-head initialization fix
* SAC benchmarks
* SAC Documentation
* typo fix
* documentation fixes
* documentation and version update
* README typo
2019-05-01 18:37:49 +03:00
Ajay Deshpande
33dc29ee99
Uploading checkpoint if crd provided ( #191 )
...
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types
Fixes #195
2019-04-26 12:27:33 -07:00
anabwan
b3db9ce77d
tests: fixed failed tests - stabilizing CI ( #298 )
...
* tests: stabilizing CI
* tests: fix failed tests - stabilizing CI
* fix get csv files.
- fixed seed test
* fix clres on conftest - now can modify paths during test run.
- this fixed the mxnet checkpoint test
* tests: fix comments
2019-04-23 15:12:11 +03:00
Gal Leibovich
9f625c197b
fix for fetch rendering ( #297 )
...
* fix for fetch rendering - removing code which was once required with older gym versions. images are now rendered correctly by default with the latest gym.
* fixing mujoco camera id failure
2019-04-21 17:37:14 +03:00
Gal Leibovich
4741b0b916
BCQ variant on top of DDQN ( #276 )
...
* kNN based model for predicting which actions to drop
* fix for seeds with batch rl
2019-04-16 17:06:23 +03:00
Federico Andres Lois
bdb9b224a8
Include missing RegressionHead. ( #263 )
2019-04-16 15:24:06 +03:00
anabwan
20a8dea0dd
tests: minor fix for functional tests ( #289 )
...
* tests: minor fix for functional tests
* tests: fix value
2019-04-15 12:28:23 +03:00
zach dwiel
88f9c926ab
update comment describing why the output filters don't modify Agent.last_action_info
2019-04-09 12:14:27 -04:00
zach dwiel
fd2c210915
rename AgentInterface.emulate_observe_on_trainer to observe_transition and call from AgentInterface.observe
2019-04-09 12:14:27 -04:00