Gal Leibovich
c1d1fae342
Distiller's AMC-induced changes ( #359 )
...
* override episode rewards with the last transition reward
* EWMA normalization filter
* allowing control over when the pre_network filter runs
2019-08-05 10:24:58 +03:00
Scott Leishman
7df67dafa3
update to point at new CI cluster. ( #385 )
2019-08-04 13:55:04 +03:00
Gal Novik
2697142d5a
Release 1.0.0 ( #382 )
...
* Updating README
* Shortening test cycles
2019-07-24 16:10:58 +03:00
Gal Leibovich
718597ce9a
Fixes to Batch RL tutorial ( #378 )
2019-07-16 11:22:42 +03:00
Gal Novik
0a4cc7e081
Additional cmd line examples ( #377 )
...
Adding command line examples to the Quick Start Guide tutorial
2019-07-15 12:32:59 +03:00
Gal Leibovich
19ad2d60a7
Batch RL Tutorial ( #372 )
2019-07-14 18:43:48 +03:00
Gal Novik
b82414138d
Work around the OSError due to bad address failure on the CI runs ( #370 )
...
Work around the OSError due to bad address failure on the CI runs
2019-07-07 17:11:19 +03:00
Gal Leibovich
587b74e04a
Remove double call to reset_internal_state() on gym environments ( #364 )
2019-07-02 13:43:23 +03:00
anabwan
a576ab5659
tests: Removed mxnet from functional tests + minor fix on rewards ( #362 )
...
* ci: change workflow
* changed timeout
* fix function reach reward
* print logs
* removing mxnet
* res'
2019-06-27 18:52:29 +03:00
anabwan
30c64d0656
using gym=0.12.5 instead of latest ( #360 )
...
* using gym=0.12.5 instead of latest
* changing docker gym version
* changing dockerfile gym version
2019-06-24 10:34:28 +03:00
Gal Leibovich
d6795bd524
batchnorm fixes + disabling batchnorm in DDPG ( #353 )
...
Co-authored-by: James Casbon <casbon+gh@gmail.com>
2019-06-23 11:28:22 +03:00
anabwan
7b5d6a3f03
tests: stabilizing functional tests ( #355 )
...
* tests: stabilizing functional tests
* functional removed
2019-06-20 15:30:47 +03:00
shadiendrawis
8e812ef82f
Coach as a library ( #348 )
...
* CoachInterface + tutorial
* Some improvements and typo fixes
* merge tutorial 0 and 4
* typo fix + additional tutorial changes
* tutorial changes
* added reading signals and experiment path argument
2019-06-19 18:05:03 +03:00
anabwan
1c90bc22a1
ci: using serial jobs in nightly ( #350 )
2019-06-17 10:53:36 +03:00
Gal Leibovich
7eb884c5b2
TD3 ( #338 )
2019-06-16 11:11:21 +03:00
Timo Kaufmann
8df3c46756
Do not hardcode path to bash ( #332 )
2019-06-10 20:10:28 +03:00
Gal Leibovich
a1bb8eef89
DDPG Critic Head Bug Fix ( #344 )
...
* A bug fix for DDPG, where the update to the policy network was based on the sum of the critic's Q predictions on the batch instead of their mean
2019-06-05 17:47:56 +03:00
anabwan
0aa5359d63
tests: added assert for cp param and changing test args order ( #342 )
2019-06-05 00:16:50 +03:00
Gal Novik
e49aac05aa
Update README.md ( #341 )
...
Adding some links to the tutorials from the README
2019-06-04 11:35:34 +03:00
anabwan
f6d5e60eff
Added build base for nightly ( #340 )
...
* Added build base for nightly
* fix requires
* remove commented code
2019-06-03 23:04:34 +03:00
Gal Novik
6e7e7f6d3d
Update setup.py to 0.12.1 ( #337 )
2019-05-30 10:13:36 +03:00
anabwan
23df868d32
Removed unnecessary futures dependency ( #336 )
2019-05-29 14:34:48 +03:00
Gal Leibovich
4c996e147e
applying filters for a csv loaded dataset + some bug-fixes in data loading ( #319 )
2019-05-28 15:44:55 +03:00
anabwan
6319387357
increase timeout for golden tests ( #335 )
2019-05-28 14:19:11 +03:00
anabwan
f5ba14575c
tests: print logs on failure + fix -cp param ( #327 )
...
* tests: print logs on failure
* fix import
* added job to circleci
* fix functional
* removed debug job
2019-05-28 13:45:43 +03:00
Gal Leibovich
251dc9ccc0
Preset dependent number of csv read attempts in golden testing ( #334 )
2019-05-28 12:19:57 +03:00
anabwan
ddffac8570
fixed release version ( #333 )
...
* fixed release version
* update docs
2019-05-28 11:11:15 +03:00
Gal Leibovich
9e9c4fd332
Create a dataset using an agent ( #306 )
...
Generate a dataset using an agent (allowing a choice between this and a random dataset)
2019-05-28 09:34:49 +03:00
anabwan
342b7184bc
Enabling Coach Documentation to be run even when environments are not installed ( #326 )
2019-05-27 10:46:07 +03:00
James Casbon
2b7d536da4
Add head regularization costs to tf.losses ( #292 )
2019-05-26 17:15:42 +03:00
anabwan
3b6e413532
tests: fix traces and changing workflow jobs ( #316 )
...
* tests: fix traces export presets
* tests: increase time for traces
* tests
* remove approval
* fix approval
* fix ap
* change workflow jobs
* fix path
* fix repo path
* change run traces
* adding assert
* fix assert
2019-05-26 15:27:36 +03:00
anabwan
b567091d2e
removed timestep_limit due to gym version upgrade ( #325 )
...
* removed timestep_limit due to gym version update
* removed _past_limit wrapper
2019-05-26 13:58:16 +03:00
Gal Leibovich
30c2b2fc45
moving to skimage.transform.resize ( #321 )
2019-05-23 13:38:01 +03:00
Gal Leibovich
acceb03ac0
bug fixes for OPE ( #311 )
2019-05-21 16:39:11 +03:00
anabwan
85d70dd7d5
tests: fix traces export presets ( #315 )
2019-05-13 15:32:30 +03:00
anabwan
f78bbbdbd1
tests: weekly deployment ( #304 )
...
* tests: weekly deployment
* running golden_tests
* running all traces
* run time: Friday @ 04:00AM
2019-05-13 14:51:38 +03:00
Gal Leibovich
deb0251367
bug fix following PR #191 ( #313 )
2019-05-12 13:42:45 -07:00
Gal Novik
aa9f3cefaf
Printing input size as part of network summary ( #310 )
2019-05-12 15:40:02 +03:00
anabwan
ffb55b4142
tests: update traces ( #302 )
...
* Traces folder removed from repo and moved to S3
* Traces jobs and update will use directly the S3 files
2019-05-07 10:04:05 +03:00
anabwan
740359587d
tests: fixed nightly ( #301 )
...
* tests: fixed nightly
* tests: temp testing functional tests
* tests: temp testing functional tests
* tests: add seed to -cp
* test: last fix
2019-05-05 08:28:57 +03:00
Gal Leibovich
582921ffe3
OPE: Weighted Importance Sampling ( #299 )
2019-05-02 19:25:42 +03:00
guyk1971
74db141d5e
SAC algorithm ( #282 )
...
* SAC algorithm
* SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix a problem in gradient calculation. The SAC agent is now able to train.
gym_environment - fixing an error in access to gym.spaces
* Soft Actor Critic - code cleanup
* code cleanup
* V-head initialization fix
* SAC benchmarks
* SAC Documentation
* typo fix
* documentation fixes
* documentation and version update
* README typo
2019-05-01 18:37:49 +03:00
Ajay Deshpande
33dc29ee99
Uploading checkpoint if crd provided ( #191 )
...
* Uploading checkpoint if crd provided
* Changing the calculation of total steps because of a recent change in core_types
Fixes #195
2019-04-26 12:27:33 -07:00
anabwan
b3db9ce77d
tests: fixed failed tests - stabilizing CI ( #298 )
...
* tests: stabilizing CI
* tests: fix failed tests - stabilizing CI
* fix get csv files.
- fixed seed test
* fix clres on conftest - now can modify paths during test run.
- this fixed the mxnet checkpoint test
* tests: fix comments
2019-04-23 15:12:11 +03:00
Gal Leibovich
9f625c197b
fix for fetch rendering ( #297 )
...
* fix for fetch rendering - removing code which was once required with older gym versions. images are now rendered correctly by default with the latest gym.
* fixing mujoco camera id failure
2019-04-21 17:37:14 +03:00
anabwan
f14915cada
tests: removed Starcraft from CI ( #296 )
...
* tests: removed Starcraft from CI
* tests: fix comment
* tests: fix mujoco
2019-04-21 13:51:14 +03:00
Gal Leibovich
4741b0b916
BCQ variant on top of DDQN ( #276 )
...
* kNN based model for predicting which actions to drop
* fix for seeds with batch rl
2019-04-16 17:06:23 +03:00
Federico Andres Lois
bdb9b224a8
Include missing RegressionHead. ( #263 )
2019-04-16 15:24:06 +03:00
anabwan
20a8dea0dd
tests: minor fix for functional tests ( #289 )
...
* tests: minor fix for functional tests
* tests: fix value
2019-04-15 12:28:23 +03:00
zach dwiel
88f9c926ab
update comment describing why the output filters don't modify Agent.last_action_info
2019-04-09 12:14:27 -04:00