mirror of https://github.com/gryf/coach.git synced 2026-02-11 19:25:53 +01:00

26 Commits

Author SHA1 Message Date
Gal Leibovich
138ced23ba RL in Large Discrete Action Spaces - Wolpertinger Agent (#394)
* Currently this is specific to the case of discretizing a continuous action space. It can easily be adapted to other cases by feeding the kNN otherwise and removing the usage of a discretizing output action filter
2019-09-08 12:53:49 +03:00
Gal Leibovich
19ad2d60a7 Batch RL Tutorial (#372) 2019-07-14 18:43:48 +03:00
Gal Leibovich
acceb03ac0 bug fixes for OPE (#311) 2019-05-21 16:39:11 +03:00
Gal Leibovich
582921ffe3 OPE: Weighted Importance Sampling (#299) 2019-05-02 19:25:42 +03:00
Gal Leibovich
6e08c55ad5 Enabling-more-agents-for-Batch-RL-and-cleanup (#258)
allowing for the last training batch drawn to be smaller than batch_size + adding support for more agents in BatchRL by adding softmax with temperature to the corresponding heads + adding a CartPole_QR_DQN preset with a golden test + cleanups
2019-03-21 16:10:29 +02:00
Gal Leibovich
e3c7e526c7 Batch RL (#238) 2019-03-19 18:07:09 +02:00
Gal Novik
fc6604c09c added missing license headers 2018-11-27 22:43:40 +02:00
Gal Leibovich
a1c56edd98 Fixes for having NumpySharedRunningStats syncing on multi-node (#139)
1. Having the standard checkpoint prefix in order for the data store to grab it, and sync it to S3.
2. Removing the reference to Redis so that it won't try to pickle that in.
3. Enable restoring a checkpoint into a single-worker run, which was saved by a single-node-multiple-worker run.
2018-11-23 16:11:47 +02:00
Itai Caspi
6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Leo Dirac
2804a7c24f Refactor launcher to be object-oriented (#63)
* Import of annoy library uses failed_import mechanism.
2018-11-10 22:10:19 +02:00
Gal Leibovich
49dea39d34 N-step returns for rainbow (#67)
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Ajay Deshpande
6b2de6ba6d Adding initial interface for backend and redis pubsub (#19)
* Adding initial interface for backend and redis pubsub

* Addressing comments, adding super in all memories

* Removing distributed experience replay
2018-10-23 16:51:48 -04:00
Ajay Deshpande
98850464cc Adding nfs pv, pvc, waiting for memory to be full 2018-10-23 16:50:48 -04:00
Ajay Deshpande
ce9838a7d6 Adding kubernetes orchestrator for rollouts, adding requirements for incremental docker builds 2018-10-23 16:46:04 -04:00
Ajay Deshpande
21f8ca3978 Removing comments and pytests 2018-10-23 16:40:33 -04:00
Ajay Deshpande
5a54f67a63 Adding distributed experience replay 2018-10-23 16:40:33 -04:00
Zach Dwiel
5758c2f23e typo; increased detail in comment 2018-10-23 16:35:06 -04:00
Zach Dwiel
a1295d16b3 first pass that transition collection interface 2018-10-23 16:35:06 -04:00
Zach Dwiel
9f1f9e5ab4 replace ExperienceReplay._num_transitions with len(ExperienceReplay.transitions) 2018-10-23 16:34:38 -04:00
Zach Dwiel
cccfe88f9b remove unused method: update_last_transition_info 2018-10-23 16:34:38 -04:00
itaicaspi-intel
d3f97cd93b initial CIL implementation (WIP) 2018-09-13 15:29:29 +03:00
itaicaspi-intel
607ef17431 added a simple progress bar implementation 2018-09-13 14:21:38 +03:00
itaicaspi-intel
a16d724963 removing some of the presets from the trace tests + more robust replay buffer loading 2018-09-12 15:26:16 +03:00
itaicaspi-intel
a9bd1047c4 load and save function for non-episodic replay buffers + carla improvements + network bug fixes 2018-09-12 15:26:16 +03:00
itaicaspi-intel
658b437079 removing datasets + imports optimization 2018-08-27 10:54:11 +03:00
Gal Novik
19ca5c24b1 pre-release 0.10.0 2018-08-13 17:11:34 +03:00