1
0
mirror of https://github.com/gryf/coach.git synced 2025-12-17 19:20:19 +01:00
Commit Graph

26 Commits

Author SHA1 Message Date
Gal Leibovich
251dc9ccc0 Preset dependent number of csv read attempts in golden testing (#334) 2019-05-28 12:19:57 +03:00
zach dwiel
7d79433c05 remove unused parameter scale_external_reward_by_intrinsic_reward_value 2019-04-09 12:14:27 -04:00
Gal Leibovich
e3c7e526c7 Batch RL (#238) 2019-03-19 18:07:09 +02:00
Gal Leibovich
d6158a5cfc restoring from a checkpoint file (#247) 2019-03-17 16:28:09 +02:00
Gal Novik
10220be9be Adding support for evaluation only mode with predefined number of steps (#225) 2019-03-03 10:03:45 +02:00
Gal Leibovich
f9ee526536 Fix for issue #128 - circular DQN import (#130) 2018-12-16 16:06:44 +02:00
Sina Afrooze
87a7848b0a Moved tf.variable_scope and tf.device calls to framework-specific architecture (#136) 2018-11-22 22:52:21 +02:00
Gal Leibovich
a112ee69f6 Save filters' internal state (#127)
* save filters internal state

* moving the restore to be made from within NumpyRunningStats
2018-11-20 17:21:48 +02:00
Gal Leibovich
d4d06aaea6 remove kubernetes dependency (#117) 2018-11-18 18:10:22 +02:00
Gal Leibovich
9fd4d55623 Making stop condition optional by using a flag (#113)
* apply stop condition flag (default: ignore the stop condition)
2018-11-18 13:37:39 +02:00
Ajay Deshpande
fde73ced13 Simulating the act on the trainer. (#65)
* Remove the use of daemon threads for Redis subscribe.
* Emulate act and observe on trainer side to update internal vars.
2018-11-15 08:38:58 -08:00
Itai Caspi
6d40ad1650 update of api docstrings across coach and tutorials [WIP] (#91)
* updating the documentation website
* adding the built docs
* update of api docstrings across coach and tutorials 0-2
* added some missing api documentation
* New Sphinx based documentation
2018-11-15 15:00:13 +02:00
Ajay Deshpande
875d6ef017 Adding target reward and target sucess (#58)
* Adding target reward

* Adding target successs

* Addressing comments

* Using custom_reward_threshold and target_success_rate

* Adding exit message

* Moving success rate to environment

* Making target_success_rate optional
2018-11-12 15:03:43 -08:00
Itai Caspi
83e0b09a6a adding the missing export_onnx_graph parameter to task parameters (#73) 2018-11-08 12:52:42 +02:00
Gal Leibovich
49dea39d34 N-step returns for rainbow (#67)
* n_step returns for rainbow
* Rename CartPole_PPO -> CartPole_ClippedPPO
2018-11-07 18:33:08 +02:00
Itai Caspi
e7a91b4dc3 Fix cmd line arguments handling (#68)
* refactoring the merging of the task parameters and the command line parameters
* removing some unused command line arguments
* fix for saving checkpoints when not passing through coach.py
2018-11-07 15:47:02 +02:00
Balaji Subramaniam
7e7006305a Integrate coach.py params with distributed Coach. (#42)
* Integrate coach.py params with distributed Coach.
* Minor improvements
- Use enums instead of constants.
- Reduce code duplication.
- Ask experiment name with timeout.
2018-11-05 09:33:30 -08:00
Sina Afrooze
95b4fc6888 Added ability to switch between tensorflow and mxnet using -f commandline argument. (#48)
NOTE: tensorflow framework works fine if mxnet is not installed in env, but mxnet will not work if tensorflow is not installed because of the code in network_wrapper.
2018-10-30 15:29:34 -07:00
zach dwiel
f835ac902c fix renaming: save_checkpoint_sec -> checkpoint_save_secs 2018-10-24 10:52:18 -04:00
Zach Dwiel
61ed6b8ce4 add better defaults to TaskParameters 2018-10-23 16:40:33 -04:00
Gal Leibovich
5a8da90d32 bug-fix for dumping movies (+ small refactoring and rename 'VideoDumpMethod -> 'VideoDumpFilter') 2018-10-21 17:29:10 +03:00
Shadi Endrawis
51726a5b80 network_imporvements branch merge 2018-10-02 13:43:36 +03:00
itaicaspi-intel
a16d724963 removing some of the presets from the trace tests + more robust replay buffer loading 2018-09-12 15:26:16 +03:00
Itai Caspi
72a1d9d426 Itaicaspi/episode reset refactoring (#105)
* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* reordering of the episode reset operation and allowing to store episodes only when they are terminated

* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()

* tests readme file and refactoring of policy optimization agent train function

* Update README.md

* Update README.md

* additional policy optimization train function simplifications

* Updated the traces after the reordering of the environment reset

* docker and jenkins files

* updated the traces to the ones from within the docker container

* updated traces and added control suite to the docker

* updated jenkins file with the intel proxy + updated doom basic a3c test params

* updated line breaks in jenkins file

* added a missing line break in jenkins file

* refining trace tests ignored presets + adding a configurable beta entropy value

* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue

* updated benchmarks for dueling ddqn breakout and pong

* allowing dynamic updates to the loss weights + bug fix in episode.update_returns

* remove docker and jenkins file
2018-09-04 15:07:54 +03:00
Gal Leibovich
1aa2ab0590 parameter noise exploration - using Noisy Nets 2018-08-27 18:19:01 +03:00
Gal Novik
19ca5c24b1 pre-release 0.10.0 2018-08-13 17:11:34 +03:00