Zach Dwiel
d8f5a35013
fix qr_dqn_agent
2018-02-21 10:05:57 -05:00
Zach Dwiel
e1ad86417f
fix n_step_q_agent
2018-02-21 10:05:57 -05:00
Zach Dwiel
5cf10e5f52
fix bug in ddpg
2018-02-21 10:05:57 -05:00
Zach Dwiel
8248caf35e
fix more agents
2018-02-21 10:05:57 -05:00
Zach Dwiel
98f57a0d87
fix ddpg
2018-02-21 10:05:57 -05:00
Zach Dwiel
943e41ba58
fix nec_agent
2018-02-21 10:05:57 -05:00
Zach Dwiel
ee6e0bdc3b
fix keep_dims -> keepdims
2018-02-21 10:05:57 -05:00
Zach Dwiel
39a28aba95
fix clipped ppo
2018-02-21 10:05:57 -05:00
Zach Dwiel
85afb86893
temp commit
2018-02-21 10:05:57 -05:00
Itai Caspi
55c8c87afc
allow visualizing the observation + bug fixes to coach summary
2018-02-15 13:47:14 +02:00
Itai Caspi
ba96e585d2
appending csv's from logger instead of rewriting them
2018-02-12 14:52:50 +02:00
Gal Leibovich
7c8962c991
adding support for tensorboard (#52)
* bug-fix in architecture.py where additional fetches would acquire more entries than it should
* change in run_test to allow ignoring some test(s)
2018-02-05 15:21:49 +02:00
Zach Dwiel
fff8c8f568
provide a helpful error message in the event that an exploration policy returns a vector of actions instead of a single action during value optimization agent
2018-01-20 14:11:24 -05:00
Itai Caspi
eeb3ec5497
fixed the LSTM middleware initialization
2018-01-09 10:26:15 +02:00
Zach Dwiel
6c79a442f2
update nec and value optimization agents to work with recurrent middleware
2018-01-05 20:16:51 -05:00
Zach Dwiel
37e317682b
allow missing carla environment and missing matplotlib package
2017-12-20 11:47:14 +02:00
Itai Caspi
125c7ee38d
Release 0.9
Main changes are detailed below:
New features -
* CARLA 0.7 simulator integration
* Human control of the game play
* Recording of human game play and storing / loading the replay buffer
* Behavioral cloning agent and presets
* Golden tests for several presets
* Selecting between deep / shallow image embedders
* Rendering through pygame (with some boost in performance)
API changes -
* Improved environment wrapper API
* Added an evaluate flag to allow convenient evaluation of existing checkpoints
* Improved frameskip definition in Gym
Bug fixes -
* Fixed loading of checkpoints for agents with more than one network
* Fixed the N Step Q learning agent python3 compatibility
2017-12-19 19:27:16 +02:00
Itai Caspi
11faf19649
QR-DQN bug fix and improvements (#30)
* bug fix - QR-DQN using error instead of abs-error in the quantile huber loss
* improvement - QR-DQN sorting the quantile only once instead of batch_size times
* new feature - adding the Breakout QRDQN preset (verified to achieve good results)
2017-11-29 14:01:59 +02:00
galleibo-intel
3c330768f0
Fix for NEC not saving the DND when saving a model
2017-11-09 19:13:23 +02:00
galleibo-intel
f47b8092af
fix for intel optimized tensorflow on distributed runs + adding coach_env to .gitignore
2017-11-06 19:41:32 +02:00
Itai Caspi
a8bce9828c
new feature - implementation of Quantile Regression DQN ( https://arxiv.org/pdf/1710.10044v1.pdf )
API change - Distributional DQN renamed to Categorical DQN
2017-11-01 15:09:07 +02:00
Itai Caspi
913ab75e8a
bug fix - preventing crashes when the probability of one of the actions is 0 in the policy head
2017-10-31 10:51:48 +02:00
Itai Caspi
1918f16079
improved API for getting / setting variables within the graph
2017-10-31 10:51:48 +02:00
cxx
f43c951c2d
Unify base class using new-style (object).
2017-10-26 12:33:09 +03:00
Itai Caspi
39cf78074c
preventing the evaluation agent from getting stuck in bad policies by updating from the global network during episodes
2017-10-25 10:28:45 +03:00
Gal Leibovich
eb0b57d7fa
Updating PPO references per issue #11
2017-10-24 16:57:44 +03:00
Gal Leibovich
1d4c3455e7
coach v0.8.0
2017-10-19 13:10:15 +03:00