* SAC algorithm
* SAC - updates to agent (learn_from_batch), sac_head and sac_q_head to fix problem in gradient calculation. Now SAC agents is able to train.
gym_environment - fixing an error in access to gym.spaces
* Soft Actor Critic - code cleanup
* code cleanup
* V-head initialization fix
* SAC benchmarks
* SAC Documentation
* typo fix
* documentation fixes
* documentation and version update
* README typo
* initial ACER commit
* Code cleanup + several fixes
* Q-retrace bug fix + small clean-ups
* added documentation for acer
* ACER benchmarks
* update benchmarks table
* Add nightly running of golden and trace tests. (#202)
Resolves#200
* comment out nightly trace tests until values reset.
* remove redundant observe ignore (#168)
* ensure nightly test env containers exist. (#205)
Also bump integration test timeout
* wxPython removal (#207)
Replacing wxPython with Python's Tkinter.
Also removing the option to choose multiple files as it is unused and causes errors, and fixing the load file/directory spinner.
* Create CONTRIBUTING.md (#210)
* Create CONTRIBUTING.md. Resolves#188
* run nightly golden tests sequentially. (#217)
Should reduce resource requirements and potential CPU contention but increases
overall execution time.
* tests: added new setup configuration + test args (#211)
- added utils for future tests and conftest
- added test args
* new docs build
* golden test update
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* reordering of the episode reset operation and allowing to store episodes only when they are terminated
* revert tensorflow-gpu to 1.9.0 + bug fix in should_train()
* tests readme file and refactoring of policy optimization agent train function
* Update README.md
* Update README.md
* additional policy optimization train function simplifications
* Updated the traces after the reordering of the environment reset
* docker and jenkins files
* updated the traces to the ones from within the docker container
* updated traces and added control suite to the docker
* updated jenkins file with the intel proxy + updated doom basic a3c test params
* updated line breaks in jenkins file
* added a missing line break in jenkins file
* refining trace tests ignored presets + adding a configurable beta entropy value
* switch the order of trace and golden tests in jenkins + fix golden tests processes not killed issue
* updated benchmarks for dueling ddqn breakout and pong
* allowing dynamic updates to the loss weights + bug fix in episode.update_returns
* remove docker and jenkins file
* Multiple improvements and bug fixes:
* Using lazy stacking to save on memory when using a replay buffer
* Remove step counting for evaluation episodes
* Reset game between heatup and training
* Major bug fixes in NEC (is reproducing the paper results for pong now)
* Image input rescaling to 0-1 is now optional
* Change the terminal title to be the experiment name
* Observation cropping for atari is now optional
* Added random number of noop actions for gym to match the dqn paper
* Fixed a bug where the evaluation episodes won't start with the max possible ale lives
* Added a script for plotting the results of an experiment over all the atari games