
Release 1.0.0 (#382)

* Updating README
* Shortening test cycles
Authored by Gal Novik on 2019-07-24 16:10:58 +03:00, committed by GitHub
parent 718597ce9a
commit 2697142d5a
5 changed files with 46 additions and 38 deletions


@@ -731,18 +731,19 @@ workflows:
- functional_tests:
requires:
- build_base
- functional_test_doom:
requires:
- build_doom_env
- functional_tests
- functional_test_mujoco:
requires:
- build_mujoco_env
- functional_test_doom
# - functional_test_doom:
# requires:
# - build_doom_env
# - functional_tests
# - functional_test_mujoco:
# requires:
# - build_mujoco_env
# - functional_test_doom
- golden_test_gym:
requires:
- build_gym_env
- functional_test_mujoco
# - functional_test_mujoco
- functional_tests
- golden_test_doom:
requires:
- build_doom_env


@@ -54,7 +54,7 @@ Coach is released as two pypi packages:
Each pypi package release has a GitHub release and tag with the same version number. The numbers are of the X.Y.Z format, where
X - zero in the near future, may change when Coach is feature complete
X - currently one, will be incremented on major API changes
Y - major releases with new features


@@ -29,20 +29,23 @@ coach -p CartPole_DQN -r
* [Release 0.9.0](https://ai.intel.com/reinforcement-learning-coach-carla-qr-dqn/)
* [Release 0.10.0](https://ai.intel.com/introducing-reinforcement-learning-coach-0-10-0/)
* [Release 0.11.0](https://ai.intel.com/rl-coach-data-science-at-scale)
* Release 0.12.0 (current release)
* [Release 0.12.0](https://github.com/NervanaSystems/coach/releases/tag/v0.12.0)
* Release 1.0.0 (current release)
Contacting the Coach development team is also possible through the email [coach@intel.com](coach@intel.com)
Contacting the Coach development team is also possible over [email](mailto:coach@intel.com)
## Table of Contents
- [Coach](#coach)
* [Overview](#overview)
* [Benchmarks](#benchmarks)
* [Documentation](#documentation)
* [Installation](#installation)
* [Usage](#usage)
+ [Running Coach](#running-coach)
+ [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
* [Getting Started](#getting-started)
* [Tutorials and Documentation](#tutorials-and-documentation)
* [Basic Usage](#basic-usage)
* [Running Coach](#running-coach)
* [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
* [Distributed Multi-Node Coach](#distributed-multi-node-coach)
* [Batch Reinforcement Learning](#batch-reinforcement-learning)
* [Supported Environments](#supported-environments)
* [Supported Algorithms](#supported-algorithms)
* [Citation](#citation)
@@ -52,13 +55,6 @@ Contacting the Coach development team is also possible through the email [coach@
One of the main challenges when building a research project, or a solution based on a published algorithm, is getting a concrete and reliable baseline that reproduces the algorithm's results, as reported by its authors. To address this problem, we are releasing a set of [benchmarks](benchmarks) that shows Coach reliably reproduces many state of the art algorithm results.
## Documentation
Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](https://nervanasystems.github.io/coach/).
Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment can be found [here](https://github.com/NervanaSystems/coach/tree/master/tutorials).
## Installation
Note: Coach has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
@@ -113,9 +109,16 @@ If a GPU is present, Coach's pip package will install tensorflow-gpu, by default
In addition to OpenAI Gym, several other environments were tested and are supported. Please follow the instructions in the Supported Environments section below in order to install more environments.
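For reference, a minimal install sketch (not part of this diff), using the package names that appear in the setup.py hunk at the bottom of this commit; the platform note above (Ubuntu 16.04 LTS, Python 3.5) still applies:

```bash
# Minimal install sketch, not part of this commit's diff.
# Package names are taken from the setup.py hunk below: 'rl-coach' and 'rl-coach-slim'.
pip3 install rl-coach        # full package; per the note above, pulls in tensorflow-gpu when a GPU is present
# pip3 install rl-coach-slim # assumed lighter variant published alongside the full package
```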
## Usage
## Getting Started
### Running Coach
### Tutorials and Documentation
[Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment](https://github.com/NervanaSystems/coach/tree/master/tutorials).
[Framework documentation, algorithm description and instructions on how to contribute a new agent/environment](https://nervanasystems.github.io/coach/).
### Basic Usage
#### Running Coach
To allow reproducing results in Coach, we defined a mechanism called _preset_.
There are several available presets under the `presets` directory.
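As an illustration (not part of the diff), running a preset from the command line looks like the quick-start command quoted in the hunk header near the top of this README diff:

```bash
# Running a named preset; this exact command appears earlier in the README.
# -p selects the preset, -r renders the environment while training.
coach -p CartPole_DQN -r
```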
@@ -167,17 +170,7 @@ It is easy to create new presets for different levels or environments by followi
More usage examples can be found [here](https://github.com/NervanaSystems/coach/blob/master/tutorials/0.%20Quick%20Start%20Guide.ipynb).
### Distributed Multi-Node Coach
As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
### Batch Reinforcement Learning
Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
### Running Coach Dashboard (Visualization)
#### Running Coach Dashboard (Visualization)
Training an agent to solve an environment can be tricky, at times.
In order to debug the training process, Coach outputs several signals, per trained algorithm, in order to track algorithmic performance.
@@ -195,6 +188,17 @@ dashboard
<img src="img/dashboard.gif" alt="Coach Design" style="width: 800px;"/>
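For orientation (not part of the diff), the debug workflow described above amounts to training first and then opening the dashboard, assuming the `dashboard` entry point shown in the hunk context is installed alongside the package:

```bash
# Sketch of the visualization workflow described above (assumes the `dashboard`
# entry point from the hunk context is on PATH and that the training run has
# written its signal files to the experiment directory).
coach -p CartPole_DQN -r
dashboard
```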
### Distributed Multi-Node Coach
As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
### Batch Reinforcement Learning
Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
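As a hedged illustration (not part of the diff), the example presets linked above can be launched with the same `-p` flag as any other preset:

```bash
# Assumed invocation of the Batch RL example presets linked above; the preset
# names are taken from the linked files, the -p flag from the quick-start command.
coach -p CartPole_DDQN_BatchRL
# coach -p Acrobot_DDQN_BCQ_BatchRL
```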
## Supported Environments
* *OpenAI Gym:*
@@ -285,6 +289,7 @@ dashboard
* [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86))
* [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224) | **Multi Worker Single Node** ([code](rl_coach/agents/acer_agent.py))
* [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) ([code](rl_coach/agents/soft_actor_critic_agent.py))
* [Twin Delayed Deep Deterministic Policy Gradient](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))
### General Agents
* [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Multi Worker Single Node** ([code](rl_coach/agents/dfp_agent.py))


@@ -5,3 +5,5 @@ markers =
integration_test: long test that checks that the complete framework is running correctly
filterwarnings =
ignore::DeprecationWarning
norecursedirs =
*mxnet*
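A possible use of the marker declared above, in line with this commit's goal of shortening test cycles (standard pytest selection, not something added by this diff):

```bash
# Deselect the long-running tests using the marker registered in pytest.ini above.
# -m is pytest's standard marker-selection flag.
pytest -m "not integration_test"
```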


@@ -85,7 +85,7 @@ extras['all'] = all_deps
setup(
name='rl-coach' if not slim_package else 'rl-coach-slim',
version='0.12.1',
version='1.0.0',
description='Reinforcement Learning Coach enables easy experimentation with state of the art Reinforcement Learning algorithms.',
url='https://github.com/NervanaSystems/coach',
author='Intel AI Lab',