
Release 1.0.0 (#382)

* Updating README
* Shortening test cycles
Authored by Gal Novik on 2019-07-24 16:10:58 +03:00, committed by GitHub
parent 718597ce9a
commit 2697142d5a
5 changed files with 46 additions and 38 deletions


@@ -731,18 +731,19 @@ workflows:
- functional_tests:
requires:
- build_base
- functional_test_doom:
requires:
- build_doom_env
- functional_tests
- functional_test_mujoco:
requires:
- build_mujoco_env
- functional_test_doom
# - functional_test_doom:
# requires:
# - build_doom_env
# - functional_tests
# - functional_test_mujoco:
# requires:
# - build_mujoco_env
# - functional_test_doom
- golden_test_gym:
requires:
- build_gym_env
- functional_test_mujoco
# - functional_test_mujoco
- functional_tests
- golden_test_doom:
requires:
- build_doom_env


@@ -54,7 +54,7 @@ Coach is released as two pypi packages:
Each pypi package release has a GitHub release and tag with the same version number. The numbers are of the X.Y.Z format, where
X - zero in the near future, may change when Coach is feature complete
X - currently one, will be incremented on major API changes
Y - major releases with new features


@@ -29,20 +29,23 @@ coach -p CartPole_DQN -r
* [Release 0.9.0](https://ai.intel.com/reinforcement-learning-coach-carla-qr-dqn/)
* [Release 0.10.0](https://ai.intel.com/introducing-reinforcement-learning-coach-0-10-0/)
* [Release 0.11.0](https://ai.intel.com/rl-coach-data-science-at-scale)
* Release 0.12.0 (current release)
* [Release 0.12.0](https://github.com/NervanaSystems/coach/releases/tag/v0.12.0)
* Release 1.0.0 (current release)
Contacting the Coach development team is also possible through the email [coach@intel.com](coach@intel.com)
Contacting the Coach development team is also possible over [email](mailto:coach@intel.com)
## Table of Contents
- [Coach](#coach)
* [Overview](#overview)
* [Benchmarks](#benchmarks)
* [Documentation](#documentation)
* [Installation](#installation)
* [Usage](#usage)
+ [Running Coach](#running-coach)
+ [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
* [Getting Started](#getting-started)
* [Tutorials and Documentation](#tutorials-and-documentation)
* [Basic Usage](#basic-usage)
* [Running Coach](#running-coach)
* [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
* [Distributed Multi-Node Coach](#distributed-multi-node-coach)
* [Batch Reinforcement Learning](#batch-reinforcement-learning)
* [Supported Environments](#supported-environments)
* [Supported Algorithms](#supported-algorithms)
* [Citation](#citation)
@@ -52,13 +55,6 @@ Contacting the Coach development team is also possible through the email [coach@
One of the main challenges when building a research project, or a solution based on a published algorithm, is getting a concrete and reliable baseline that reproduces the algorithm's results, as reported by its authors. To address this problem, we are releasing a set of [benchmarks](benchmarks) that shows Coach reliably reproduces many state of the art algorithm results.
## Documentation
Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](https://nervanasystems.github.io/coach/).
Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment can be found [here](https://github.com/NervanaSystems/coach/tree/master/tutorials).
## Installation
Note: Coach has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
@@ -113,9 +109,16 @@ If a GPU is present, Coach's pip package will install tensorflow-gpu, by default
In addition to OpenAI Gym, several other environments were tested and are supported. Please follow the instructions in the Supported Environments section below in order to install more environments.
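For reference, a minimal install sketch (not part of this diff), using the package names that appear in the setup.py hunk at the bottom of this commit; the platform note above (Ubuntu 16.04 LTS, Python 3.5) still applies:

```bash
# Minimal install sketch, not part of this commit's diff.
# Package names are taken from the setup.py hunk below: 'rl-coach' and 'rl-coach-slim'.
pip3 install rl-coach        # full package; per the note above, pulls in tensorflow-gpu when a GPU is present
# pip3 install rl-coach-slim # assumed lighter variant published alongside the full package
```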
## Usage
## Getting Started
### Running Coach
### Tutorials and Documentation
[Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment](https://github.com/NervanaSystems/coach/tree/master/tutorials).
[Framework documentation, algorithm description and instructions on how to contribute a new agent/environment](https://nervanasystems.github.io/coach/).
### Basic Usage
#### Running Coach
To allow reproducing results in Coach, we defined a mechanism called _preset_.
There are several available presets under the `presets` directory.
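As an illustration (not part of the diff), running a preset from the command line looks like the quick-start command quoted in the hunk header near the top of this README diff:

```bash
# Running a named preset; this exact command appears earlier in the README.
# -p selects the preset, -r renders the environment while training.
coach -p CartPole_DQN -r
```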
@@ -167,17 +170,7 @@ It is easy to create new presets for different levels or environments by followi
More usage examples can be found [here](https://github.com/NervanaSystems/coach/blob/master/tutorials/0.%20Quick%20Start%20Guide.ipynb).
### Distributed Multi-Node Coach
As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
### Batch Reinforcement Learning
Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
### Running Coach Dashboard (Visualization)
#### Running Coach Dashboard (Visualization)
Training an agent to solve an environment can be tricky, at times.
In order to debug the training process, Coach outputs several signals, per trained algorithm, in order to track algorithmic performance.
@@ -195,6 +188,17 @@ dashboard
<img src="img/dashboard.gif" alt="Coach Design" style="width: 800px;"/>
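For orientation (not part of the diff), the debug workflow described above amounts to training first and then opening the dashboard, assuming the `dashboard` entry point shown in the hunk context is installed alongside the package:

```bash
# Sketch of the visualization workflow described above (assumes the `dashboard`
# entry point from the hunk context is on PATH and that the training run has
# written its signal files to the experiment directory).
coach -p CartPole_DQN -r
dashboard
```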
### Distributed Multi-Node Coach
As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
### Batch Reinforcement Learning
Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
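As a hedged illustration (not part of the diff), the example presets linked above can be launched with the same `-p` flag as any other preset:

```bash
# Assumed invocation of the Batch RL example presets linked above; the preset
# names are taken from the linked files, the -p flag from the quick-start command.
coach -p CartPole_DDQN_BatchRL
# coach -p Acrobot_DDQN_BCQ_BatchRL
```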
## Supported Environments
* *OpenAI Gym:*
@@ -285,6 +289,7 @@ dashboard
* [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86))
* [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224) | **Multi Worker Single Node** ([code](rl_coach/agents/acer_agent.py))
* [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) ([code](rl_coach/agents/soft_actor_critic_agent.py))
* [Twin Delayed Deep Deterministic Policy Gradient](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))
### General Agents
* [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Multi Worker Single Node** ([code](rl_coach/agents/dfp_agent.py))


@@ -5,3 +5,5 @@ markers =
integration_test: long test that checks that the complete framework is running correctly
filterwarnings =
ignore::DeprecationWarning
norecursedirs =
*mxnet*
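A possible use of the marker declared above, in line with this commit's goal of shortening test cycles (standard pytest selection, not something added by this diff):

```bash
# Deselect the long-running tests using the marker registered in pytest.ini above.
# -m is pytest's standard marker-selection flag.
pytest -m "not integration_test"
```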


@@ -85,7 +85,7 @@ extras['all'] = all_deps
setup(
name='rl-coach' if not slim_package else 'rl-coach-slim',
version='0.12.1',
version='1.0.0',
description='Reinforcement Learning Coach enables easy experimentation with state of the art Reinforcement Learning algorithms.',
url='https://github.com/NervanaSystems/coach',
author='Intel AI Lab',