diff --git a/README.md b/README.md
index da266fa..72af12c 100644
--- a/README.md
+++ b/README.md
@@ -30,26 +30,25 @@ coach -p CartPole_DQN -r
 * [Release 0.10.0](https://ai.intel.com/introducing-reinforcement-learning-coach-0-10-0/)
 * [Release 0.11.0](https://ai.intel.com/rl-coach-data-science-at-scale)
 * [Release 0.12.0](https://github.com/NervanaSystems/coach/releases/tag/v0.12.0)
-* Release 1.0.0 (current release)
+* [Release 1.0.0](https://www.intel.ai/rl-coach-new-release) (current release)
 
-Contacting the Coach development team is also possible over [email](mailto:coach@intel.com)
-
 ## Table of Contents
 
-- [Coach](#coach)
-  * [Benchmarks](#benchmarks)
-  * [Installation](#installation)
-  * [Getting Started](#getting-started)
-  * [Tutorials and Documentation](#tutorials-and-documentation)
-  * [Basic Usage](#basic-usage)
-  * [Running Coach](#running-coach)
-  * [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
-  * [Distributed Multi-Node Coach](#distributed-multi-node-coach)
-  * [Batch Reinforcement Learning](#batch-reinforcement-learning)
-  * [Supported Environments](#supported-environments)
-  * [Supported Algorithms](#supported-algorithms)
-  * [Citation](#citation)
-  * [Disclaimer](#disclaimer)
+- [Benchmarks](#benchmarks)
+- [Installation](#installation)
+- [Getting Started](#getting-started)
+  * [Tutorials and Documentation](#tutorials-and-documentation)
+  * [Basic Usage](#basic-usage)
+  * [Running Coach](#running-coach)
+  * [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
+  * [Distributed Multi-Node Coach](#distributed-multi-node-coach)
+  * [Batch Reinforcement Learning](#batch-reinforcement-learning)
+- [Supported Environments](#supported-environments)
+- [Supported Algorithms](#supported-algorithms)
+- [Citation](#citation)
+- [Contact](#contact)
+- [Disclaimer](#disclaimer)
 
 ## Benchmarks
 
@@ -289,7 +288,7 @@ There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach
 * [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86))
 * [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224) | **Multi Worker Single Node** ([code](rl_coach/agents/acer_agent.py))
 * [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) ([code](rl_coach/agents/soft_actor_critic_agent.py))
-* [Twin Delayed Deep Deterministic Policy Gradient](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))
+* [Twin Delayed Deep Deterministic Policy Gradient (TD3)](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))
 
 ### General Agents
 * [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Multi Worker Single Node** ([code](rl_coach/agents/dfp_agent.py))
@@ -333,6 +332,15 @@ If you used Coach for your work, please use the following citation:
 }
 ```
 
+## Contact
+
+We'd be happy to receive any questions or contributions through GitHub issues and PRs.
+
+Please make sure to take a look [here](CONTRIBUTING.md) before filing an issue or proposing a PR.
+
+The Coach development team can also be contacted over [email](mailto:coach@intel.com).
+
+
 ## Disclaimer
 
 Coach is released as a reference code for research purposes. It is not an official Intel product, and the level of quality and support may not be as expected from an official product.
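For context, the hunk header above carries Coach's basic usage line, `coach -p CartPole_DQN -r`, which runs the bundled CartPole DQN preset with rendering. A rough Python equivalent of that preset run is sketched below; it assumes the graph-manager API of rl_coach 1.0, and the class names are recalled from the library rather than taken from this patch, so verify them against the installed version.

```python
# Sketch (assumed rl_coach 1.0 API): the Python counterpart of
# `coach -p CartPole_DQN -r` -- build a graph manager for a DQN agent
# on Gym's CartPole and run its training loop.
from rl_coach.agents.dqn_agent import DQNAgentParameters
from rl_coach.environments.gym_environment import GymVectorEnvironment
from rl_coach.graph_managers.basic_rl_graph_manager import BasicRLGraphManager
from rl_coach.graph_managers.graph_manager import SimpleSchedule

graph_manager = BasicRLGraphManager(
    agent_params=DQNAgentParameters(),
    env_params=GymVectorEnvironment(level='CartPole-v0'),
    schedule_params=SimpleSchedule()  # default heatup/train/evaluate schedule
)

graph_manager.improve()  # run training until the schedule is exhausted
```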
diff --git a/docs/_sources/index.rst.txt b/docs/_sources/index.rst.txt
index 7fb5224..f47ce0b 100644
--- a/docs/_sources/index.rst.txt
+++ b/docs/_sources/index.rst.txt
@@ -27,7 +27,9 @@ Blog posts from the Intel® AI website:
 
 * `Release 0.11.0 <https://ai.intel.com/rl-coach-data-science-at-scale>`_
 
-* Release 0.12.0 (current release)
+* `Release 0.12.0 <https://github.com/NervanaSystems/coach/releases/tag/v0.12.0>`_
+
+* `Release 1.0.0 <https://www.intel.ai/rl-coach-new-release>`_ (current release)
 
 You can find more details in the `GitHub repository <https://github.com/NervanaSystems/coach>`_.
@@ -75,5 +77,3 @@ You can find more details in the `GitHub repository <https://github.com/NervanaSystems/coach>`_.
-prepare_batch_for_inference(states: Union[Dict[str, numpy.ndarray], List[Dict[str, numpy.ndarray]]], network_name: str) → Dict[str, numpy.core.multiarray.array]
+prepare_batch_for_inference(states: Union[Dict[str, numpy.ndarray], List[Dict[str, numpy.ndarray]]], network_name: str) → Dict[str, numpy.array]

Convert curr_state into the input tensors TensorFlow expects, i.e. if we have several input states, stack all observations together, all measurements together, etc.
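A minimal sketch of that batching step, assuming plain NumPy and dict-shaped states (the `batch_states` helper below is illustrative, not Coach's actual implementation):

```python
# Illustrative only: one state dict (or a list of them) becomes one
# stacked array per network input, mirroring the docstring above.
from typing import Dict, List, Union
import numpy as np

def batch_states(states: Union[Dict[str, np.ndarray],
                               List[Dict[str, np.ndarray]]]) -> Dict[str, np.ndarray]:
    if isinstance(states, dict):
        states = [states]  # a single state becomes a batch of size 1
    # Stack each input across the batch: all observations together,
    # all measurements together, etc.
    return {name: np.stack([s[name] for s in states]) for name in states[0]}

# Two states, each with an observation image and a measurements vector:
batch = batch_states([
    {'observation': np.zeros((84, 84)), 'measurements': np.zeros(3)},
    {'observation': np.ones((84, 84)), 'measurements': np.ones(3)},
])
assert batch['observation'].shape == (2, 84, 84)
assert batch['measurements'].shape == (2, 3)
```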

diff --git a/docs/features/algorithms.html b/docs/features/algorithms.html
index 6400046..1a800b2 100644
--- a/docs/features/algorithms.html
+++ b/docs/features/algorithms.html
@@ -95,6 +95,7 @@
   • Algorithms
   • Environments
   • Benchmarks
+  • Batch Reinforcement Learning
   • Selecting an Algorithm
diff --git a/docs/features/benchmarks.html b/docs/features/benchmarks.html
index 845d42c..073faab 100644
--- a/docs/features/benchmarks.html
+++ b/docs/features/benchmarks.html
@@ -95,6 +95,7 @@
   • Algorithms
   • Environments
   • Benchmarks
+  • Batch Reinforcement Learning
   • Selecting an Algorithm
@@ -220,7 +221,7 @@ benchmarks stay intact as Coach continues to develop.